The Dataset is the MIKE IO data structure for data from dfs files. The mikeio.read method returns a Dataset as a container of DataArrays (dfs items). Each DataArray has the properties item, time, geometry and values. The time and geometry are common to all DataArrays in the Dataset.
The Dataset has the following primary properties:
items - a list of mikeio.ItemInfo objects, one for each DataArray
time - a pandas.DatetimeIndex with the time instances of the data
geometry - a Geometry object with the spatial description of the data
Use the Dataset’s string representation to get an overview of the Dataset:
import mikeio
ds = mikeio.read("../data/HD2D.dfsu")
ds
<mikeio.Dataset>
dims: (time:9, element:884)
time: 1985-08-06 07:00:00 - 1985-08-07 03:00:00 (9 records)
geometry: Dfsu2D (884 elements, 529 nodes)
items:
0: Surface elevation <Surface Elevation> (meter)
1: U velocity <u velocity component> (meter per sec)
2: V velocity <v velocity component> (meter per sec)
3: Current speed <Current Speed> (meter per sec)
Selecting items
Selecting a specific item “itemA” (at position 0) from a Dataset ds can be done with:
ds[["itemA"]] - returns a new Dataset with “itemA”
ds["itemA"] - returns the “itemA” DataArray
ds[[0]] - returns a new Dataset with “itemA”
ds[0] - returns the “itemA” DataArray
ds.itemA - returns the “itemA” DataArray
We recommend using named items for readability. Selecting the “Surface elevation” item of the Dataset above, for example, returns a DataArray:
<mikeio.DataArray>
name: Surface elevation
dims: (time:9, element:884)
time: 1985-08-06 07:00:00 - 1985-08-07 03:00:00 (9 records)
geometry: Dfsu2D (884 elements, 529 nodes)
A negative index, e.g. ds[-1], can also be used to select from the end.
Several items (“itemA” at position 0 and “itemC” at position 2) can be selected with the notation:
ds[["itemA", "itemC"]]
ds[[0, 2]]
Note that this behavior is similar to pandas and xarray.
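Because the selection rules mirror pandas, the difference between a string key and a list key can be tried out without a dfs file. The sketch below uses a pandas DataFrame as a stand-in for a Dataset, with columns playing the role of items. It is only an analogy with made-up names and values, not MIKE IO code, and positional selection is written with iloc because pandas does not accept ds[0]-style integer keys for columns.

```python
import pandas as pd

# Stand-in for a Dataset: each column plays the role of an item.
# Item names and values are made up for illustration.
df = pd.DataFrame(
    {"itemA": [0.1, 0.2], "itemB": [1.0, 1.1], "itemC": [2.0, 2.1]}
)

single = df["itemA"]               # string key: one column (compare: a DataArray)
subset = df[["itemA"]]             # list key: a container (compare: a new Dataset)
several = df[["itemA", "itemC"]]   # list of names: container with two columns
by_position = df.iloc[:, [0, 2]]   # pandas spelling of positional selection

print(type(single).__name__)       # Series
print(type(subset).__name__)       # DataFrame
print(list(several.columns))       # ['itemA', 'itemC']
print(list(by_position.columns))   # ['itemA', 'itemC']
```

The point to carry over is that a list key always returns the container type, even when the list holds a single name.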
Temporal selection
A time slice of a Dataset can be selected in several different ways.
ds.sel(time="1985-08-06 12:00")
<mikeio.Dataset>
dims: (element:884)
time: 1985-08-06 12:00:00 (time-invariant)
geometry: Dfsu2D (884 elements, 529 nodes)
items:
0: Surface elevation <Surface Elevation> (meter)
1: U velocity <u velocity component> (meter per sec)
2: V velocity <v velocity component> (meter per sec)
3: Current speed <Current Speed> (meter per sec)
Selecting a time range, here all records from 1985-08-07 onwards, returns a Dataset that keeps the time dimension:
<mikeio.Dataset>
dims: (time:2, element:884)
time: 1985-08-07 00:30:00 - 1985-08-07 03:00:00 (2 records)
geometry: Dfsu2D (884 elements, 529 nodes)
items:
0: Surface elevation <Surface Elevation> (meter)
1: U velocity <u velocity component> (meter per sec)
2: V velocity <v velocity component> (meter per sec)
3: Current speed <Current Speed> (meter per sec)
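The time axis is a pandas.DatetimeIndex, so the two results above can be reproduced on the index alone. The sketch below rebuilds the nine time steps from the overview and applies both selections with plain pandas; it does not call MIKE IO. The 150-minute step is an assumption, inferred from 9 records spanning 20 hours.

```python
import pandas as pd

# Assumed time axis: 9 records, 150 minutes apart (inferred from the overview).
time = pd.date_range("1985-08-06 07:00", periods=9, freq="150min")

# A single time instance matches exactly one record.
single = time[time == pd.Timestamp("1985-08-06 12:00")]

# All records from 1985-08-07 onwards.
tail = time[time >= pd.Timestamp("1985-08-07")]

print(time[-1])                      # 1985-08-07 03:00:00
print(len(single))                   # 1
print(list(tail.strftime("%H:%M")))  # ['00:30', '03:00']
```

With this assumed axis, 12:00 falls exactly on a time step, and the two records after midnight match the time range shown in the second output.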
Spatial selection
The sel method also selects in space: given x and y coordinates, it finds a single element.
ds.sel(x=607002, y=6906734)
<mikeio.Dataset>
dims: (time:9)
time: 1985-08-06 07:00:00 - 1985-08-07 03:00:00 (9 records)
geometry: GeometryPoint2D(x=607002.7094112666, y=6906734.833048992)
items:
0: Surface elevation <Surface Elevation> (meter)
1: U velocity <u velocity component> (meter per sec)
2: V velocity <v velocity component> (meter per sec)
3: Current speed <Current Speed> (meter per sec)
Plotting
In most cases, you will not plot the Dataset, but rather its DataArrays. There are two exceptions:
dfs0-Dataset: plot all items as time series with ds.plot()
scatter: compare two items using ds.plot.scatter(x="itemA", y="itemB")
See details in the Dataset Plotter API.
Properties
The Dataset (and DataArray) has several properties:
n_items - Number of items
n_timesteps - Number of timesteps
n_elements - Number of elements
start_time - First time instance (as datetime)
end_time - Last time instance (as datetime)
is_equidistant - Whether the time axis is equidistant
timestep - Time step in seconds (only if is_equidistant)
shape - Shape of each item
deletevalue - File delete value (NaN value)
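To make the time-related properties concrete, the sketch below derives comparable values from a pandas.DatetimeIndex. It illustrates what the properties describe, not how MIKE IO computes them, and the time axis (9 records, 150 minutes apart) is an assumption based on the example file's overview.

```python
import pandas as pd

# Assumed time axis: 9 records, 150 minutes apart.
time = pd.date_range("1985-08-06 07:00", periods=9, freq="150min")

n_timesteps = len(time)
start_time = time[0].to_pydatetime()   # first time instance as datetime
end_time = time[-1].to_pydatetime()    # last time instance as datetime

# The axis is equidistant if all consecutive differences are identical.
steps = time.to_series().diff().dropna().dt.total_seconds()
is_equidistant = bool(steps.nunique() == 1)
timestep = float(steps.iloc[0]) if is_equidistant else None

print(n_timesteps, is_equidistant, timestep)  # 9 True 9000.0
```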
Methods
The Dataset (and DataArray) has several useful methods for working with data, including different ways of selecting data:
sel() - Select subset along an axis
isel() - Select subset along an axis with an integer index
Aggregations along an axis:
mean() - Mean value along an axis
nanmean() - Mean value along an axis (NaN removed)
max() - Max value along an axis
nanmax() - Max value along an axis (NaN removed)
min() - Min value along an axis
nanmin() - Min value along an axis (NaN removed)
average() - Compute the weighted average along the specified axis
aggregate() - Aggregate along an axis
quantile() - Quantiles along an axis
nanquantile() - Quantiles along an axis (NaN ignored)
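The difference between the plain and the nan-prefixed aggregations is easiest to see on a small array. The sketch below uses the NumPy functions of the same names on made-up values with dims (time:3, element:2). That the Dataset methods behave like these NumPy functions applied to each item's values is an assumption here, not something the list above states.

```python
import numpy as np

# Made-up values with dims (time:3, element:2); one value is missing (NaN).
values = np.array([
    [1.0, 2.0],
    [np.nan, 4.0],
    [3.0, 6.0],
])

# Aggregate along the time axis (axis 0), leaving one value per element.
mean_t = np.mean(values, axis=0)                # NaN propagates: [nan, 4.0]
nanmean_t = np.nanmean(values, axis=0)          # NaN ignored:    [2.0, 4.0]
nanmax_t = np.nanmax(values, axis=0)            # [3.0, 6.0]
median_t = np.nanquantile(values, 0.5, axis=0)  # [2.0, 4.0]

print(mean_t, nanmean_t, nanmax_t, median_t)
```

A single missing value makes the plain mean of that element NaN, while the nan variant averages the remaining values.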
Mathematical operations
ds + value
ds - value
ds * value
The + and - operators also work between two Datasets (if the number of items and the shapes conform).
Other methods that also return a Dataset: