DataArray

DataArray(
    self,
    data,
    *,
    time=None,
    item=None,
    geometry=None,
    zn=None,
    dims=None,
    dt=1.0,
)

DataArray with data and metadata for a single item in a dfs file.

The main properties of the DataArray are listed under Attributes below.

Examples

import pandas as pd
import mikeio

da = mikeio.DataArray([0.0, 1.0],
    time=pd.date_range("2020-01-01", periods=2),
    item=mikeio.ItemInfo("Water level", mikeio.EUMType.Water_Level))
da
<mikeio.DataArray>
name: Water level
dims: (time:2)
time: 2020-01-01 00:00:00 - 2020-01-02 00:00:00 (2 records)
geometry: GeometryUndefined()
values: [0, 1]

Attributes

Name Description
dtype Data-type of the array elements.
end_time Last time instance (as datetime).
is_equidistant Is DataArray equidistant in time?
n_timesteps Number of time steps.
name Name of this DataArray (=da.item.name).
ndim Number of array dimensions.
shape Tuple of array dimensions.
start_time First time instance (as datetime).
timestep Time step in seconds if equidistant (and at
type EUMType.
unit EUMUnit.
values Values as a np.ndarray (equivalent to to_numpy()).

Methods

Name Description
aggregate Aggregate along an axis.
average Compute the weighted average along the specified axis.
concat Concatenate DataArrays along the time axis.
copy Make copy of DataArray.
describe Generate descriptive statistics by wrapping pandas.DataFrame.describe.
dropna Remove time steps where values are NaN.
extract_track Extract data along a moving track.
flipud Flip upside down (on first non-time axis).
interp Interpolate data in time and space.
interp_like Interpolate in space (and in time) to other geometry (and time axis).
interp_na Fill in NaNs by interpolating according to different methods.
interp_time Temporal interpolation.
isel Return a new DataArray whose data is given by integer indexing along the specified dimension(s).
max Max value along an axis.
mean Mean value along an axis.
min Min value along an axis.
nanmax Max value along an axis (NaN removed).
nanmean Mean value along an axis (NaN removed).
nanmin Min value along an axis (NaN removed).
nanquantile Compute the q-th quantile of the data along the specified axis, while ignoring nan values.
nanstd Standard deviation value along an axis (NaN removed).
ptp Range (max - min), a.k.a. peak to peak, along an axis.
quantile Compute the q-th quantile of the data along the specified axis.
sel Return a new DataArray whose data is given by selecting index labels along the specified dimension(s).
squeeze Remove axes of length 1.
std Standard deviation values along an axis.
to_dataframe Convert to DataFrame.
to_dfs Write data to a new dfs file.
to_numpy Values as a np.ndarray (equivalent to values).
to_pandas Convert to Pandas Series.
to_xarray Export to xarray.DataArray.

aggregate

DataArray.aggregate(axis=0, func=np.nanmean, **kwargs)

Aggregate along an axis.

Parameters

Name Type Description Default
axis int | str axis number or “time” or “space”, by default 0 0
func Callable[…, Any] aggregation function applied along the axis, by default np.nanmean np.nanmean
**kwargs Any Additional keyword arguments {}

Returns

Name Type Description
DataArray dataarray with aggregated values

See Also

max : Max values
nanmax : Max values with NaN values removed
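
The reduction that aggregate performs can be illustrated with plain NumPy. This is only a sketch of the semantics on a hypothetical array (time on axis 0, space on axis 1); it does not use mikeio and is not the library's implementation.

```python
import numpy as np

# Hypothetical data: 3 time steps (axis 0) x 2 spatial points (axis 1).
data = np.array([[1.0, 4.0],
                 [np.nan, 6.0],
                 [3.0, 8.0]])

# The default func, np.nanmean, skips the NaN when reducing over time.
time_mean = np.nanmean(data, axis=0)

# Any callable that accepts an axis argument can be substituted, e.g. np.nanmax.
time_max = np.nanmax(data, axis=0)

print(time_mean)  # [2. 6.]
print(time_max)   # [3. 8.]
```

The result has one value per spatial point, because the time axis has been reduced away.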

average

DataArray.average(weights, axis=0, **kwargs)

Compute the weighted average along the specified axis.

Parameters

Name Type Description Default
axis int | str axis number or “time” or “space”, by default 0 0
weights np.ndarray weights to apply to the values required
**kwargs Any Additional keyword arguments {}

Returns

Name Type Description
DataArray DataArray with weighted average values

See Also

aggregate : Aggregate along an axis

Examples

>>> import mikeio
>>> dfs = mikeio.open("HD2D.dfsu")
>>> da = dfs.read(items=["Current speed"])[0]
>>> area = dfs.geometry.get_element_area()
>>> da2 = da.average(axis="space", weights=area)
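
What the area-weighted spatial average computes can be sketched with np.average on made-up numbers. The values and areas below are hypothetical, and the sketch assumes the usual definition sum(w * v) / sum(w); it does not call mikeio.

```python
import numpy as np

# Hypothetical values: 2 time steps x 3 elements, plus one area per element.
values = np.array([[1.0, 2.0, 3.0],
                   [2.0, 4.0, 6.0]])
area = np.array([1.0, 1.0, 2.0])

# Weighted average over the spatial axis: sum(w * v) / sum(w) per time step.
weighted = np.average(values, axis=1, weights=area)

# The same quantity written out explicitly.
manual = (values * area).sum(axis=1) / area.sum()

print(weighted)  # one value per time step: 2.25 and 4.5
```

The third element counts twice as much as the others because its area is twice as large.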

concat

DataArray.concat(dataarrays, keep='last')

Concatenate DataArrays along the time axis.

Parameters

Name Type Description Default
dataarrays Sequence['DataArray'] DataArrays to concatenate required
keep Literal['last', 'first'] default: last 'last'

Returns

Name Type Description
DataArray The concatenated DataArray

Examples

da1 = mikeio.read("../data/HD2D.dfsu", time=[0,1])[0]
da2 = mikeio.read("../data/HD2D.dfsu", time=[2,3])[0]
da1.time
DatetimeIndex(['1985-08-06 07:00:00', '1985-08-06 09:30:00'], dtype='datetime64[ns]', freq=None)
da3 = mikeio.DataArray.concat([da1,da2])
da3
<mikeio.DataArray>
name: Surface elevation
dims: (time:4, element:884)
time: 1985-08-06 07:00:00 - 1985-08-06 14:30:00 (4 records)
geometry: Dfsu2D (884 elements, 529 nodes)

copy

DataArray.copy()

Make copy of DataArray.

describe

DataArray.describe(percentiles=None, include=None, exclude=None)

Generate descriptive statistics by wrapping pandas.DataFrame.describe.

Parameters

Name Type Description Default
percentiles list-like of numbers The percentiles to include in the output. All should fall between 0 and 1. None
include 'all', list-like of dtypes or None (default) A white list of data types to include in the result. None
exclude list-like of dtypes or None (default) A black list of data types to omit from the result. None

Returns

Name Type Description
pd.DataFrame

dropna

DataArray.dropna()

Remove time steps where values are NaN.

extract_track

DataArray.extract_track(track, method='nearest', dtype=np.float32)

Extract data along a moving track.

Parameters

Name Type Description Default
track pd.DataFrame | str DataFrame with DatetimeIndex and (x, y) of track points as first two columns, or filename of a csv or dfs0 file containing t,x,y. The x,y coordinates must be in the same coordinate system as the dfsu. required
method Literal['nearest', 'inverse_distance'] Spatial interpolation method (‘nearest’ or ‘inverse_distance’) default=‘nearest’ 'nearest'
dtype Any Data type of the output data, default=np.float32 np.float32

Returns

Name Type Description
Dataset A dataset with data dimension t. The first two items will be the x- and y-coordinates of the track.

flipud

DataArray.flipud()

Flip upside down (on first non-time axis).

interp

DataArray.interp(
    time=None,
    x=None,
    y=None,
    z=None,
    n_nearest=3,
    interpolant=None,
    **kwargs,
)

Interpolate data in time and space.

This method currently has limited functionality for spatial interpolation. It will be extended in the future.

The spatial parameters available depend on the geometry of the DataArray:

  • Grid1D: x
  • Grid2D: x, y
  • Grid3D: [not yet implemented!]
  • GeometryFM: (x,y)
  • GeometryFMLayered: (x,y) [surface point will be returned!]

Parameters

Name Type Description Default
time (float, pd.DatetimeIndex or DataArray) timestep in seconds or discrete time instances given by pd.DatetimeIndex (typically from another DataArray da2.time), by default None (=don’t interp in time) None
x float x-coordinate of point to be interpolated to, by default None None
y float y-coordinate of point to be interpolated to, by default None None
z float z-coordinate of point to be interpolated to, by default None None
n_nearest int When using IDW interpolation, how many nearest points should be used, by default: 3 3
interpolant tuple Precomputed interpolant, by default None None
**kwargs Any Additional keyword arguments to be passed to the interpolation {}

Returns

Name Type Description
DataArray new DataArray with interpolated data

See Also

sel : Select data using label indexing
interp_like : Interp to another time/space of another DataArray
interp_time : Interp in the time direction only

Examples

>>> da = mikeio.read("random.dfs1")[0]
>>> da.interp(time=3600)
>>> da.interp(x=110)
>>> da = mikeio.read("HD2D.dfsu").Salinity
>>> da.interp(x=340000, y=6160000)
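
The role of n_nearest in inverse-distance-weighted (IDW) interpolation can be sketched as follows. The helper idw_interpolate, the point coordinates and the distance exponent p=2 are all assumptions made for illustration; the weighting actually used by mikeio may differ.

```python
import numpy as np

def idw_interpolate(xy, values, target, n_nearest=3, p=2):
    """Inverse-distance-weighted estimate at target (hypothetical helper)."""
    dist = np.linalg.norm(xy - target, axis=1)
    nearest = np.argsort(dist)[:n_nearest]   # indices of the closest points
    d = dist[nearest]
    if d[0] == 0.0:
        # Target coincides with a data point: return that value directly.
        return float(values[nearest[0]])
    w = 1.0 / d**p                           # closer points get larger weights
    return float(np.sum(w * values[nearest]) / np.sum(w))

# Three points at equal distance from the target and one far away.
xy = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [10.0, 10.0]])
values = np.array([1.0, 2.0, 3.0, 100.0])

estimate = idw_interpolate(xy, values, target=np.array([1.0, 1.0]))
print(estimate)  # approximately 2.0: the far point is not among the 3 nearest
```

With n_nearest=3 the distant point is ignored, so the estimate is the plain mean of the three equidistant neighbours.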

interp_like

DataArray.interp_like(other, interpolant=None, **kwargs)

Interpolate in space (and in time) to other geometry (and time axis).

Note: currently only supports interpolation from dfsu-2d to dfs2 or other dfsu-2d DataArrays

Parameters

Name Type Description Default
other 'DataArray' | Grid2D | GeometryFM2D | pd.DatetimeIndex The target geometry (and time axis) to interpolate to required
interpolant tuple[Any, Any] | None Reuse pre-calculated index and weights None
**kwargs Any additional kwargs are passed to interpolation method {}

Examples

>>> dai = da.interp_like(da2)
>>> dae = da.interp_like(da2, extrapolate=True)
>>> dat = da.interp_like(da2.time)

Returns

Name Type Description
DataArray Interpolated DataArray

interp_na

DataArray.interp_na(axis='time', **kwargs)

Fill in NaNs by interpolating according to different methods.

Wrapper of xarray.DataArray.interpolate_na

Examples

import numpy as np
import pandas as pd
time = pd.date_range("2000", periods=3, freq="D")
da = mikeio.DataArray(data=np.array([0.0, np.nan, 2.0]), time=time)
da
<mikeio.DataArray>
name: NoName
dims: (time:3)
time: 2000-01-01 00:00:00 - 2000-01-03 00:00:00 (3 records)
geometry: GeometryUndefined()
values: [0, nan, 2]
da.interp_na()
<mikeio.DataArray>
name: NoName
dims: (time:3)
time: 2000-01-01 00:00:00 - 2000-01-03 00:00:00 (3 records)
geometry: GeometryUndefined()
values: [0, 1, 2]
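
The linear fill shown above can be reproduced with NumPy alone. This sketch assumes an equidistant time axis and linear interpolation; it is not the xarray-based implementation that interp_na wraps.

```python
import numpy as np

values = np.array([0.0, np.nan, 2.0])
t = np.arange(values.size, dtype=float)  # equidistant time axis (assumed)

missing = np.isnan(values)
filled = values.copy()

# Interpolate the missing positions from the valid neighbours.
filled[missing] = np.interp(t[missing], t[~missing], values[~missing])

print(filled)  # [0. 1. 2.]
```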

interp_time

DataArray.interp_time(
    dt,
    *,
    method='linear',
    extrapolate=True,
    fill_value=np.nan,
)

Temporal interpolation.

Wrapper of scipy.interpolate.interp1d

Parameters

Name Type Description Default
dt float | pd.DatetimeIndex | 'DataArray' output timestep in seconds or new time axis required
method str Specifies the kind of interpolation as a string (‘linear’, ‘nearest’, ‘zero’, ‘slinear’, ‘quadratic’, ‘cubic’, ‘previous’, ‘next’, where ‘zero’, ‘slinear’, ‘quadratic’ and ‘cubic’ refer to a spline interpolation of zeroth, first, second or third order; ‘previous’ and ‘next’ simply return the previous or next value of the point) or as an integer specifying the order of the spline interpolator to use. Default is ‘linear’. 'linear'
extrapolate bool Default True. If False, a ValueError is raised any time interpolation is attempted on a value outside of the original time range (where extrapolation is necessary). If True, out of bounds values are assigned fill_value. True
fill_value float Default NaN. this value will be used to fill in for points outside of the time range. np.nan

Returns

Name Type Description
DataArray
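
As a rough illustration of resampling to a new timestep, the sketch below uses np.interp instead of scipy.interpolate.interp1d. The series is hypothetical and only linear interpolation is shown; points outside the original time range receive the fill value NaN.

```python
import numpy as np

# Hypothetical series sampled every 3600 s.
t_old = np.array([0.0, 3600.0, 7200.0])
v_old = np.array([10.0, 20.0, 40.0])

# New equidistant time axis with dt = 1800 s covering the same period.
dt = 1800.0
n_steps = int((t_old[-1] - t_old[0]) / dt) + 1
t_new = t_old[0] + dt * np.arange(n_steps)

# Linear interpolation; left/right set the fill value outside the time range.
v_new = np.interp(t_new, t_old, v_old, left=np.nan, right=np.nan)
print(v_new)  # [10. 15. 20. 30. 40.]

# A time after the last record falls outside the range and gets the fill value.
outside = np.interp(np.array([9000.0]), t_old, v_old, left=np.nan, right=np.nan)
print(outside)  # [nan]
```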

isel

DataArray.isel(idx=None, axis=0, **kwargs)

Return a new DataArray whose data is given by integer indexing along the specified dimension(s).

Note that the data will be a view of the original data if possible (single index or slice), otherwise a copy (fancy indexing) following NumPy convention.
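
The NumPy convention referred to here can be demonstrated on a bare array. The sketch shows NumPy behaviour only; whether a particular isel call returns a view depends on the "if possible" condition above.

```python
import numpy as np

data = np.arange(12.0).reshape(3, 4)

sliced = data[:, 1:3]    # slice: a view on the original buffer
fancy = data[:, [1, 2]]  # fancy (list) indexing: an independent copy

print(np.shares_memory(data, sliced))  # True
print(np.shares_memory(data, fancy))   # False

# Writing through the view changes the original array; the copy is unaffected.
sliced[0, 0] = -1.0
print(data[0, 1])   # -1.0
print(fancy[0, 0])  # 1.0
```

If the selection must not alias the original data, take an explicit copy before modifying it.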

The spatial parameters available depend on the dims (i.e. geometry) of the DataArray:

  • Grid1D: x
  • Grid2D: x, y
  • Grid3D: x, y, z
  • GeometryFM: element

Parameters

Name Type Description Default
idx int | Sequence[int] | slice | None Index, or indices, along the specified dimension(s) None
axis int | str axis number or “time”, by default 0 0
time int time index, by default None None
x int x index, by default None None
y int y index, by default None None
z int z index, by default None None
element int element index, by default None None
**kwargs Any Not used {}

Returns

Name Type Description
DataArray new DataArray with selected data

See Also

dims : Get axis names
sel : Select data using labels

Examples

da = mikeio.read("../data/europe_wind_long_lat.dfs2")[0]
da
<mikeio.DataArray>
name: Mean Sea Level Pressure
dims: (time:1, y:101, x:221)
time: 2012-01-01 00:00:00 (time-invariant)
geometry: Grid2D (ny=101, nx=221)
da.isel(time=-1)
<mikeio.DataArray>
name: Mean Sea Level Pressure
dims: (y:101, x:221)
time: 2012-01-01 00:00:00 (time-invariant)
geometry: Grid2D (ny=101, nx=221)
da.isel(x=slice(10,20), y=slice(40,60))
<mikeio.DataArray>
name: Mean Sea Level Pressure
dims: (time:1, y:20, x:10)
time: 2012-01-01 00:00:00 (time-invariant)
geometry: Grid2D (ny=20, nx=10)
da = mikeio.read("../data/oresund_sigma_z.dfsu").Temperature
da.isel(element=range(200))
<mikeio.DataArray>
name: Temperature
dims: (time:3, element:200)
time: 1997-09-15 21:00:00 - 1997-09-16 03:00:00 (3 records)
geometry: Flexible Mesh Geometry: Dfsu3DSigmaZ
number of nodes: 638
number of elements: 200
number of layers: 6
number of sigma layers: 4
projection: UTM-33

max

DataArray.max(axis=0, **kwargs)

Max value along an axis.

Parameters

Name Type Description Default
axis int | str axis number or “time” or “space”, by default 0 0
**kwargs Any Additional keyword arguments {}

Returns

Name Type Description
DataArray array with max values

See Also

nanmax : Max values with NaN values removed

mean

DataArray.mean(axis=0, **kwargs)

Mean value along an axis.

Parameters

Name Type Description Default
axis int | str axis number or “time” or “space”, by default 0 0
**kwargs Any Additional keyword arguments {}

Returns

Name Type Description
DataArray array with mean values

See Also

nanmean : Mean values with NaN values removed

min

DataArray.min(axis=0, **kwargs)

Min value along an axis.

Parameters

Name Type Description Default
axis int | str axis number or “time” or “space”, by default 0 0
**kwargs Any Additional keyword arguments {}

Returns

Name Type Description
DataArray array with min values

See Also

nanmin : Min values with NaN values removed

nanmax

DataArray.nanmax(axis=0, **kwargs)

Max value along an axis (NaN removed).

Parameters

Name Type Description Default
axis int | str axis number or “time” or “space”, by default 0 0
**kwargs Any Additional keyword arguments {}

Returns

Name Type Description
DataArray array with max values

See Also

max : Max values

nanmean

DataArray.nanmean(axis=0, **kwargs)

Mean value along an axis (NaN removed).

Parameters

Name Type Description Default
axis int | str axis number or “time” or “space”, by default 0 0
**kwargs Any Additional keyword arguments {}

Returns

Name Type Description
DataArray array with mean values

See Also

mean : Mean values

nanmin

DataArray.nanmin(axis=0, **kwargs)

Min value along an axis (NaN removed).

Parameters

Name Type Description Default
axis int | str axis number or “time” or “space”, by default 0 0
**kwargs Any Additional keyword arguments {}

Returns

Name Type Description
DataArray array with min values

See Also

min : Min values

nanquantile

DataArray.nanquantile(q, *, axis=0, **kwargs)

Compute the q-th quantile of the data along the specified axis, while ignoring nan values.

Wrapping np.nanquantile

Parameters

Name Type Description Default
q float | Sequence[float] Quantile or sequence of quantiles to compute, which must be between 0 and 1 inclusive. required
axis int | str axis number or “time” or “space”, by default 0 0
**kwargs Any Additional keyword arguments {}

Returns

Name Type Description
DataArray data with quantile values

Examples

>>> da.nanquantile(q=[0.25,0.75])
>>> da.nanquantile(q=0.5)
>>> da.nanquantile(q=[0.01,0.5,0.99], axis="space")

See Also

quantile : Quantile with NaN values
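
The difference between the two quantile variants can be seen directly on the NumPy functions they wrap. The array below is hypothetical, with time on axis 0.

```python
import numpy as np

# Hypothetical data: 3 time steps x 2 points, one NaN in the second column.
data = np.array([[1.0, 10.0],
                 [2.0, np.nan],
                 [3.0, 30.0]])

# np.quantile propagates the NaN into the result for the affected column.
q_plain = np.quantile(data, 0.5, axis=0)

# np.nanquantile ignores the NaN and uses the remaining values (10 and 30).
q_nan = np.nanquantile(data, 0.5, axis=0)

print(q_plain)  # first value 2.0, second value nan
print(q_nan)    # first value 2.0, second value 20.0
```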

nanstd

DataArray.nanstd(axis=0, **kwargs)

Standard deviation value along an axis (NaN removed).

Parameters

Name Type Description Default
axis int | str axis number or “time” or “space”, by default 0 0
**kwargs Any Additional keyword arguments {}

Returns

Name Type Description
DataArray array with standard deviation values

See Also

std : Standard deviation

ptp

DataArray.ptp(axis=0, **kwargs)

Range (max - min), a.k.a. peak to peak, along an axis.

Parameters

Name Type Description Default
axis int | str axis number or “time” or “space”, by default 0 0
**kwargs Any Additional keyword arguments {}

Returns

Name Type Description
DataArray array with peak to peak values
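
Peak to peak is simply the maximum minus the minimum along the chosen axis. A minimal NumPy sketch on hypothetical data:

```python
import numpy as np

# Hypothetical data: 3 time steps (axis 0) x 2 points (axis 1).
data = np.array([[1.0, 5.0],
                 [4.0, 2.0],
                 [2.0, 9.0]])

# Range over time, written out and via np.ptp.
manual = data.max(axis=0) - data.min(axis=0)
peak_to_peak = np.ptp(data, axis=0)

print(peak_to_peak)  # [3. 7.]
```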

quantile

DataArray.quantile(q, *, axis=0, **kwargs)

Compute the q-th quantile of the data along the specified axis.

Wrapping np.quantile

Parameters

Name Type Description Default
q float | Sequence[float] Quantile or sequence of quantiles to compute, which must be between 0 and 1 inclusive. required
axis int | str axis number or “time” or “space”, by default 0 0
**kwargs Any Additional keyword arguments {}

Returns

Name Type Description
DataArray data with quantile values

Examples

>>> da.quantile(q=[0.25,0.75])
>>> da.quantile(q=0.5)
>>> da.quantile(q=[0.01,0.5,0.99], axis="space")

See Also

nanquantile : quantile with NaN values ignored

sel

DataArray.sel(time=None, **kwargs)

Return a new DataArray whose data is given by selecting index labels along the specified dimension(s).

In contrast to DataArray.isel, indexers for this method should use labels instead of integers.

The spatial parameters available depend on the geometry of the DataArray:

  • Grid1D: x
  • Grid2D: x, y, coords, area
  • Grid3D: [not yet implemented! use isel instead]
  • GeometryFM: (x,y), coords, area
  • GeometryFMLayered: (x,y,z), coords, area, layers

Parameters

Name Type Description Default
time (str, pd.DatetimeIndex, DataArray) time labels e.g. “2018-01” or slice(“2018-1-1”,“2019-1-1”), by default None None
x float x-coordinate of point to be selected, by default None None
y float y-coordinate of point to be selected, by default None None
z float z-coordinate of point to be selected, by default None None
coords np.array(float, float) As an alternative to specifying x, y and z individually, the argument coords can be used instead. (x,y)- or (x,y,z)-coordinates of point to be selected, by default None None
area (float, float, float, float) Bounding box of coordinates (left lower and right upper) to be selected, by default None None
layers int or str or list layer(s) to be selected: “top”, “bottom” or layer number from bottom 0,1,2,… or from the top -1,-2,… or as list of these; only for layered dfsu, by default None None
**kwargs Any Additional keyword arguments {}

Returns

Name Type Description
DataArray new DataArray with selected data

See Also

isel : Select data using integer indexing
interp : Interp data in time and space
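
The area argument is a bounding box given as lower-left and upper-right corners. The sketch below shows one way such a selection could be expressed on element-centre coordinates; the coordinates are hypothetical and the inclusive bounds are an assumption, not a statement about how mikeio treats points on the boundary.

```python
import numpy as np

# Hypothetical element-centre coordinates (x, y).
xy = np.array([[0.0, 0.0],
               [1.0, 1.0],
               [2.0, 3.0],
               [5.0, 5.0]])

# Bounding box as (x0, y0, x1, y1): lower-left and upper-right corners.
area = (0.5, 0.5, 3.0, 3.0)
x0, y0, x1, y1 = area

# Boolean mask of points inside the box (bounds assumed inclusive).
inside = (
    (xy[:, 0] >= x0) & (xy[:, 0] <= x1)
    & (xy[:, 1] >= y0) & (xy[:, 1] <= y1)
)
selected = np.flatnonzero(inside)

print(selected)  # [1 2]
```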

Examples

da = mikeio.read("../data/random.dfs1")[0]
da
<mikeio.DataArray>
name: testing water level
dims: (time:100, x:3)
time: 2012-01-01 00:00:00 - 2012-01-01 00:19:48 (100 records)
geometry: Grid1D (n=3, dx=100)
da.sel(time=slice(None, "2012-1-1 00:02"))
<mikeio.DataArray>
name: testing water level
dims: (time:15, x:3)
time: 2012-01-01 00:00:00 - 2012-01-01 00:02:48 (15 records)
geometry: Grid1D (n=3, dx=100)
da.sel(x=100)
<mikeio.DataArray>
name: testing water level
dims: (time:100)
time: 2012-01-01 00:00:00 - 2012-01-01 00:19:48 (100 records)
geometry: GeometryUndefined()
values: [0.3231, 0.6315, ..., 0.7506]
da = mikeio.read("../data/oresund_sigma_z.dfsu").Temperature
da
<mikeio.DataArray>
name: Temperature
dims: (time:3, element:17118)
time: 1997-09-15 21:00:00 - 1997-09-16 03:00:00 (3 records)
geometry: Flexible Mesh Geometry: Dfsu3DSigmaZ
number of nodes: 12042
number of elements: 17118
number of layers: 9
number of sigma layers: 4
projection: UTM-33
da.sel(time="1997-09-15")
<mikeio.DataArray>
name: Temperature
dims: (element:17118)
time: 1997-09-15 21:00:00 (time-invariant)
geometry: Flexible Mesh Geometry: Dfsu3DSigmaZ
number of nodes: 12042
number of elements: 17118
number of layers: 9
number of sigma layers: 4
projection: UTM-33
values: [16.31, 16.43, ..., 16.69]
da.sel(x=340000, y=6160000, z=-3)
<mikeio.DataArray>
name: Temperature
dims: (time:3)
time: 1997-09-15 21:00:00 - 1997-09-16 03:00:00 (3 records)
geometry: GeometryPoint3D(x=340028.1116933554, y=6159980.070243686, z=-3.0)
values: [17.54, 17.31, 17.08]
da.sel(layers="bottom")
<mikeio.DataArray>
name: Temperature
dims: (time:3, element:3700)
time: 1997-09-15 21:00:00 - 1997-09-16 03:00:00 (3 records)
geometry: Dfsu2D (3700 elements, 2090 nodes)

squeeze

DataArray.squeeze()

Remove axes of length 1.

Returns

Name Type Description
DataArray

std

DataArray.std(axis=0, **kwargs)

Standard deviation values along an axis.

Parameters

Name Type Description Default
axis int | str axis number or “time” or “space”, by default 0 0
**kwargs Any Additional keyword arguments {}

Returns

Name Type Description
DataArray array with standard deviation values

See Also

nanstd : Standard deviation values with NaN values removed

to_dataframe

DataArray.to_dataframe(unit_in_name=False, round_time='ms')

Convert to DataFrame.

Parameters

Name Type Description Default
unit_in_name bool include unit in column name, default False False
round_time str | bool round time to, by default “ms”, use False to avoid rounding 'ms'

Returns

Name Type Description
pd.DataFrame

to_dfs

DataArray.to_dfs(filename, **kwargs)

Write data to a new dfs file.

Parameters

Name Type Description Default
filename str | Path full path to the new dfs file required
dtype Dfs0 only: set the dfs data type of the written data to e.g. np.float64, by default: DfsSimpleType.Float (=np.float32) required
**kwargs Any Additional keyword arguments, e.g. dtype for dfs0 {}

to_numpy

DataArray.to_numpy()

Values as a np.ndarray (equivalent to values).

to_pandas

DataArray.to_pandas()

Convert to Pandas Series.

Returns

Name Type Description
pd.Series

to_xarray

DataArray.to_xarray()

Export to xarray.DataArray.