Dfs0

See Dfs0 in MIKE IO Documentation

import pandas as pd
import mikeio

Reading data

ds = mikeio.read("data/TemporalEqTime.dfs0")
ds
<mikeio.Dataset>
dims: (time:10)
time: 1970-01-01 00:00:03 - 1970-01-01 00:01:33 (10 records)
geometry: GeometryUndefined()
items:
  0:  WaterLevel item <Water Level> (meter)
  1:  WaterDepth item <Water Depth> (meter)
type(ds)
mikeio.dataset._dataset.Dataset

The MIKE IO Dataset is used by all Dfs classes (Dfs0, Dfs1, Dfs2, Dfs3, Dfsu). A simple time series dataset can easily be converted to a Pandas DataFrame.

df = ds.to_dataframe() # convert dataset to dataframe
df
                     WaterLevel item  WaterDepth item
1970-01-01 00:00:03              0.0            100.0
1970-01-01 00:00:13              1.0            101.0
1970-01-01 00:00:23              2.0            102.0
1970-01-01 00:00:33              3.0            103.0
1970-01-01 00:00:43              4.0            104.0
1970-01-01 00:00:53              5.0            105.0
1970-01-01 00:01:03             10.0            110.0
1970-01-01 00:01:13             11.0            111.0
1970-01-01 00:01:23             12.0            112.0
1970-01-01 00:01:33             13.0            113.0

Writing data

df = pd.read_csv("data/naples_fl.csv", skiprows=1, parse_dates=True, index_col=0)
df
            TAVG (Degrees Fahrenheit)  TMAX (Degrees Fahrenheit)  TMIN (Degrees Fahrenheit)  PRCP (Inches)  SNOW (Inches)  SNWD (Inches)
Date
2002-03-01                       67.0                       78.0                       56.0           0.00            NaN            NaN
2002-03-02                       76.0                       83.0                       69.0           0.00            NaN            NaN
2002-03-03                       78.0                       84.0                       71.0           0.00            NaN            NaN
2002-03-04                       64.0                       76.0                       51.0           0.08            NaN            NaN
2002-03-05                       58.0                       70.0                       45.0           0.00            NaN            NaN
...                               ...                        ...                        ...            ...            ...            ...
2021-08-11                        NaN                       93.0                       77.0           0.23            NaN            NaN
2021-08-12                        NaN                       94.0                       77.0           0.00            0.0            0.0
2021-08-13                        NaN                       95.0                       77.0           0.03            0.0            0.0
2021-08-14                        NaN                       85.0                       74.0           0.05            0.0            0.0
2021-08-15                        NaN                       83.0                       75.0           0.01            0.0            0.0

7108 rows × 6 columns

Importing mikeio adds a to_dfs0 method to Pandas dataframes, so a dataframe can be written to a Dfs0 file like this.

df.to_dfs0("raw.dfs0")

You will probably need to parse a specific data format many times, so it is a good idea to create a function for it.

def read_ncei_obs(filename):
    """Parse Meteo observations from NCEI"""
    
    sel_cols = ['temperature_avg_f','temperature_max_f','temperature_min_f', 'prec_in']
    df = (
        pd.read_csv(filename, skiprows=1, parse_dates=True, index_col=0)
           .rename(columns={'TAVG (Degrees Fahrenheit)': 'temperature_avg_f',
                            'TMAX (Degrees Fahrenheit)': 'temperature_max_f',
                            'TMIN (Degrees Fahrenheit)': 'temperature_min_f',
                            'PRCP (Inches)': 'prec_in'})
    )[sel_cols]
    return df
df = read_ncei_obs("data/naples_fl.csv")
df.head()
            temperature_avg_f  temperature_max_f  temperature_min_f  prec_in
Date
2002-03-01               67.0               78.0               56.0     0.00
2002-03-02               76.0               83.0               69.0     0.00
2002-03-03               78.0               84.0               71.0     0.00
2002-03-04               64.0               76.0               51.0     0.08
2002-03-05               58.0               70.0               45.0     0.00
df.tail()
            temperature_avg_f  temperature_max_f  temperature_min_f  prec_in
Date
2021-08-11                NaN               93.0               77.0     0.23
2021-08-12                NaN               94.0               77.0     0.00
2021-08-13                NaN               95.0               77.0     0.03
2021-08-14                NaN               85.0               74.0     0.05
2021-08-15                NaN               83.0               75.0     0.01
df.shape
(7108, 4)
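
The parsing pattern used in read_ncei_obs can be tried without the data file. The sketch below feeds a small in-memory CSV to pd.read_csv. The title line and the sample rows are made up for illustration and only mimic the assumed layout of data/naples_fl.csv (one line to skip, then a header row).

```python
import io

import pandas as pd

# Hypothetical sample: one title line (skipped), then a header row and two records.
sample_csv = io.StringIO(
    "Some title line that is skipped\n"
    "Date,TMAX (Degrees Fahrenheit),PRCP (Inches),SNOW (Inches)\n"
    "2002-03-01,78.0,0.00,\n"
    "2002-03-02,83.0,0.08,\n"
)

sample = (
    pd.read_csv(sample_csv, skiprows=1, parse_dates=True, index_col=0)
    .rename(columns={"TMAX (Degrees Fahrenheit)": "temperature_max_f",
                     "PRCP (Inches)": "prec_in"})
)[["temperature_max_f", "prec_in"]]  # keep only the renamed columns
```

Passing the file (or buffer) as an argument, rather than hard-coding the path inside the function, is what makes this kind of quick check possible.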
df['temperature_max_c'] = (df['temperature_max_f'] - 32)/1.8
df['prec_mm'] = df['prec_in'] * 25.4
df.head()
            temperature_avg_f  temperature_max_f  temperature_min_f  prec_in  temperature_max_c  prec_mm
Date
2002-03-01               67.0               78.0               56.0     0.00          25.555556    0.000
2002-03-02               76.0               83.0               69.0     0.00          28.333333    0.000
2002-03-03               78.0               84.0               71.0     0.00          28.888889    0.000
2002-03-04               64.0               76.0               51.0     0.08          24.444444    2.032
2002-03-05               58.0               70.0               45.0     0.00          21.111111    0.000
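
The two unit conversions above are simple enough to wrap in small helper functions, which makes them easy to check against known values. The function names below are illustrative and not part of MIKE IO or Pandas.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) / 1.8


def inches_to_mm(length_in):
    """Convert inches to millimetres (1 inch = 25.4 mm)."""
    return length_in * 25.4


# Reference points: water freezes at 32 °F and boils at 212 °F.
freezing_c = fahrenheit_to_celsius(32)
boiling_c = fahrenheit_to_celsius(212)
rain_mm = inches_to_mm(0.08)
```

Because the functions use only arithmetic, they can also be applied to a whole Pandas column, e.g. fahrenheit_to_celsius(df['temperature_max_f']).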
df.loc['2021'].plot()
<Axes: xlabel='Date'>
[Figure: line plot of the dataframe columns for 2021, with Date on the x-axis]

The simplest way to create a dfs0 file is to use the to_dfs0 method on a Pandas dataframe.

df.to_dfs0("output/naples_fl.dfs0")

Let’s read it back in again…

saved_ds = mikeio.read("output/naples_fl.dfs0")
saved_ds
<mikeio.Dataset>
dims: (time:7108)
time: 2002-03-01 00:00:00 - 2021-08-15 00:00:00 (7108 records)
geometry: GeometryUndefined()
items:
  0:  temperature_avg_f <Undefined> (undefined)
  1:  temperature_max_f <Undefined> (undefined)
  2:  temperature_min_f <Undefined> (undefined)
  3:  prec_in <Undefined> (undefined)
  4:  temperature_max_c <Undefined> (undefined)
  5:  prec_mm <Undefined> (undefined)

By default, the EUM type and unit of each item are undefined, but they can be specified when writing the file.

df2 = df[['temperature_max_c', 'prec_in']]
df2.head()
            temperature_max_c  prec_in
Date
2002-03-01          25.555556     0.00
2002-03-02          28.333333     0.00
2002-03-03          28.888889     0.00
2002-03-04          24.444444     0.08
2002-03-05          21.111111     0.00
from mikeio import ItemInfo, EUMType, EUMUnit

df2.to_dfs0("output/naples_fl_eum.dfs0",
            items=[
                   ItemInfo(EUMType.Temperature),
                   ItemInfo(EUMType.Precipitation_Rate, EUMUnit.inch_per_day)]
           )
mikeio.read("output/naples_fl_eum.dfs0")
<mikeio.Dataset>
dims: (time:7108)
time: 2002-03-01 00:00:00 - 2021-08-15 00:00:00 (7108 records)
geometry: GeometryUndefined()
items:
  0:  Temperature <Temperature> (degree Celsius)
  1:  Precipitation Rate <Precipitation Rate> (inch per day)

EUM

from mikeio.eum import ItemInfo, EUMType, EUMUnit

EUMType.search("wind")
[Wind Velocity,
 Wind Direction,
 Wind friction factor,
 Wind speed,
 Depth of Wind,
 Wind friction speed]
EUMType.Wind_speed.units
[meter per sec, feet per sec, knot, km per hour, miles per hour]

Inline Exercise

What is the best EUM Type for “peak wave direction”? What is the default unit?

# insert your code here

Precipitation data

df = pd.read_csv("data/precipitation.csv", parse_dates=True, index_col=0)
df.head()
Precipitation station 1 Precipitation station 2 Precipitation station 3 Precipitation station 4 Precipitation station 5 Precipitation station 6 Precipitation station 7 Precipitation station 8 Precipitation station 9
date
2001-01-01 0.0 0.000 0.021 0.071 0.000 0.000 0.025 0.025 0.000
2001-01-02 0.0 0.025 0.037 0.000 0.004 0.054 0.042 0.021 0.054
2001-01-03 0.0 0.000 0.000 0.000 0.000 0.000 0.000 0.042 0.000
2001-01-04 0.0 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
2001-01-05 0.0 0.000 0.158 0.021 0.000 0.000 0.017 0.021 0.000

A list comprehension is a compact way to build a list from another sequence, equivalent to a for loop that appends one element at a time.

squares = [x**2 for x in range(10)]
squares
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
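
For comparison, the same list can be built with an explicit for loop. A comprehension also accepts an optional condition. The variable names here are only for illustration.

```python
# Equivalent for loop: start with an empty list and append each result.
squares_loop = []
for x in range(10):
    squares_loop.append(x**2)

# The comprehension form builds the same list in one line.
squares = [x**2 for x in range(10)]

# An optional condition filters the input values, here keeping even numbers only.
even_squares = [x**2 for x in range(10) if x % 2 == 0]
```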
from mikecore.DfsFile import DataValueType

items = [ItemInfo(name, EUMType.Precipitation_Rate, EUMUnit.mm_per_hour, data_value_type=DataValueType.MeanStepBackward) for name in df.columns]

items
[Precipitation station 1 <Precipitation Rate> (mm per hour) - 3,
 Precipitation station 2 <Precipitation Rate> (mm per hour) - 3,
 Precipitation station 3 <Precipitation Rate> (mm per hour) - 3,
 Precipitation station 4 <Precipitation Rate> (mm per hour) - 3,
 Precipitation station 5 <Precipitation Rate> (mm per hour) - 3,
 Precipitation station 6 <Precipitation Rate> (mm per hour) - 3,
 Precipitation station 7 <Precipitation Rate> (mm per hour) - 3,
 Precipitation station 8 <Precipitation Rate> (mm per hour) - 3,
 Precipitation station 9 <Precipitation Rate> (mm per hour) - 3]
from string import ascii_uppercase

def create_prec_item(raw_name):
    """Create an item info with a clean short name and the correct EUM type"""
    
    idx = int(raw_name[-1]) - 1
    
    name = (raw_name.replace("Precipitation ","")
                     .replace(" ", "_")
                     .capitalize()
                     .replace(raw_name[-1], ascii_uppercase[idx])
           )
    
    iteminfo = ItemInfo(name, EUMType.Precipitation_Rate, EUMUnit.mm_per_hour, data_value_type=DataValueType.MeanStepBackward)
    return iteminfo
    
create_prec_item("Precipitation station 9")
Station_I <Precipitation Rate> (mm per hour) - 3
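
Note that create_prec_item reads only the last character of the name, so it assumes station numbers 1 to 9. One possible variant of the name cleaning, using a regular expression to read the whole trailing number, is sketched below. The helper clean_station_name is hypothetical, covers stations 1 to 26 (A to Z), and returns only the name. Creating the ItemInfo would stay as in the function above.

```python
import re
from string import ascii_uppercase


def clean_station_name(raw_name):
    """Turn 'Precipitation station 9' into 'Station_I' (hypothetical helper)."""
    match = re.search(r"(\d+)$", raw_name)  # trailing station number, any length
    if match is None:
        raise ValueError(f"No trailing station number in {raw_name!r}")
    number = int(match.group(1))
    if not 1 <= number <= len(ascii_uppercase):
        raise ValueError(f"Station number {number} is outside 1-26")
    return f"Station_{ascii_uppercase[number - 1]}"


names = [clean_station_name(f"Precipitation station {i}") for i in (1, 9, 12)]
```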
items = [create_prec_item(name) for name in df.columns]

items
[Station_A <Precipitation Rate> (mm per hour) - 3,
 Station_B <Precipitation Rate> (mm per hour) - 3,
 Station_C <Precipitation Rate> (mm per hour) - 3,
 Station_D <Precipitation Rate> (mm per hour) - 3,
 Station_E <Precipitation Rate> (mm per hour) - 3,
 Station_F <Precipitation Rate> (mm per hour) - 3,
 Station_G <Precipitation Rate> (mm per hour) - 3,
 Station_H <Precipitation Rate> (mm per hour) - 3,
 Station_I <Precipitation Rate> (mm per hour) - 3]
items[0].data_value_type
<DataValueType.MeanStepBackward: 3>
df.to_dfs0("output/precipitation.dfs0", items=items)

Selecting items

ds = mikeio.read("output/precipitation.dfs0", items=[1,4]) # select item by item number (starting from zero)
ds
<mikeio.Dataset>
dims: (time:31)
time: 2001-01-01 00:00:00 - 2001-01-31 00:00:00 (31 records)
geometry: GeometryUndefined()
items:
  0:  Station_B <Precipitation Rate> (mm per hour) - 3
  1:  Station_E <Precipitation Rate> (mm per hour) - 3
ds = mikeio.read("output/precipitation.dfs0", items=["Station_E","Station_B"]) # or by name (in the order you like it)
ds
<mikeio.Dataset>
dims: (time:31)
time: 2001-01-01 00:00:00 - 2001-01-31 00:00:00 (31 records)
geometry: GeometryUndefined()
items:
  0:  Station_E <Precipitation Rate> (mm per hour) - 3
  1:  Station_B <Precipitation Rate> (mm per hour) - 3

Inline Exercise

Read all items into a variable ds, then select “Station_C”. In how many different ways can you select this item?

# insert your code here
import utils

utils.sysinfo()
    System: 3.11.9 (main, Apr  2 2024, 15:19:53) [GCC 11.4.0]
    NumPy: 1.26.4
    Pandas: 2.2.2
    MIKE IO: 1.7.1
    Last modified: 2024-04-11 17:10:46.829627