ComparerCollection

ComparerCollection(self, comparers)

Collection of comparers.

The ComparerCollection is one of the main objects of the modelskill package. It is a collection of Comparer objects and is created in one of three ways: by the match() function, by passing a list of Comparers to the ComparerCollection constructor, or by reading a config file using the from_config() function.

NOTE: If multiple model results have different time coverage, only the overlapping time period (the intersection) is used!
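The intersection rule in the note can be illustrated with a minimal, stdlib-only sketch. It is a conceptual illustration under stated assumptions, not modelskill's implementation: the helper name overlapping_period and the example dates are hypothetical.

```python
from datetime import datetime


def overlapping_period(ranges):
    """Return the (start, end) intersection of several (start, end) ranges.

    Returns None if the ranges have no common period.
    """
    start = max(r[0] for r in ranges)  # latest start among all ranges
    end = min(r[1] for r in ranges)    # earliest end among all ranges
    return (start, end) if start <= end else None


# Hypothetical time coverage of two model results
model_a = (datetime(2017, 10, 26), datetime(2017, 10, 29))
model_b = (datetime(2017, 10, 27), datetime(2017, 10, 31))

period = overlapping_period([model_a, model_b])
print(period)
```

In this sketch only 27 to 29 October is common to both ranges, so data outside that window would not enter the comparison.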

Main functionality:

Parameters

Name       Type              Description        Default
comparers  list of Comparer  list of comparers  required

Examples

>>> import modelskill as ms
>>> mr = ms.DfsuModelResult("Oresund2D.dfsu", item=0)
>>> o1 = ms.PointObservation("klagshamn.dfs0", item=0, x=366844, y=6154291, name="Klagshamn")
>>> o2 = ms.PointObservation("drogden.dfs0", item=0, x=355568.0, y=6156863.0)
>>> cmp1 = ms.match(o1, mr)  # Comparer
>>> cmp2 = ms.match(o2, mr)  # Comparer
>>> ccA = ms.ComparerCollection([cmp1, cmp2])
>>> ccB = ms.match(obs=[o1, o2], mod=mr)
>>> sk = ccB.skill()
>>> ccB["Klagshamn"].plot.timeseries()

Methods

Name           Description
skill          Aggregated skill assessment of model(s)
mean_skill     Weighted mean of skills
gridded_skill  Skill assessment of model(s) on a regular spatial grid.
score          Weighted mean score of model(s) over all observations
rename         Rename observation, model or auxiliary data variables
sel            Select data based on model, time and/or area.
save           Save the ComparerCollection to a zip file.
load           Load a ComparerCollection from a zip file.

skill

ComparerCollection.skill(by=None, metrics=None, observed=False)

Aggregated skill assessment of model(s)

Parameters

Name      Type              Description                                                                      Default
by        str or List[str]  group by, by default ["model", "observation"]                                    None
    - by column name
    - by temporal bin of the DateTimeIndex via the freq-argument (using pandas pd.Grouper(freq)), e.g. 'freq:M' = monthly; 'freq:D' = daily
    - by the dt accessor of the DateTimeIndex (e.g. 'dt.month') using the syntax 'dt:month'. The dt-argument is different from the freq-argument in that it gives month-of-year rather than month-of-data.
    - by attributes, stored in the cc.data.attrs container, e.g. 'attrs:obs_provider' = group by observation provider or 'attrs:gtype' = group by geometry type (track or point)
metrics   list              list of modelskill.metrics (or str), by default modelskill.options.metrics.list  None
observed  bool              This only applies if any of the groupers are Categoricals.                       False
    - True: only show observed values for categorical groupers.
    - False: show all values for categorical groupers.
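The difference between grouping by month-of-data (the freq-argument) and month-of-year (the dt-argument) can be sketched with the standard library alone. This is only an analogy for the two grouping modes, not how modelskill or pandas implement them, and the timestamps are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical timestamps spanning two years
times = [
    datetime(2017, 1, 15),
    datetime(2017, 2, 10),
    datetime(2018, 1, 20),
    datetime(2018, 1, 25),
]

# Analogous to 'freq:M': one bin per calendar month present in the data
month_of_data = Counter((t.year, t.month) for t in times)

# Analogous to 'dt:month': one bin per month of the year, pooled across years
month_of_year = Counter(t.month for t in times)

print(dict(month_of_data))  # {(2017, 1): 1, (2017, 2): 1, (2018, 1): 2}
print(dict(month_of_year))  # {1: 3, 2: 1}
```

January 2017 and January 2018 are separate bins in the first grouping but fall into the same bin in the second.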

Returns

Name  Type        Description
      SkillTable  skill assessment as a SkillTable object

See also

sel a method for filtering/selecting data

Examples

>>> import modelskill as ms
>>> cc = ms.match([HKNA,EPL,c2], mr)
>>> cc.skill().round(2)
               n  bias  rmse  urmse   mae    cc    si    r2
observation
HKNA         385 -0.20  0.35   0.29  0.25  0.97  0.09  0.99
EPL           66 -0.08  0.22   0.20  0.18  0.97  0.07  0.99
c2           113 -0.00  0.35   0.35  0.29  0.97  0.12  0.99
>>> cc.sel(observation='c2', start='2017-10-28').skill().round(2)
               n  bias  rmse  urmse   mae    cc    si    r2
observation
c2            41  0.33  0.41   0.25  0.36  0.96  0.06  0.99
>>> cc.skill(by='freq:D').round(2)
              n  bias  rmse  urmse   mae    cc    si    r2
2017-10-27  239 -0.15  0.25   0.21  0.20  0.72  0.10  0.98
2017-10-28  162 -0.07  0.19   0.18  0.16  0.96  0.06  1.00
2017-10-29  163 -0.21  0.52   0.47  0.42  0.79  0.11  0.99

mean_skill

ComparerCollection.mean_skill(weights=None, metrics=None, **kwargs)

Weighted mean of skills

First, the skill is calculated per observation; then the weighted mean of the skills is found.

Warning: This method does NOT compute the skill over all observation points pooled together! For that, see mean_skill_points.

Parameters

Name     Type                                    Description                                                             Default
weights  str or List(float) or Dict(str, float)  weighting of observations, by default None                              None
    - None: use observations weight attribute (if assigned, else "equal")
    - "equal": giving all observations equal weight
    - "points": giving all points equal weight
    - list of weights e.g. [0.3, 0.3, 0.4] per observation
    - dictionary of observations with special weights, others will be set to 1.0
metrics  list                                    list of modelskill.metrics, by default modelskill.options.metrics.list  None
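How the weighting options change the result, and why a weighted mean of skills differs from pooling all points, can be seen in a small stdlib-only sketch. It assumes that "points" weights each observation by its number of points; the helper names and the error values are hypothetical, and this is not modelskill's implementation.

```python
import math


def rmse(errors):
    """Root-mean-square of a list of model-minus-observation errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def weighted_mean(values, weights):
    """Weighted mean of per-observation values, keyed by observation name."""
    total = sum(weights[k] for k in values)
    return sum(values[k] * weights[k] for k in values) / total


# Hypothetical model-minus-observation errors for two observations
errors = {
    "obs_a": [1.0, -1.0, 1.0, -1.0],  # 4 points, rmse = 1.0
    "obs_b": [3.0],                   # 1 point,  rmse = 3.0
}

# Step 1: skill per observation
per_obs = {name: rmse(e) for name, e in errors.items()}

# Step 2a: "equal" gives every observation the same weight
equal = weighted_mean(per_obs, {k: 1.0 for k in per_obs})

# Step 2b: "points" is assumed here to weight by the number of points
points = weighted_mean(per_obs, {k: len(e) for k, e in errors.items()})

# Pooling all points before computing the metric gives a third number
pooled = rmse([e for es in errors.values() for e in es])

print(equal, points, round(pooled, 3))
```

The three numbers differ (2.0, 1.4 and about 1.612), which is why the choice of weights and the distinction from a pooled assessment matter.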

Returns

Name  Type        Description
      SkillTable  mean skill assessment as a SkillTable object

See also

skill              skill assessment per observation
mean_skill_points  skill assessment pooling all observation points together

Examples

>>> import modelskill as ms
>>> cc = ms.match([HKNA,EPL,c2], mod=HKZN_local)
>>> cc.mean_skill().round(2)
              n  bias  rmse  urmse   mae    cc    si    r2
HKZN_local  564 -0.09  0.31   0.28  0.24  0.97  0.09  0.99
>>> sk = cc.mean_skill(weights="equal")
>>> sk = cc.mean_skill(weights="points")
>>> sk = cc.mean_skill(weights={"EPL": 2.0}) # more weight on EPL, others=1.0

gridded_skill

ComparerCollection.gridded_skill(
    bins=5,
    binsize=None,
    by=None,
    metrics=None,
    n_min=None,
    **kwargs,
)

Skill assessment of model(s) on a regular spatial grid.

Parameters

Name     Type              Description                                                             Default
bins     int               criteria to bin x and y by (argument bins to pd.cut()), default 5       5
    To define different bins for x and y, pass a tuple, e.g. bins = 5 or bins = (5, [2, 3, 5])
binsize  float             bin size for x and y dimension, overrides bins                          None
    creates bins with reference to round(mean(x)), round(mean(y))
by       (str, List[str])  group by, by default ["model", "observation"]                           None
    - by column name
    - by temporal bin of the DateTimeIndex via the freq-argument (using pandas pd.Grouper(freq)), e.g. 'freq:M' = monthly; 'freq:D' = daily
    - by the dt accessor of the DateTimeIndex (e.g. 'dt.month') using the syntax 'dt:month'. The dt-argument is different from the freq-argument in that it gives month-of-year rather than month-of-data.
metrics  list              list of modelskill.metrics, by default modelskill.options.metrics.list  None
n_min    int               minimum number of observations in a grid cell                           None
    cells with fewer observations get a score of np.nan
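The idea of binning points on a regular grid and masking sparsely populated cells can be sketched as follows. This is a simplified, stdlib-only illustration: the cells here are anchored at zero with a fixed bin size rather than built with pd.cut() or referenced to round(mean(x)), round(mean(y)), and the function name gridded_bias and the sample points are hypothetical.

```python
import math
from collections import defaultdict


def gridded_bias(points, binsize, n_min=None):
    """Mean error per grid cell.

    points: iterable of (x, y, error) tuples.
    Cells are keyed by their lower-left corner.
    Cells with fewer than n_min points get NaN.
    """
    cells = defaultdict(list)
    for x, y, err in points:
        key = (
            math.floor(x / binsize) * binsize,
            math.floor(y / binsize) * binsize,
        )
        cells[key].append(err)

    result = {}
    for key, errs in cells.items():
        if n_min is not None and len(errs) < n_min:
            result[key] = float("nan")  # too few points in this cell
        else:
            result[key] = sum(errs) / len(errs)
    return result


# Hypothetical (x, y, model-minus-observation) samples
samples = [(0.2, 50.1, 0.1), (0.8, 50.9, 0.3), (1.5, 50.5, -0.2)]

grid = gridded_bias(samples, binsize=1.0, n_min=2)
print(grid)
```

The cell at (0.0, 50.0) holds two points and gets their mean error, while the cell at (1.0, 50.0) holds only one point and is set to NaN because n_min=2.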

Returns

Name  Type       Description
      SkillGrid  skill assessment as a SkillGrid object

See also

skill a method for aggregated skill assessment

Examples

>>> import modelskill as ms
>>> cc = ms.match([HKNA,EPL,c2], mr)  # with satellite track measurements
>>> gs = cc.gridded_skill(metrics='bias')
>>> gs.data
<xarray.Dataset>
Dimensions:      (x: 5, y: 5)
Coordinates:
    observation   'alti'
* x            (x) float64 -0.436 1.543 3.517 5.492 7.466
* y            (y) float64 50.6 51.66 52.7 53.75 54.8
Data variables:
    n            (x, y) int32 3 0 0 14 37 17 50 36 72 ... 0 0 15 20 0 0 0 28 76
    bias         (x, y) float64 -0.02626 nan nan ... nan 0.06785 -0.1143
>>> gs = cc.gridded_skill(binsize=0.5)
>>> gs.data.coords
Coordinates:
    observation   'alti'
* x            (x) float64 -1.5 -0.5 0.5 1.5 2.5 3.5 4.5 5.5 6.5 7.5
* y            (y) float64 51.5 52.5 53.5 54.5 55.5 56.5

score

ComparerCollection.score(metric=mtr.rmse, weights=None, **kwargs)

Weighted mean score of model(s) over all observations

Wraps mean_skill() with a single metric.

NOTE: If the collection contains different quantities, a simple mean is taken across them!

Parameters

Name     Type                                    Description                                               Default
weights  str or List(float) or Dict(str, float)  weighting of observations, by default None                None
    - None: use observations weight attribute (if assigned, else "equal")
    - "equal": giving all observations equal weight
    - "points": giving all points equal weight
    - list of weights e.g. [0.3, 0.3, 0.4] per observation
    - dictionary of observations with special weights, others will be set to 1.0
metric   str or callable                         a single metric from modelskill.metrics, by default rmse  mtr.rmse

Returns

Name  Type              Description
      Dict[str, float]  mean of skills score as a single number (for each model)

See also

skill              skill assessment per observation
mean_skill         weighted mean of skills assessment
mean_skill_points  skill assessment pooling all observation points together

Examples

>>> import modelskill as ms
>>> cc = ms.match([o1, o2, o3], mod)  # three observations, matching the three weights used below
>>> cc.score()
{'mod': 0.30681206}
>>> cc.score(weights=[0.1,0.1,0.8])
{'mod': 0.3383011631797379}
>>> cc.score(weights='points', metric="mape")
{'mod': 8.414442957854142}

rename

ComparerCollection.rename(mapping)

Rename observation, model or auxiliary data variables

Parameters

Name     Type  Description                        Default
mapping  dict  mapping of old names to new names  required

Returns

Name  Type                Description
      ComparerCollection

Examples

>>> import modelskill as ms
>>> cc = ms.match([o1, o2], [mr1, mr2])
>>> cc.mod_names
['mr1', 'mr2']
>>> cc2 = cc.rename({'mr1': 'model1'})
>>> cc2.mod_names
['model1', 'mr2']

sel

ComparerCollection.sel(
    model=None,
    observation=None,
    quantity=None,
    start=None,
    end=None,
    time=None,
    area=None,
    **kwargs,
)

Select data based on model, time and/or area.

Parameters

Name         Type                                      Description                                                            Default
model        str or int or list of str or list of int  Model name or index. If None, all models are selected.                 None
observation  str or int or list of str or list of int  Observation name or index. If None, all observations are selected.     None
quantity     str or int or list of str or list of int  Quantity name or index. If None, all quantities are selected.          None
start        str or datetime                           Start time. If None, all times are selected.                           None
end          str or datetime                           End time. If None, all times are selected.                             None
time         str or datetime                           Time. If None, all times are selected.                                 None
area         list of float                             bbox: [x0, y0, x1, y1] or Polygon. If None, all areas are selected.    None
**kwargs     Any                                       Filtering by comparer attrs similar to xarray.Dataset.filter_by_attrs  {}
    e.g. sel(gtype='track') or sel(obs_provider='CMEMS') if at least one comparer has an entry obs_provider with value CMEMS in its attrs container.
    Multiple kwargs are combined with logical AND.
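The logical-AND behaviour of attribute filters can be sketched with plain dictionaries. This is a conceptual, stdlib-only illustration, not modelskill's code: the function filter_by_attrs below is a hypothetical stand-in, and the comparer names and attrs are made up.

```python
def filter_by_attrs(items, **kwargs):
    """Keep items whose attrs match every key/value pair in kwargs (logical AND)."""
    return [
        item
        for item in items
        if all(item["attrs"].get(key) == value for key, value in kwargs.items())
    ]


# Hypothetical comparers, reduced to a name and an attrs dict
comparers = [
    {"name": "p1", "attrs": {"gtype": "point", "obs_provider": "CMEMS"}},
    {"name": "p2", "attrs": {"gtype": "point"}},
    {"name": "t1", "attrs": {"gtype": "track", "obs_provider": "CMEMS"}},
]

tracks = filter_by_attrs(comparers, gtype="track")
cmems_points = filter_by_attrs(comparers, gtype="point", obs_provider="CMEMS")

print([c["name"] for c in tracks])        # ['t1']
print([c["name"] for c in cmems_points])  # ['p1']
```

A comparer that lacks one of the requested attributes (p2 has no obs_provider) is excluded as soon as that attribute is part of the filter.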

Returns

Name  Type                Description
      ComparerCollection  New ComparerCollection with selected data.

save

ComparerCollection.save(filename)

Save the ComparerCollection to a zip file.

Each comparer is stored as a netcdf file in the zip file.
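The "one file per comparer inside a zip archive" layout can be illustrated with the standard zipfile module. This is a sketch of the container idea only: the member names are hypothetical (the actual naming inside the archive is not stated here), the payloads are placeholder bytes rather than real netCDF data, and an in-memory buffer stands in for the file on disk.

```python
import io
import zipfile

# Hypothetical serialized comparers; placeholder bytes stand in for netCDF content
members = {"Klagshamn.nc": b"placeholder-1", "drogden.nc": b"placeholder-2"}

# In-memory stand-in for a file such as "my_comparer_collection.msk"
buffer = io.BytesIO()

# Saving: write one archive member per comparer
with zipfile.ZipFile(buffer, mode="w") as zf:
    for name, payload in members.items():
        zf.writestr(name, payload)

# Loading: list the members and read each payload back
with zipfile.ZipFile(buffer) as zf:
    names = zf.namelist()
    restored = {name: zf.read(name) for name in names}

print(names)
```

Because the archive is an ordinary zip file, its contents can be inspected with any zip tool regardless of the file extension used.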

Parameters

Name      Type         Description                Default
filename  str or Path  Filename of the zip file.  required

Examples

>>> import modelskill as ms
>>> cc = ms.match(obs, mod)
>>> cc.save("my_comparer_collection.msk")

load

ComparerCollection.load(filename)

Load a ComparerCollection from a zip file.

Parameters

Name      Type         Description                Default
filename  str or Path  Filename of the zip file.  required

Returns

Name  Type                Description
      ComparerCollection  The loaded ComparerCollection.

Examples

>>> import modelskill as ms
>>> cc = ms.match(obs, mod)
>>> cc.save("my_comparer_collection.msk")
>>> cc2 = ms.ComparerCollection.load("my_comparer_collection.msk")