
ComparerCollection(self, comparers)

Collection of comparers.

The ComparerCollection is one of the main objects of the modelskill package. It is a collection of Comparer objects and is created in one of three ways: by the match() function, by passing a list of Comparers to the ComparerCollection constructor, or by reading a config file with the from_config() function.

NOTE: If multiple model results have different time coverage, only the time period common to all of them (their intersection) is used!
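The intersection rule in the note can be illustrated with a minimal, library-independent sketch. The model names and dates below are hypothetical, and the helper `overlapping_period` is introduced here for illustration only; it is not part of modelskill.

```python
from datetime import datetime

# Hypothetical (start, end) time coverage of two model results.
coverage = {
    "model_a": (datetime(2017, 10, 26), datetime(2017, 10, 30)),
    "model_b": (datetime(2017, 10, 27), datetime(2017, 11, 2)),
}


def overlapping_period(periods):
    """Return the (start, end) common to all periods, or None if they do not overlap."""
    periods = list(periods)
    start = max(s for s, _ in periods)  # latest start
    end = min(e for _, e in periods)    # earliest end
    return (start, end) if start <= end else None


common = overlapping_period(coverage.values())
print(common)
```

Here only 27 to 30 October 2017 is covered by both hypothetical models, so only that window would contribute to a comparison.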

Main functionality:


Parameters

comparers : list of Comparer, required
    List of comparers.


>>> import modelskill as ms
>>> mr = ms.DfsuModelResult("Oresund2D.dfsu", item=0)
>>> o1 = ms.PointObservation("klagshamn.dfs0", item=0, x=366844, y=6154291, name="Klagshamn")
>>> o2 = ms.PointObservation("drogden.dfs0", item=0, x=355568.0, y=6156863.0)
>>> cmp1 = ms.match(o1, mr)  # Comparer
>>> cmp2 = ms.match(o2, mr)  # Comparer
>>> ccA = ms.ComparerCollection([cmp1, cmp2])
>>> ccB = ms.match(obs=[o1, o2], mod=mr)
>>> sk = ccB.skill()
>>> ccB["Klagshamn"].plot.timeseries()


Methods

skill            Aggregated skill assessment of model(s).
mean_skill       Weighted mean of skills.
gridded_skill    Skill assessment of model(s) on a regular spatial grid.
score            Weighted mean score of model(s) over all observations.
rename           Rename observation, model or auxiliary data variables.
sel              Select data based on model, time and/or area.
query            Select data based on a query.
save             Save the ComparerCollection to a zip file.
load             Load a ComparerCollection from a zip file.


ComparerCollection.skill(by=None, metrics=None, observed=False)

Aggregated skill assessment of model(s)


Parameters

by : str or List[str], default None
    Group by; by default ["model", "observation"].
    - by column name
    - by temporal bin of the DateTimeIndex via the freq-argument (using pandas pd.Grouper(freq)), e.g. 'freq:M' = monthly; 'freq:D' = daily
    - by the dt accessor of the DateTimeIndex (e.g. 'dt.month') using the syntax 'dt:month'. The dt-argument differs from the freq-argument in that it gives month-of-year rather than month-of-data.
    - by attributes stored in the container, e.g. 'attrs:obs_provider' = group by observation provider, or 'attrs:gtype' = group by geometry type (track or point)
metrics : list, default None
    List of modelskill.metrics (or str); by default modelskill.options.metrics.list.
observed : bool, default False
    Only applies if any of the groupers are Categoricals.
    - True: only show observed values for categorical groupers.
    - False: show all values for categorical groupers.
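The difference between grouping by temporal bin (month-of-data) and grouping by the month attribute (month-of-year) can be seen in plain pandas. This is a sketch on hypothetical data, not a description of how modelskill implements the `by` argument; the column name `err` is made up, and the `"MS"` (month start) alias is used here only because it is accepted across pandas versions.

```python
import pandas as pd

# Hypothetical residuals: two Januaries in different years and one February.
df = pd.DataFrame(
    {"err": [0.1, 0.3, 0.5]},
    index=pd.to_datetime(["2017-01-15", "2017-02-15", "2018-01-15"]),
)

# Month-of-data: one bin per calendar month in the record (empty bins dropped).
by_month_of_data = df.groupby(pd.Grouper(freq="MS"))["err"].mean().dropna()

# Month-of-year: January 2017 and January 2018 fall into the same group.
by_month_of_year = df.groupby(df.index.month)["err"].mean()

print(by_month_of_data)
print(by_month_of_year)
```

The first grouping keeps the two Januaries apart, while the second pools them, which is the distinction the `freq:` and `dt:` prefixes express.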


Returns

SkillTable
    Skill assessment as a SkillTable object.

See also

sel : a method for filtering/selecting data


>>> import modelskill as ms
>>> cc = ms.match([HKNA,EPL,c2], mr)
>>> cc.skill().round(2)
               n  bias  rmse  urmse   mae    cc    si    r2
HKNA         385 -0.20  0.35   0.29  0.25  0.97  0.09  0.99
EPL           66 -0.08  0.22   0.20  0.18  0.97  0.07  0.99
c2           113 -0.00  0.35   0.35  0.29  0.97  0.12  0.99
>>> cc.sel(observation='c2', start='2017-10-28').skill().round(2)
               n  bias  rmse  urmse   mae    cc    si    r2
c2            41  0.33  0.41   0.25  0.36  0.96  0.06  0.99
>>> cc.skill(by='freq:D').round(2)
              n  bias  rmse  urmse   mae    cc    si    r2
2017-10-27  239 -0.15  0.25   0.21  0.20  0.72  0.10  0.98
2017-10-28  162 -0.07  0.19   0.18  0.16  0.96  0.06  1.00
2017-10-29  163 -0.21  0.52   0.47  0.42  0.79  0.11  0.99


ComparerCollection.mean_skill(weights=None, metrics=None, **kwargs)

Weighted mean of skills

The skill is first calculated per observation; the weighted mean of these skills is then found.

Warning: This is NOT the skill computed over all observation points pooled together! See mean_skill_points for that.
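Why the two approaches differ can be shown with a small worked example in plain Python. The residuals below are hypothetical and the `rmse` helper is written here for illustration; it is not modelskill's implementation.

```python
import math

# Hypothetical model-minus-observation residuals for two observations.
residuals = {
    "obs_a": [0.1, -0.1, 0.1, -0.1],  # many points, small errors
    "obs_b": [1.0],                   # a single point, large error
}


def rmse(errors):
    """Root mean square error of a list of residuals."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Mean of skills: one RMSE per observation, then an equally weighted average.
per_obs = {name: rmse(errs) for name, errs in residuals.items()}
mean_of_skills = sum(per_obs.values()) / len(per_obs)

# Pooled skill: a single RMSE over all points from all observations.
pooled = rmse([e for errs in residuals.values() for e in errs])

print(round(mean_of_skills, 3), round(pooled, 3))
```

The single-point observation counts for half of the mean of skills but for only one fifth of the pooled points, so the two numbers disagree.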


Parameters

weights : str or List[float] or Dict[str, float], default None
    Weighting of observations.
    - None: use the observations' weight attribute (if assigned, else "equal")
    - "equal": give all observations equal weight
    - "points": give all points equal weight
    - list of weights, one per observation, e.g. [0.3, 0.3, 0.4]
    - dictionary of observations with special weights; others are set to 1.0
metrics : list, default None
    List of modelskill.metrics; by default modelskill.options.metrics.list.
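The effect of the weighting options can be sketched with the per-observation n and rmse values shown in the skill() example earlier on this page. Two assumptions are made here and are not confirmed by the text above: that weights are normalised to sum to one, and that "points" amounts to weighting each observation by its number of points.

```python
# Per-observation values copied from the skill() example earlier on this page.
n = {"HKNA": 385, "EPL": 66, "c2": 113}
rmse = {"HKNA": 0.35, "EPL": 0.22, "c2": 0.35}


def weighted_mean(values, weights):
    """Weighted mean; weights are normalised to sum to one (assumption of this sketch)."""
    total = sum(weights[k] for k in values)
    return sum(values[k] * weights[k] for k in values) / total


# "equal": every observation counts the same.
equal = weighted_mean(rmse, {k: 1.0 for k in rmse})

# "points": assumed here to mean weights proportional to the number of points.
points = weighted_mean(rmse, {k: float(n[k]) for k in rmse})

# Dictionary form: EPL gets weight 2.0, all others default to 1.0.
custom = weighted_mean(rmse, {"HKNA": 1.0, "EPL": 2.0, "c2": 1.0})

print(round(equal, 3), round(points, 3), round(custom, 3))
```

Because EPL has the lowest rmse and the fewest points, weighting by points pulls the mean up, while doubling the EPL weight pulls it down.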


Returns

SkillTable
    Mean skill assessment as a SkillTable object.

See also

skill : skill assessment per observation
mean_skill_points : skill assessment pooling all observation points together


>>> import modelskill as ms
>>> cc = ms.match([HKNA,EPL,c2], mod=HKZN_local)
>>> cc.mean_skill().round(2)
              n  bias  rmse  urmse   mae    cc    si    r2
HKZN_local  564 -0.09  0.31   0.28  0.24  0.97  0.09  0.99
>>> sk = cc.mean_skill(weights="equal")
>>> sk = cc.mean_skill(weights="points")
>>> sk = cc.mean_skill(weights={"EPL": 2.0}) # more weight on EPL, others=1.0



ComparerCollection.gridded_skill

Skill assessment of model(s) on a regular spatial grid.


Parameters

bins : int, default 5
    Criteria to bin x and y by; passed as the bins argument to pd.cut(). To define different bins for x and y, pass a tuple, e.g. bins = 5 or bins = (5, [2, 3, 5]).
binsize : float, default None
    Bin size for the x and y dimensions; overrides bins. Bins are created with reference to round(mean(x)), round(mean(y)).
by : str or List[str], default None
    Group by; by default ["model", "observation"].
    - by column name
    - by temporal bin of the DateTimeIndex via the freq-argument (using pandas pd.Grouper(freq)), e.g. 'freq:M' = monthly; 'freq:D' = daily
    - by the dt accessor of the DateTimeIndex (e.g. 'dt.month') using the syntax 'dt:month'. The dt-argument differs from the freq-argument in that it gives month-of-year rather than month-of-data.
metrics : list, default None
    List of modelskill.metrics; by default modelskill.options.metrics.list.
n_min : int, default None
    Minimum number of observations in a grid cell; cells with fewer observations get a score of np.nan.
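The idea of binning points into grid cells and masking sparse cells can be sketched in plain Python. This is an illustration under simplifying assumptions (equal-width cells with a fixed origin, bias as the only metric, hypothetical data); the helpers `cell_index` and `gridded_bias` are made up for this sketch and do not reproduce how modelskill builds its bins.

```python
import math

# Hypothetical track points: (x, y, model-minus-observation residual).
points = [
    (0.2, 0.1, 0.10), (0.4, 0.3, 0.30), (0.3, 0.2, 0.20),
    (1.6, 0.2, -0.50),
    (1.7, 1.8, 0.05), (1.9, 1.6, 0.15),
]


def cell_index(value, origin, size, n_cells):
    """Index of the equal-width bin containing value, clipped to the grid."""
    return min(max(int((value - origin) // size), 0), n_cells - 1)


def gridded_bias(points, origin=(0.0, 0.0), size=1.0, shape=(2, 2), n_min=1):
    """Return (n, bias) per cell; cells with fewer than n_min points get NaN."""
    nx, ny = shape
    cells = [[[] for _ in range(ny)] for _ in range(nx)]
    for x, y, err in points:
        i = cell_index(x, origin[0], size, nx)
        j = cell_index(y, origin[1], size, ny)
        cells[i][j].append(err)
    n = [[len(c) for c in row] for row in cells]
    bias = [
        [sum(c) / len(c) if len(c) >= max(n_min, 1) else math.nan for c in row]
        for row in cells
    ]
    return n, bias


n, bias = gridded_bias(points, n_min=2)
print(n)
print(bias)
```

With n_min=2, the cell holding a single point is reported as NaN even though a bias could be computed for it, which is the purpose of the threshold.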


Returns

SkillGrid
    Skill assessment as a SkillGrid object.

See also

skill : a method for aggregated skill assessment


>>> import modelskill as ms
>>> cc = ms.match([HKNA,EPL,c2], mr)  # with satellite track measurements
>>> gs = cc.gridded_skill(metrics='bias')
Dimensions:      (x: 5, y: 5)
Coordinates:
    observation   'alti'
  * x            (x) float64 -0.436 1.543 3.517 5.492 7.466
  * y            (y) float64 50.6 51.66 52.7 53.75 54.8
Data variables:
    n            (x, y) int32 3 0 0 14 37 17 50 36 72 ... 0 0 15 20 0 0 0 28 76
    bias         (x, y) float64 -0.02626 nan nan ... nan 0.06785 -0.1143
>>> gs = cc.gridded_skill(binsize=0.5)
Coordinates:
    observation   'alti'
  * x            (x) float64 -1.5 -0.5 0.5 1.5 2.5 3.5 4.5 5.5 6.5 7.5
  * y            (y) float64 51.5 52.5 53.5 54.5 55.5 56.5


ComparerCollection.score(metric=mtr.rmse, weights=None, **kwargs)

Weighted mean score of model(s) over all observations

Wraps mean_skill() with a single metric.

NOTE: If the observations cover different quantities, a simple mean is taken across them!


Parameters

weights : str or List[float] or Dict[str, float], default None
    Weighting of observations.
    - None: use the observations' weight attribute (if assigned, else "equal")
    - "equal": give all observations equal weight
    - "points": give all points equal weight
    - list of weights, one per observation, e.g. [0.3, 0.3, 0.4]
    - dictionary of observations with special weights; others are set to 1.0
metric : str or callable, default mtr.rmse
    A single metric from modelskill.metrics; by default rmse.


Returns

Dict[str, float]
    Mean of the skill scores as a single number for each model.

See also

skill : skill assessment per observation
mean_skill : weighted mean of skills assessment
mean_skill_points : skill assessment pooling all observation points together


>>> import modelskill as ms
>>> cc = ms.match([o1, o2, o3], mod)
>>> cc.score()
{'mod': 0.30681206}
>>> cc.score(weights=[0.1,0.1,0.8])
{'mod': 0.3383011631797379}
>>> cc.score(weights='points', metric="mape")
{'mod': 8.414442957854142}



ComparerCollection.rename

Rename observation, model or auxiliary data variables


Parameters

mapping : dict, required
    Mapping of old names to new names.


Returns

ComparerCollection
    New ComparerCollection with the renamed variables.


>>> cc = ms.match([o1, o2], [mr1, mr2])
>>> cc.mod_names
['mr1', 'mr2']
>>> cc2 = cc.rename({'mr1': 'model1'})
>>> cc2.mod_names
['model1', 'mr2']



ComparerCollection.sel

Select data based on model, time and/or area.


Parameters

model : str or int or list of str or list of int, default None
    Model name or index. If None, all models are selected.
observation : str or int or list of str or list of int, default None
    Observation name or index. If None, all observations are selected.
quantity : str or int or list of str or list of int, default None
    Quantity name or index. If None, all quantities are selected.
start : str or datetime, default None
    Start time. If None, all times are selected.
end : str or datetime, default None
    End time. If None, all times are selected.
time : str or datetime, default None
    Time. If None, all times are selected.
area : list of float, default None
    bbox: [x0, y0, x1, y1] or Polygon. If None, all areas are selected.
**kwargs : Any, default {}
    Filtering by comparer attrs, similar to xarray.Dataset.filter_by_attrs, e.g. sel(gtype='track'), or sel(obs_provider='CMEMS') if at least one comparer has an entry obs_provider with value CMEMS in its attrs container. Multiple kwargs are combined with logical AND.
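The logical-AND behaviour of attribute filtering can be sketched with plain dictionaries. The comparer stand-ins, their attrs values and the helper `filter_by_attrs` below are all hypothetical and only mimic the rule described for **kwargs; they are not modelskill objects.

```python
# Hypothetical stand-ins for comparers: each has a name and an attrs dictionary.
comparers = [
    {"name": "HKNA", "attrs": {"gtype": "point", "obs_provider": "CMEMS"}},
    {"name": "EPL", "attrs": {"gtype": "point"}},
    {"name": "c2", "attrs": {"gtype": "track", "obs_provider": "CMEMS"}},
]


def filter_by_attrs(items, **kwargs):
    """Keep items whose attrs hold every requested key with an equal value (logical AND)."""
    return [
        item
        for item in items
        if all(
            key in item["attrs"] and item["attrs"][key] == value
            for key, value in kwargs.items()
        )
    ]


cmems = [c["name"] for c in filter_by_attrs(comparers, obs_provider="CMEMS")]
cmems_points = [
    c["name"] for c in filter_by_attrs(comparers, obs_provider="CMEMS", gtype="point")
]
print(cmems, cmems_points)
```

An item without the requested key (EPL has no obs_provider here) never matches, and adding a second keyword can only narrow the selection.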


Returns

ComparerCollection
    New ComparerCollection with selected data.



ComparerCollection.query

Select data based on a query.


Parameters

query : str, required
    Query string. See pandas.DataFrame.query() for details.
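Since the query string follows pandas.DataFrame.query(), its syntax can be tried out on an ordinary DataFrame first. The frame and its column names below are hypothetical; which columns are available inside a ComparerCollection is not stated here, so this only illustrates the query syntax itself.

```python
import pandas as pd

# Hypothetical matched data; the column names are illustrative only.
df = pd.DataFrame(
    {
        "Observation": [1.2, 2.5, 3.1, 0.4],
        "model": [1.0, 2.9, 3.0, 0.9],
    }
)

# Keep rows where the observed value exceeds 1.0 and the model overestimates.
selected = df.query("Observation > 1.0 and model > Observation")
print(selected)
```

Conditions are written as a single string using column names, comparison operators and `and`/`or`.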


Returns

ComparerCollection
    New ComparerCollection with selected data.


ComparerCollection.save

Save the ComparerCollection to a zip file.

Each comparer is stored as a NetCDF file inside the zip file.


Parameters

filename : str or Path, required
    Filename of the zip file.


>>> cc = ms.match(obs, mod)
>>> cc.save("my_comparer_collection.msk")
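The "one file per comparer inside a zip archive" layout can be sketched with the standard library alone. The member names and byte payloads below are placeholders rather than real NetCDF content, and the code is an illustration of the container idea, not modelskill's save or load implementation.

```python
import tempfile
import zipfile
from pathlib import Path

# Hypothetical payloads standing in for one serialized file per comparer.
payloads = {
    "Klagshamn.nc": b"placeholder bytes",
    "drogden.nc": b"more placeholder bytes",
}

with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "my_comparer_collection.msk"

    # Save: write one member per comparer into a single zip archive.
    with zipfile.ZipFile(archive, "w") as zf:
        for member, data in payloads.items():
            zf.writestr(member, data)

    # Load: read every member back, keyed by its name.
    with zipfile.ZipFile(archive) as zf:
        restored = {member: zf.read(member) for member in zf.namelist()}

print(sorted(restored))
```

The file extension does not matter to the zip format, which is why an archive named with .msk can still be opened with ordinary zip tools.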



ComparerCollection.load

Load a ComparerCollection from a zip file.


Parameters

filename : str or Path, required
    Filename of the zip file.


Returns

ComparerCollection
    The loaded ComparerCollection.


>>> cc = ms.match(obs, mod)
>>> cc2 = ms.ComparerCollection.load("my_comparer_collection.msk")