Comparer

Comparer(self, matched_data, raw_mod_data=None)

Comparer class for comparing model and observation data.

The Comparer class is the main class of the ModelSkill package. It is returned by match(), from_matched() or as an element in a ComparerCollection. It holds the matched observation and model data for a single observation and has methods for plotting and skill assessment.

Its main functionality is listed under Methods below.

Parameters

Name Type Description Default
matched_data xr.Dataset Matched data required
raw_mod_data dict of modelskill.TimeSeries Raw model data. If None, observation and modeldata must be provided. None

Examples

>>> import modelskill as ms
>>> cmp1 = ms.match(observation, modeldata)
>>> cmp2 = ms.from_matched(matched_data)

See also

modelskill.match, modelskill.from_matched

Methods

Name Description
skill Skill assessment of model(s)
gridded_skill Aggregated spatial skill assessment of model(s) on a regular spatial grid.
score Model skill score
rename Rename observation, model or auxiliary data variables
sel Select data based on model, time and/or area.
where Return a new Comparer with values where cond is True
query Return a new Comparer with values where query cond is True
to_dataframe Convert matched data to pandas DataFrame
save Save to netcdf file
load Load from netcdf file

skill

Comparer.skill(by=None, metrics=None)

Skill assessment of model(s)

Parameters

Name Type Description Default
by str or List[str] Group by, by default ["model"]. Options: (1) by column name; (2) by temporal bin of the DateTimeIndex via the freq-argument (using pandas pd.Grouper(freq)), e.g. 'freq:M' = monthly, 'freq:D' = daily; (3) by the dt accessor of the DateTimeIndex (e.g. 'dt.month') using the syntax 'dt:month'. The dt-argument differs from the freq-argument in that it gives month-of-year rather than month-of-data. None
metrics list list of modelskill.metrics, by default modelskill.options.metrics.list None

Returns

Name Type Description
SkillTable skill assessment object

See also

sel a method for filtering/selecting data

Examples

>>> import modelskill as ms
>>> cc = ms.match(c2, mod)
>>> cc['c2'].skill().round(2)
               n  bias  rmse  urmse   mae    cc    si    r2
observation
c2           113 -0.00  0.35   0.35  0.29  0.97  0.12  0.99
>>> cc['c2'].skill(by='freq:D').round(2)
             n  bias  rmse  urmse   mae    cc    si    r2
2017-10-27  72 -0.19  0.31   0.25  0.26  0.48  0.12  0.98
2017-10-28   0   NaN   NaN    NaN   NaN   NaN   NaN   NaN
2017-10-29  41  0.33  0.41   0.25  0.36  0.96  0.06  0.99
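
The difference between the freq and dt styles of `by` can be illustrated with plain pandas. This is a minimal sketch and not modelskill's internal code; the DataFrame, its `residual` column and the month-start frequency "MS" are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical data: one residual per month over two years (24 values)
idx = pd.date_range("2017-01-01", periods=24, freq="MS")
df = pd.DataFrame({"residual": [float(i) for i in range(24)]}, index=idx)

# Temporal bins (the 'freq:' style): one group per month present in the data
by_freq = df.groupby(pd.Grouper(freq="MS"))["residual"].mean()

# dt accessor (the 'dt:month' style): one group per month of the year,
# so January 2017 and January 2018 end up in the same group
by_month = df.groupby(df.index.month)["residual"].mean()

print(len(by_freq))   # number of month-of-data groups
print(len(by_month))  # number of month-of-year groups
```

With two years of data, the freq-style grouping keeps the years apart while the dt-style grouping pools them, which is the month-of-data versus month-of-year distinction described for `by`.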

gridded_skill

Comparer.gridded_skill(
    bins=5,
    binsize=None,
    by=None,
    metrics=None,
    n_min=None,
    **kwargs,
)

Aggregated spatial skill assessment of model(s) on a regular spatial grid.

Parameters

Name Type Description Default
bins int Criteria to bin x and y by; passed as the bins argument to pd.cut(). To define different bins for x and y, pass a tuple, e.g. bins = 5 or bins = (5, [2, 3, 5]). 5
binsize float Bin size for the x and y dimensions; overrides bins. Bins are created with reference to round(mean(x)), round(mean(y)). None
by (str, List[str]) Group by column name or by temporal bin via the freq-argument (using pandas pd.Grouper(freq)), e.g. 'freq:M' = monthly, 'freq:D' = daily. By default ["model", "observation"]. None
metrics list list of modelskill.metrics, by default modelskill.options.metrics.list None
n_min int minimum number of observations in a grid cell; cells with fewer observations get a score of np.nan None

Returns

Name Type Description
SkillGrid skill assessment as a SkillGrid object

See also

skill a method for aggregated skill assessment

Examples

>>> import modelskill as ms
>>> cmp = ms.match(c2, mod)   # satellite altimeter vs. model
>>> cmp.gridded_skill(metrics='bias')
<xarray.Dataset>
Dimensions:      (x: 5, y: 5)
Coordinates:
    observation   'alti'
* x            (x) float64 -0.436 1.543 3.517 5.492 7.466
* y            (y) float64 50.6 51.66 52.7 53.75 54.8
Data variables:
    n            (x, y) int32 3 0 0 14 37 17 50 36 72 ... 0 0 15 20 0 0 0 28 76
    bias         (x, y) float64 -0.02626 nan nan ... nan 0.06785 -0.1143
>>> gs = cmp.gridded_skill(binsize=0.5)
>>> gs.data.coords
Coordinates:
    observation   'alti'
* x            (x) float64 -1.5 -0.5 0.5 1.5 2.5 3.5 4.5 5.5 6.5 7.5
* y            (y) float64 51.5 52.5 53.5 54.5 55.5 56.5

score

Comparer.score(metric=mtr.rmse, **kwargs)

Model skill score

Parameters

Name Type Description Default
metric str or callable A single metric from modelskill.metrics, given by name or as a function, by default rmse mtr.rmse

Returns

Name Type Description
dict[str, float] skill score as a single number (for each model)

See also

skill a method for skill assessment returning a SkillTable

Examples

>>> import modelskill as ms
>>> cmp = ms.match(c2, mod)
>>> cmp.score()
{'mod': 0.3517964910888918}
>>> cmp.score(metric="mape")
{'mod': 11.567399646108198}
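
To make the shape of the return value concrete, the sketch below computes one number per model using the textbook root-mean-square error. The helper `rmse`, the model names and the data are hypothetical, and modelskill's own metric functions may accept further options not shown here.

```python
import math

def rmse(obs, mod):
    """Textbook root-mean-square error of paired values."""
    squared = [(m - o) ** 2 for o, m in zip(obs, mod)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical matched data: one observation series, two models
obs = [1.0, 2.0, 3.0, 4.0]
models = {
    "mod_a": [1.0, 2.0, 3.0, 6.0],
    "mod_b": [1.5, 2.5, 3.5, 4.5],
}

# One score per model, keyed by model name
scores = {name: rmse(obs, values) for name, values in models.items()}
print(scores)
```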

rename

Comparer.rename(mapping, errors='raise')

Rename observation, model or auxiliary data variables

Parameters

Name Type Description Default
mapping dict mapping of old names to new names required
errors ('raise', 'ignore') If ‘raise’, raise a KeyError if any of the old names do not exist in the data. By default ‘raise’. 'raise'

Returns

Name Type Description
Comparer

Examples

>>> cmp = ms.match(observation, modeldata)
>>> cmp.mod_names
['model1']
>>> cmp2 = cmp.rename({'model1': 'model2'})
>>> cmp2.mod_names
['model2']

sel

Comparer.sel(model=None, start=None, end=None, time=None, area=None)

Select data based on model, time and/or area.

Parameters

Name Type Description Default
model str or int or list of str or list of int Model name or index. If None, all models are selected. None
start str or datetime Start time. If None, all times are selected. None
end str or datetime End time. If None, all times are selected. None
time str or datetime Time. If None, all times are selected. None
area list of float bbox: [x0, y0, x1, y1] or Polygon. If None, all areas are selected. None

Returns

Name Type Description
Comparer New Comparer with selected data.
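
The combination of a time window and a bounding box can be sketched in plain pandas. This only illustrates the idea and is not how modelskill implements `sel`; the data and column names are hypothetical, and treating the window and box boundaries as inclusive is an assumption.

```python
import pandas as pd

# Hypothetical track data: position and value at daily timestamps
idx = pd.date_range("2017-10-27", periods=5, freq="D")
df = pd.DataFrame(
    {
        "x": [0.0, 2.0, 4.0, 6.0, 8.0],
        "y": [50.0, 51.0, 52.0, 53.0, 54.0],
        "value": [1.1, 1.3, 0.9, 1.0, 1.2],
    },
    index=idx,
)

start, end = "2017-10-28", "2017-10-30"
x0, y0, x1, y1 = 1.0, 50.5, 5.0, 52.5  # bbox: [x0, y0, x1, y1]

# Time window first (label slicing on a DatetimeIndex includes both ends)
in_time = df.loc[start:end]

# Then keep the points that fall inside the bounding box
in_box = in_time["x"].between(x0, x1) & in_time["y"].between(y0, y1)
selected = in_time[in_box]
print(selected)
```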

where

Comparer.where(cond)

Return a new Comparer with values where cond is True

Parameters

Name Type Description Default
cond (bool, np.ndarray, xr.DataArray) This selects the values to return. required

Returns

Name Type Description
Comparer New Comparer with values where cond is True.

Examples

>>> c2 = c.where(c.data.Observation > 0)

query

Comparer.query(query)

Return a new Comparer with values where query cond is True

Parameters

Name Type Description Default
query str Query string, see pandas.DataFrame.query required

Returns

Name Type Description
Comparer New Comparer with values where the query condition is True.

Examples

>>> c2 = c.query("Observation > 0")
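
Since the query string follows pandas.DataFrame.query, its semantics can be checked on an ordinary DataFrame. The sketch below uses hypothetical data and column names and shows that a query string and the equivalent boolean mask select the same rows.

```python
import pandas as pd

# Hypothetical matched data with one observation and one model column
df = pd.DataFrame(
    {
        "Observation": [-0.5, 0.0, 1.2, 2.4],
        "model1": [-0.4, 0.1, 1.0, 2.5],
    }
)

# Selection with a query string
by_query = df.query("Observation > 0")

# The same selection with a boolean mask
by_mask = df[df["Observation"] > 0]

print(by_query.equals(by_mask))
```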

to_dataframe

Comparer.to_dataframe()

Convert matched data to pandas DataFrame

The x and y coordinates are included only if gtype is track.

Returns

Name Type Description
pd.DataFrame data as a pandas DataFrame

save

Comparer.save(filename)

Save to netcdf file

Parameters

Name Type Description Default
filename str or Path Path of the netcdf file to write required

load

Comparer.load(filename)

Load from netcdf file

Parameters

Name Type Description Default
filename str or Path Path of the netcdf file to read required

Returns

Name Type Description
Comparer