Comparer

Comparer(self, matched_data, raw_mod_data=None)

Comparer class for comparing model and observation data.

The Comparer class is the main class of the ModelSkill package. It is returned by match(), from_matched() or as an element in a ComparerCollection. It holds the matched observation and model data for a single observation and has methods for plotting and skill assessment.

Main functionality:

- selecting/filtering data
  - sel()
  - query()
- skill assessment
  - skill()
  - gridded_skill() (for track observations)
- plotting
- load/save/export data

Parameters

Name	Type	Description	Default
matched_data	xr.Dataset	Matched data	required
raw_mod_data	dict of modelskill.PointModelResult	Raw model data. If None, observation and modeldata must be provided.	`None`

Examples

>>> import modelskill as ms
>>> cmp1 = ms.match(observation, modeldata)
>>> cmp2 = ms.from_matched(matched_data)

Attributes

Name	Description
plot	Plot using the `ComparerPlotter`

Methods

Name	Description
skill	Skill assessment of model(s)
gridded_skill	Aggregated spatial skill assessment of model(s) on a regular spatial grid.
score	Model skill score
rename	Rename observation, model or auxiliary data variables
sel	Select data based on model, time and/or area.
where	Return a new Comparer with values where cond is True
query	Return a new Comparer with values where the query condition is True
to_dataframe	Convert matched data to pandas DataFrame
save	Save to netcdf file
load	Load from netcdf file

skill

Comparer.skill(by=None, metrics=None)

Skill assessment of model(s)

Parameters

Name	Type	Description	Default
by	str or List[str]	Group by, by default [“model”]. Each entry can be: a column name; a temporal bin of the DatetimeIndex via the freq-argument (using pandas pd.Grouper(freq)), e.g. ‘freq:M’ = monthly, ‘freq:D’ = daily; or the dt accessor of the DatetimeIndex (e.g. ‘dt.month’) using the syntax ‘dt:month’. The dt-argument differs from the freq-argument in that it gives month-of-year rather than month-of-data.	`None`
metrics	list	list of modelskill.metrics, by default modelskill.options.metrics.list	`None`

Returns

Name	Type	Description
	SkillTable	skill assessment object

Examples

>>> import modelskill as ms
>>> cc = ms.match(c2, mod)
>>> cc['c2'].skill().round(2)
               n  bias  rmse  urmse   mae    cc    si    r2
observation
c2           113 -0.00  0.35   0.35  0.29  0.97  0.12  0.99

>>> cc['c2'].skill(by='freq:D').round(2)
             n  bias  rmse  urmse   mae    cc    si    r2
2017-10-27  72 -0.19  0.31   0.25  0.26  0.48  0.12  0.98
2017-10-28   0   NaN   NaN    NaN   NaN   NaN   NaN   NaN
2017-10-29  41  0.33  0.41   0.25  0.36  0.96  0.06  0.99
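
The difference between the freq and dt groupings of the by argument can be illustrated with plain pandas. This is a minimal sketch of the grouping semantics only, not ModelSkill's implementation; the DataFrame, its `residual` column and the dates are made up for illustration, and "MS" (month start) is used as the monthly frequency alias here.

```python
import pandas as pd

# Hypothetical residuals (model minus observation) spanning two years
idx = pd.to_datetime(["2016-01-10", "2016-02-10", "2017-01-10"])
df = pd.DataFrame({"residual": [0.1, -0.2, 0.3]}, index=idx)

# Like by='freq:...': consecutive time bins via pd.Grouper.
# "MS" bins by calendar month; months without data appear as empty bins.
by_freq = df.groupby(pd.Grouper(freq="MS")).size()

# Like by='dt:month': month-of-year, so both January values share one group
by_dt = df.groupby(df.index.month).size()

print(by_freq[by_freq > 0])  # three separate months of data
print(by_dt)                 # two groups: January (n=2) and February (n=1)
```

The empty bins of the freq grouping correspond to rows with n=0, such as 2017-10-28 in the daily example above.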

gridded_skill

Comparer.gridded_skill(
    bins=5,
    binsize=None,
    by=None,
    metrics=None,
    n_min=None,
    **kwargs,
)

Aggregated spatial skill assessment of model(s) on a regular spatial grid.

Parameters

Name	Type	Description	Default
bins	int	Criteria to bin x and y by; passed as the bins argument to pd.cut(). Default 5. To define different bins for x and y, pass a tuple, e.g. bins = 5 or bins = (5, [2, 3, 5]).	`5`
binsize	float	Bin size for the x and y dimensions; overrides bins. Bins are created with reference to round(mean(x)), round(mean(y)).	`None`
by	str or List[str]	Group by column name or by temporal bin via the freq-argument (using pandas pd.Grouper(freq)), e.g. ‘freq:M’ = monthly, ‘freq:D’ = daily. By default [“model”, “observation”].	`None`
metrics	list	list of modelskill.metrics, by default modelskill.options.metrics.list	`None`
n_min	int	minimum number of observations in a grid cell; cells with fewer observations get a score of `np.nan`	`None`

Returns

Name	Type	Description
	SkillGrid	skill assessment as a SkillGrid object

Examples

>>> import modelskill as ms
>>> cmp = ms.match(c2, mod)   # satellite altimeter vs. model
>>> cmp.gridded_skill(metrics='bias')
<xarray.Dataset>
Dimensions:      (x: 5, y: 5)
Coordinates:
    observation   'alti'
  * x            (x) float64 -0.436 1.543 3.517 5.492 7.466
  * y            (y) float64 50.6 51.66 52.7 53.75 54.8
Data variables:
    n            (x, y) int32 3 0 0 14 37 17 50 36 72 ... 0 0 15 20 0 0 0 28 76
    bias         (x, y) float64 -0.02626 nan nan ... nan 0.06785 -0.1143

>>> gs = cmp.gridded_skill(binsize=0.5)
>>> gs.data.coords
Coordinates:
    observation   'alti'
  * x            (x) float64 -1.5 -0.5 0.5 1.5 2.5 3.5 4.5 5.5 6.5 7.5
  * y            (y) float64 51.5 52.5 53.5 54.5 55.5 56.5
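
The binning described for bins and n_min can be sketched with pd.cut() and a groupby. This is only an illustration of the idea under simple assumptions, not the library's code; the data, the column names `x_bin` and `y_bin`, and the use of the mean residual as bias are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical track data: positions and residuals (model minus observation)
df = pd.DataFrame(
    {
        "x": [0.0, 0.1, 0.2, 1.0],
        "y": [0.0, 0.1, 0.2, 1.0],
        "residual": [0.1, 0.3, 0.2, -0.4],
    }
)

# Bin x and y into a regular 2 x 2 grid, as bins=2 would via pd.cut()
df["x_bin"] = pd.cut(df["x"], bins=2)
df["y_bin"] = pd.cut(df["y"], bins=2)

# Aggregate per cell; observed=False keeps empty cells in the result
grid = df.groupby(["x_bin", "y_bin"], observed=False)["residual"].agg(
    n="count", bias="mean"
)

# Mask cells with fewer than n_min points, mirroring the n_min argument
n_min = 2
grid.loc[grid["n"] < n_min, "bias"] = np.nan
print(grid)
```

Only the cell holding three points keeps a bias value; the single-point cell and the empty cells end up as NaN.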

score

Comparer.score(metric=mtr.rmse, **kwargs)

Model skill score

Parameters

Name	Type	Description	Default
metric	str or callable	A single metric from modelskill.metrics, by default rmse	`mtr.rmse`

Returns

Name	Type	Description
	dict[str, float]	skill score as a single number (for each model)

Examples

>>> import modelskill as ms
>>> cmp = ms.match(c2, mod)
>>> cmp.score()
{'mod': 0.3517964910888918}

>>> cmp.score(metric="mape")
{'mod': 11.567399646108198}
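
To make the return value concrete, the following sketch computes a root-mean-square error per model by hand and collects the results in a dict keyed by model name. It assumes the usual RMSE definition and uses made-up values and model names (`mod_a`, `mod_b`); it does not call ModelSkill.

```python
import math


def rmse(obs, mod):
    """Root-mean-square error between paired observation and model values."""
    return math.sqrt(sum((m - o) ** 2 for o, m in zip(obs, mod)) / len(obs))


# Hypothetical matched values for one observation and two models
obs = [1.0, 2.0, 3.0]
models = {"mod_a": [1.0, 2.0, 5.0], "mod_b": [1.0, 2.0, 3.0]}

# One number per model, in the same shape as the dict shown above
scores = {name: rmse(obs, values) for name, values in models.items()}
print(scores)
```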

rename

Comparer.rename(mapping, errors='raise')

Rename observation, model or auxiliary data variables

Parameters

Name	Type	Description	Default
mapping	dict	mapping of old names to new names	required
errors	('raise', 'ignore')	If ‘raise’, raise a KeyError if any of the old names do not exist in the data. By default ‘raise’.	`'raise'`

Returns

Name	Type	Description
	Comparer

Examples

>>> cmp = ms.match(observation, modeldata)
>>> cmp.mod_names
['model1']
>>> cmp2 = cmp.rename({'model1': 'model2'})
>>> cmp2.mod_names
['model2']
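
The effect of the errors argument can be sketched on a plain dict. The helper `rename_keys` below is hypothetical and only mimics the documented behaviour (KeyError for unknown old names unless errors='ignore'); it is not part of ModelSkill.

```python
def rename_keys(data, mapping, errors="raise"):
    """Return a copy of data with keys renamed according to mapping."""
    missing = [old for old in mapping if old not in data]
    if missing and errors == "raise":
        raise KeyError(f"Names not found: {missing}")
    # Unknown old names are silently skipped when errors='ignore'
    return {mapping.get(key, key): value for key, value in data.items()}


data = {"Observation": [1.0, 2.0], "model1": [1.1, 1.9]}

renamed = rename_keys(data, {"model1": "model2"})
ignored = rename_keys(data, {"nonexistent": "x"}, errors="ignore")

print(list(renamed))  # keys after renaming
print(list(ignored))  # unchanged keys
```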

sel

Comparer.sel(model=None, start=None, end=None, time=None, area=None)

Select data based on model, time and/or area.

Parameters

Name	Type	Description	Default
model	str or int or list of str or list of int	Model name or index. If None, all models are selected.	`None`
start	str or datetime	Start time. If None, all times are selected.	`None`
end	str or datetime	End time. If None, all times are selected.	`None`
time	str or datetime	Time. If None, all times are selected.	`None`
area	list of float	bbox: [x0, y0, x1, y1] or Polygon. If None, all areas are selected.	`None`

Returns

Name	Type	Description
	Comparer	New Comparer with selected data.
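
A selection by time range and bounding box can be sketched on a pandas DataFrame as follows. This is an illustration of the idea with made-up data, not the method's implementation; in particular, whether the end time is inclusive in sel() is not stated here, so the inclusive pandas slicing below is an assumption.

```python
import pandas as pd

# Hypothetical track observations with positions
df = pd.DataFrame(
    {
        "x": [1.0, 2.0, 3.0, 4.0],
        "y": [51.0, 52.0, 53.0, 54.0],
        "Observation": [0.5, 0.7, 0.9, 1.1],
    },
    index=pd.to_datetime(
        ["2017-10-27 06:00", "2017-10-27 18:00", "2017-10-28 06:00", "2017-10-29 06:00"]
    ),
)

start, end = "2017-10-27", "2017-10-28"
x0, y0, x1, y1 = 1.5, 51.5, 3.5, 53.5  # bbox: [x0, y0, x1, y1]

# Time selection: label slicing on a sorted DatetimeIndex (both ends inclusive)
in_time = df.loc[start:end]

# Area selection: keep points inside the bounding box
inside = in_time["x"].between(x0, x1) & in_time["y"].between(y0, y1)
selected = in_time[inside]
print(selected)
```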

where

Comparer.where(cond)

Return a new Comparer with values where cond is True

Parameters

Name	Type	Description	Default
cond	(bool, np.ndarray, xr.DataArray)	This selects the values to return.	required

Returns

Name	Type	Description
	Comparer	New Comparer with values where cond is True.

Examples

>>> c2 = c.where(c.data.Observation > 0)

query

Comparer.query(query)

Return a new Comparer with values where the query condition is True

Parameters

Name	Type	Description	Default
query	str	Query string, see pandas.DataFrame.query	required

Returns

Name	Type	Description
	Comparer	New Comparer with values where the query condition is True.

Examples

>>> c2 = c.query("Observation > 0")
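
Since the query string follows pandas.DataFrame.query, its relation to a boolean condition (as used with where) can be sketched directly in pandas. The DataFrame and the `model1` column are made up for illustration.

```python
import pandas as pd

# Hypothetical matched data with one observation and one model column
df = pd.DataFrame({"Observation": [-0.1, 0.4, 1.2], "model1": [0.0, 0.5, 1.0]})

# A query string and the equivalent boolean mask select the same rows
by_query = df.query("Observation > 0")
by_mask = df[df["Observation"] > 0]

print(by_query.equals(by_mask))
```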

to_dataframe

Comparer.to_dataframe()

Convert matched data to pandas DataFrame

The x and y coordinates are included only if gtype=track.

Returns

Name	Type	Description
	pd.DataFrame	data as a pandas DataFrame

save

Comparer.save(filename)

Save to netcdf file

Parameters

Name	Type	Description	Default
filename	str or Path	filename	required

load

Comparer.load(filename)

Load from netcdf file

Parameters

Name	Type	Description	Default
filename	str or Path	filename	required

Returns

Name	Type	Description
	Comparer
