ComparerCollection
ComparerCollection(self, comparers)
Collection of comparers.
The ComparerCollection is one of the main objects of the modelskill package. It is a collection of Comparer objects and is created either by the match() function, by passing a list of Comparers to the ComparerCollection constructor, or by reading a config file using the from_config() function.
NOTE: In case of multiple model results with different time coverage, only the overlapping time period (the intersection) will be used!
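A minimal sketch of this intersection rule using plain pandas (not modelskill; the date ranges are invented for illustration):

>>> import pandas as pd
>>> t1 = pd.date_range("2017-10-27", "2017-10-29", freq="h")  # coverage of model A
>>> t2 = pd.date_range("2017-10-28", "2017-10-30", freq="h")  # coverage of model B
>>> overlap = t1.intersection(t2)  # only this period enters the comparison
>>> overlap.min(), overlap.max()
(Timestamp('2017-10-28 00:00:00'), Timestamp('2017-10-29 00:00:00'))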
Main functionality:
- selecting/filtering data
  - __getitem__() - get a single Comparer, e.g. cc[0] or cc['obs1']
  - sel()
  - query()
- skill assessment
  - skill()
  - mean_skill()
  - gridded_skill() (for track observations)
- plotting
- load/save/export data
Parameters
Name | Type | Description | Default |
---|---|---|---|
comparers | list of Comparer | list of comparers | required |
Examples
>>> import modelskill as ms
>>> mr = ms.DfsuModelResult("Oresund2D.dfsu", item=0)
>>> o1 = ms.PointObservation("klagshamn.dfs0", item=0, x=366844, y=6154291, name="Klagshamn")
>>> o2 = ms.PointObservation("drogden.dfs0", item=0, x=355568.0, y=6156863.0)
>>> cmp1 = ms.match(o1, mr) # Comparer
>>> cmp2 = ms.match(o2, mr) # Comparer
>>> ccA = ms.ComparerCollection([cmp1, cmp2])
>>> ccB = ms.match(obs=[o1, o2], mod=mr)
>>> sk = ccB.skill()
>>> ccB["Klagshamn"].plot.timeseries()
Methods
Name | Description |
---|---|
skill | Aggregated skill assessment of model(s) |
mean_skill | Weighted mean of skills |
gridded_skill | Skill assessment of model(s) on a regular spatial grid. |
score | Weighted mean score of model(s) over all observations |
rename | Rename observation, model or auxiliary data variables |
sel | Select data based on model, time and/or area. |
save | Save the ComparerCollection to a zip file. |
load | Load a ComparerCollection from a zip file. |
skill
ComparerCollection.skill(by=None, metrics=None, observed=False)
Aggregated skill assessment of model(s)
Parameters
Name | Type | Description | Default |
---|---|---|---|
by | str or List[str] | group by, by default ["model", "observation"] - by column name - by temporal bin of the DateTimeIndex via the freq-argument (using pandas pd.Grouper(freq)), e.g.: 'freq:M' = monthly, 'freq:D' = daily - by the dt accessor of the DateTimeIndex (e.g. 'dt.month') using the syntax 'dt:month'. The dt-argument differs from the freq-argument in that it gives month-of-year rather than month-of-data. - by attributes, stored in the cc.data.attrs container, e.g.: 'attrs:obs_provider' = group by observation provider or 'attrs:gtype' = group by geometry type (track or point) | None |
metrics | list | list of modelskill.metrics (or str), by default modelskill.options.metrics.list | None |
observed | bool | This only applies if any of the groupers are Categoricals. - True: only show observed values for categorical groupers. - False: show all values for categorical groupers. | False |
Returns
Name | Type | Description |
---|---|---|
SkillTable | skill assessment as a SkillTable object |
See also
sel : a method for filtering/selecting data
Examples
>>> import modelskill as ms
>>> cc = ms.match([HKNA,EPL,c2], mr)
>>> cc.skill().round(2)
               n  bias  rmse  urmse   mae    cc    si    r2
observation
HKNA         385 -0.20  0.35   0.29  0.25  0.97  0.09  0.99
EPL           66 -0.08  0.22   0.20  0.18  0.97  0.07  0.99
c2           113 -0.00  0.35   0.35  0.29  0.97  0.12  0.99
>>> cc.sel(observation='c2', start='2017-10-28').skill().round(2)
              n  bias  rmse  urmse   mae    cc    si    r2
observation
c2           41  0.33  0.41   0.25  0.36  0.96  0.06  0.99
>>> cc.skill(by='freq:D').round(2)
              n  bias  rmse  urmse   mae    cc    si    r2
2017-10-27  239 -0.15  0.25   0.21  0.20  0.72  0.10  0.98
2017-10-28  162 -0.07  0.19   0.18  0.16  0.96  0.06  1.00
2017-10-29  163 -0.21  0.52   0.47  0.42  0.79  0.11  0.99
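The dt- and attrs-style groupers are not covered by the examples above; a hypothetical sketch (it assumes an obs_provider entry has been stored in each comparer's attrs container):

>>> sk = cc.skill(by="dt:month")            # month-of-year groups
>>> sk = cc.skill(by="attrs:obs_provider")  # hypothetical attrs entry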
mean_skill
ComparerCollection.mean_skill(weights=None, metrics=None, **kwargs)
Weighted mean of skills
First, the skill is calculated per observation; the weighted mean of these skills is then found.
Warning: This method is NOT the mean skill of all observational points! (see mean_skill_points)
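A minimal numeric sketch of the distinction, in plain Python (the point counts and biases are invented): with two observations of 10 and 1000 points and per-observation biases of 1.0 and 0.0, mean_skill averages the two skills, whereas pooling all points (mean_skill_points) is dominated by the larger observation:

>>> n, bias = [10, 1000], [1.0, 0.0]
>>> sum(bias) / len(bias)                         # each observation counts once
0.5
>>> sum(b * k for b, k in zip(bias, n)) / sum(n)  # each point counts once
0.009900990099009901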
Parameters
Name | Type | Description | Default |
---|---|---|---|
weights | str or List(float) or Dict(str, float) | weighting of observations, by default None - None: use the observations' weight attribute (if assigned, else "equal") - "equal": giving all observations equal weight - "points": giving all points equal weight - list of weights, e.g. [0.3, 0.3, 0.4], one per observation - dictionary of observations with special weights, others will be set to 1.0 | None |
metrics | list | list of modelskill.metrics, by default modelskill.options.metrics.list | None |
Returns
Name | Type | Description |
---|---|---|
SkillTable | mean skill assessment as a SkillTable object |
See also
skill : skill assessment per observation
mean_skill_points : skill assessment pooling all observation points together
Examples
>>> import modelskill as ms
>>> cc = ms.match([HKNA,EPL,c2], mod=HKZN_local)
>>> cc.mean_skill().round(2)
              n  bias  rmse  urmse   mae    cc    si    r2
HKZN_local  564 -0.09  0.31   0.28  0.24  0.97  0.09  0.99
>>> sk = cc.mean_skill(weights="equal")
>>> sk = cc.mean_skill(weights="points")
>>> sk = cc.mean_skill(weights={"EPL": 2.0}) # more weight on EPL, others=1.0
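The list form of weights is not shown above; a hypothetical call (one weight per observation, in observation order):

>>> sk = cc.mean_skill(weights=[0.3, 0.3, 0.4])  # hypothetical: three observations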
gridded_skill
ComparerCollection.gridded_skill(
    bins=5,
    binsize=None,
    by=None,
    metrics=None,
    n_min=None,
    **kwargs,
)
Skill assessment of model(s) on a regular spatial grid.
Parameters
Name | Type | Description | Default |
---|---|---|---|
bins | int | criteria to bin x and y by, passed as the bins argument to pd.cut(), default 5; define different bins for x and y with a tuple, e.g. bins = 5 or bins = (5, [2, 3, 5]) | 5 |
binsize | float | bin size for x and y dimension, overwrites bins; creates bins with reference to round(mean(x)), round(mean(y)) | None |
by | str or List[str] | group by, by default ["model", "observation"] - by column name - by temporal bin of the DateTimeIndex via the freq-argument (using pandas pd.Grouper(freq)), e.g.: 'freq:M' = monthly, 'freq:D' = daily - by the dt accessor of the DateTimeIndex (e.g. 'dt.month') using the syntax 'dt:month'. The dt-argument differs from the freq-argument in that it gives month-of-year rather than month-of-data. | None |
metrics | list | list of modelskill.metrics, by default modelskill.options.metrics.list | None |
n_min | int | minimum number of observations in a grid cell; cells with fewer observations get a score of np.nan | None |
Returns
Name | Type | Description |
---|---|---|
SkillGrid | skill assessment as a SkillGrid object |
See also
skill : a method for aggregated skill assessment
Examples
>>> import modelskill as ms
>>> cc = ms.match([HKNA,EPL,c2], mr) # with satellite track measurements
>>> gs = cc.gridded_skill(metrics='bias')
>>> gs.data
<xarray.Dataset>
Dimensions:      (x: 5, y: 5)
Coordinates:
    observation  'alti'
  * x            (x) float64 -0.436 1.543 3.517 5.492 7.466
  * y            (y) float64 50.6 51.66 52.7 53.75 54.8
Data variables:
    n            (x, y) int32 3 0 0 14 37 17 50 36 72 ... 0 0 15 20 0 0 0 28 76
    bias         (x, y) float64 -0.02626 nan nan ... nan 0.06785 -0.1143
>>> gs = cc.gridded_skill(binsize=0.5)
>>> gs.data.coords
Coordinates:
    observation  'alti'
  * x            (x) float64 -1.5 -0.5 0.5 1.5 2.5 3.5 4.5 5.5 6.5 7.5
  * y            (y) float64 51.5 52.5 53.5 54.5 55.5 56.5
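The tuple form of bins is not shown above; a hypothetical call (the y-bin edges are invented):

>>> gs = cc.gridded_skill(bins=(5, [50, 52, 54, 56]))  # 5 x-bins, explicit y-edges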
score
ComparerCollection.score(metric=mtr.rmse, weights=None, **kwargs)
Weighted mean score of model(s) over all observations
Wrapping mean_skill() with a single metric.
NOTE: This will take a simple mean over different quantities!
Parameters
Name | Type | Description | Default |
---|---|---|---|
weights | str or List(float) or Dict(str, float) | weighting of observations, by default None - None: use the observations' weight attribute (if assigned, else "equal") - "equal": giving all observations equal weight - "points": giving all points equal weight - list of weights, e.g. [0.3, 0.3, 0.4], one per observation - dictionary of observations with special weights, others will be set to 1.0 | None |
metric | str or callable | a single metric from modelskill.metrics, by default rmse | mtr.rmse |
Returns
Name | Type | Description |
---|---|---|
Dict[str, float] | mean of skills score as a single number (for each model) |
See also
skill : skill assessment per observation
mean_skill : weighted mean of skills assessment
mean_skill_points : skill assessment pooling all observation points together
Examples
>>> import modelskill as ms
>>> cc = ms.match([o1, o2], mod)
>>> cc.score()
{'mod': 0.30681206}
>>> cc.score(weights=[0.1,0.1,0.8])
{'mod': 0.3383011631797379}
>>> cc.score(weights='points', metric="mape")
{'mod': 8.414442957854142}
rename
ComparerCollection.rename(mapping)
Rename observation, model or auxiliary data variables
Parameters
Name | Type | Description | Default |
---|---|---|---|
mapping | dict | mapping of old names to new names | required |
Returns
Name | Type | Description |
---|---|---|
ComparerCollection | New ComparerCollection with renamed variables. |
Examples
>>> cc = ms.match([o1, o2], [mr1, mr2])
>>> cc.mod_names
['mr1', 'mr2']
>>> cc2 = cc.rename({'mr1': 'model1'})
>>> cc2.mod_names
['model1', 'mr2']
sel
ComparerCollection.sel(
    model=None,
    observation=None,
    quantity=None,
    start=None,
    end=None,
    time=None,
    area=None,
    **kwargs,
)
Select data based on model, time and/or area.
Parameters
Name | Type | Description | Default |
---|---|---|---|
model | str or int or list of str or list of int | Model name or index. If None, all models are selected. | None |
observation | str or int or list of str or list of int | Observation name or index. If None, all observations are selected. | None |
quantity | str or int or list of str or list of int | Quantity name or index. If None, all quantities are selected. | None |
start | str or datetime | Start time. If None, all times are selected. | None |
end | str or datetime | End time. If None, all times are selected. | None |
time | str or datetime | Time. If None, all times are selected. | None |
area | list of float | bbox: [x0, y0, x1, y1] or Polygon. If None, all areas are selected. | None |
**kwargs | Any | Filtering by comparer attrs similar to xarray.Dataset.filter_by_attrs, e.g. sel(gtype='track') or sel(obs_provider='CMEMS') if at least one comparer has an entry obs_provider with value CMEMS in its attrs container. Multiple kwargs are combined with logical AND. | {} |
Returns
Name | Type | Description |
---|---|---|
ComparerCollection | New ComparerCollection with selected data. |
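No examples are given above for sel; a hypothetical sketch (all model/observation names, coordinates and attrs entries are assumptions):

>>> cc2 = cc.sel(model="mr1", observation=["o1", "o2"])
>>> cc2 = cc.sel(start="2017-10-28", end="2017-10-29")
>>> cc2 = cc.sel(area=[340000, 6140000, 360000, 6170000])  # bbox [x0, y0, x1, y1]
>>> cc2 = cc.sel(gtype="track")  # attrs filtering via **kwargs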
save
ComparerCollection.save(filename)
Save the ComparerCollection to a zip file.
Each comparer is stored as a netcdf file in the zip file.
Parameters
Name | Type | Description | Default |
---|---|---|---|
filename | str or Path | Filename of the zip file. | required |
Examples
>>> cc = ms.match(obs, mod)
>>> cc.save("my_comparer_collection.msk")
load
ComparerCollection.load(filename)
Load a ComparerCollection from a zip file.
Parameters
Name | Type | Description | Default |
---|---|---|---|
filename | str or Path | Filename of the zip file. | required |
Returns
Name | Type | Description |
---|---|---|
ComparerCollection | The loaded ComparerCollection. |
Examples
>>> cc = ms.match(obs, mod)
>>> cc.save("my_comparer_collection.msk")
>>> cc2 = ms.ComparerCollection.load("my_comparer_collection.msk")