# Model skill assessment

## Simple comparison

Sometimes all you need is a simple comparison of two time series. The `modelskill.match()` function does just that.

In [None]:
import mikeio
import modelskill as ms

### The model
The model result can be either a dfs0 file or a DataFrame.

In [None]:
fn_mod = 'data/SW/ts_storm_4.dfs0'
df_mod = mikeio.read(fn_mod, items=0).to_dataframe()

### The observation
The observation can be a dfs0 file, a DataFrame, or a PointObservation object.

In [None]:
fn_obs = 'data/SW/eur_Hm0.dfs0'

### Match observation to model
The `match()` function returns an object that can be used for scatter plots, skill assessment, time series plots, and more.

In [None]:
cmp = ms.match(fn_obs, df_mod)
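
Conceptually, matching pairs each observation with a model value at the same point in time. The sketch below illustrates that idea in plain Python using linear interpolation in time. It is an illustration under stated assumptions, not a description of how ModelSkill works internally: the helper `interp_to_obs_times` and the sample values are hypothetical, and times are simplified to plain numbers (for example, hours since the start of the simulation).

```python
from bisect import bisect_left

def interp_to_obs_times(mod_times, mod_values, obs_times):
    """Linearly interpolate model values to observation times.

    Hypothetical helper for illustration only. mod_times must be sorted
    ascending. Observation times outside the model period are dropped.
    """
    matched = []
    for t in obs_times:
        if t < mod_times[0] or t > mod_times[-1]:
            continue  # no model data to compare with at this time
        i = bisect_left(mod_times, t)
        if mod_times[i] == t:
            # Observation coincides with a model time step
            matched.append((t, mod_values[i]))
            continue
        # Interpolate between the two surrounding model time steps
        t0, t1 = mod_times[i - 1], mod_times[i]
        v0, v1 = mod_values[i - 1], mod_values[i]
        w = (t - t0) / (t1 - t0)
        matched.append((t, v0 + w * (v1 - v0)))
    return matched

# Made-up sample data (not taken from the dfs0 files above)
mod_times = [0.0, 1.0, 2.0, 3.0]
mod_values = [1.0, 2.0, 4.0, 3.0]
obs_times = [0.5, 1.0, 2.25, 5.0]

pairs = interp_to_obs_times(mod_times, mod_values, obs_times)
print(pairs)
```

The observation at time 5.0 lies outside the model period and is therefore not part of the matched pairs.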

In [None]:
cmp.plot.timeseries();

## Systematic vs random errors

![](images/systematic_random_error.png)

A model is a simplified version of a natural system, such as the ocean, and as such does not reflect every detail of the natural system.

To validate whether a model captures the essential dynamics of the natural system, it can be helpful to classify the mismatch between model and observations into two broad categories:
* systematic errors
* random errors

A quantitative assessment of a model involves calculating one or more model scores (skill metrics), which to varying degrees capture systematic errors, random errors, or a combination of the two.

## Metrics

**Bias** is an indication of systematic error. In the left figure above, the model has negative bias (modelled wave heights are lower than observed). A non-zero bias is thus an indication that the model can be improved.

**Root Mean Square Error** (rmse) combines systematic and random errors. It is a common metric for indicating the quality of a calibrated model, but because it mixes the two error types, it is less useful for judging the potential for further calibration.

**Unbiased Root Mean Square Error** (urmse) is the unbiased version of Root Mean Square Error. Since the bias is removed, it only captures the random error.
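
The relationship between these three metrics can be made concrete with a few lines of plain Python. The sketch below uses made-up data (a constant offset plus an alternating "noise" term, both assumptions chosen for illustration) and hand-written functions rather than the ModelSkill implementations. It shows that bias picks up the offset, urmse picks up the noise, and rmse² = bias² + urmse².

```python
import math

def bias(obs, mod):
    """Mean of the residuals (model minus observation)."""
    return sum(m - o for o, m in zip(obs, mod)) / len(obs)

def rmse(obs, mod):
    """Root mean square of the residuals."""
    return math.sqrt(sum((m - o) ** 2 for o, m in zip(obs, mod)) / len(obs))

def urmse(obs, mod):
    """Root mean square of the residuals after removing the bias."""
    b = bias(obs, mod)
    return math.sqrt(sum((m - o - b) ** 2 for o, m in zip(obs, mod)) / len(obs))

# Synthetic data: systematic offset of -0.5 plus alternating noise of +/-0.2
obs = [1.0, 2.0, 3.0, 4.0]
noise = [0.2, -0.2, 0.2, -0.2]
mod = [o - 0.5 + e for o, e in zip(obs, noise)]

b, r, u = bias(obs, mod), rmse(obs, mod), urmse(obs, mod)
print(f"bias={b:.3f}  rmse={r:.3f}  urmse={u:.3f}")
print(f"rmse^2={r**2:.3f}  bias^2 + urmse^2={b**2 + u**2:.3f}")
```

Here the bias recovers the offset of -0.5 and the urmse recovers the noise amplitude of 0.2, while the rmse alone does not reveal how much of the error is systematic.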

For a complete list of possible metrics, see the [Metrics section in the ModelSkill docs](https://dhi.github.io/modelskill/api/metrics/).



To get a quantitative model skill, we use the `.skill()` method, which returns a table (similar to a DataFrame).

In [None]:
cmp.skill()

By default, a number of common metrics are returned, but you are free to pick your own.

In [None]:
cmp.skill(metrics=["mae","rho","lin_slope"])
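
To give a feel for what the three short names stand for, here is a hand-written sketch in plain Python: `mae` as mean absolute error, `rho` as Spearman rank correlation, and `lin_slope` as the slope of a least-squares line with the observation on the x-axis. These readings of the names, the no-ties shortcut for `rho`, and the sample values are assumptions made for illustration. The ModelSkill metrics documentation is the authoritative reference.

```python
def mae(obs, mod):
    """Mean absolute error."""
    return sum(abs(m - o) for o, m in zip(obs, mod)) / len(obs)

def lin_slope(obs, mod):
    """Ordinary least-squares slope of model (y) against observation (x)."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_m = sum(mod) / n
    sxy = sum((o - mean_o) * (m - mean_m) for o, m in zip(obs, mod))
    sxx = sum((o - mean_o) ** 2 for o in obs)
    return sxy / sxx

def ranks(values):
    """Ranks from 1..n. Assumes no tied values, to keep the sketch short."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def rho(obs, mod):
    """Spearman rank correlation using the no-ties formula."""
    n = len(obs)
    d2 = sum((ro - rm) ** 2 for ro, rm in zip(ranks(obs), ranks(mod)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Made-up sample data
obs = [1.0, 2.0, 3.0, 4.0]
mod = [1.0, 2.5, 4.5, 4.0]

print(f"mae={mae(obs, mod):.2f}  rho={rho(obs, mod):.2f}  lin_slope={lin_slope(obs, mod):.2f}")
```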

A very common way to visualize model skill is to use a scatter plot.

The scatter plot includes some additional features such as a 2d histogram, a Q-Q line and a regression line, but the appearance is highly configurable.

In [None]:
cmp.plot.scatter();
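
The Q-Q line compares the distributions of the two series rather than individual pairs: quantiles of the observation are plotted against the same quantiles of the model, regardless of when the values occurred. The sketch below shows the idea with the standard library and made-up values. The choice of quartiles and of the `inclusive` method are assumptions for illustration, and the quantiles used by ModelSkill may differ.

```python
from statistics import quantiles

# Made-up sample data: the model is consistently a bit lower than observed
obs = [0.8, 1.2, 1.9, 2.4, 3.1, 3.5, 4.2, 5.0]
mod = [0.6, 1.0, 1.5, 2.1, 2.6, 3.0, 3.6, 4.3]

# n=4 returns the three quartile cut points (25 %, 50 %, 75 %)
q_obs = quantiles(obs, n=4, method="inclusive")
q_mod = quantiles(mod, n=4, method="inclusive")

# Each Q-Q point pairs an observed quantile (x) with a modelled quantile (y)
qq_points = list(zip(q_obs, q_mod))
for qo, qm in qq_points:
    print(f"obs quantile={qo:.2f}  model quantile={qm:.2f}")
```

Points below the 1:1 line, as in this example, indicate that the model underestimates across the distribution.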

In [None]:
cmp.plot.scatter(
    binsize=0.5,
    show_points=False,
    xlim=[0, 6],
    ylim=[0, 6],
    title="A calibrated model!",
);

## Taylor diagram

A Taylor diagram combines several statistics in a single plot and is very useful for comparing the skill of several models or observations.

In [None]:
cmp.plot.taylor();
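
The statistics in a Taylor diagram are the standard deviations of observation and model, their correlation coefficient R, and the centered root mean square difference E' (the same quantity as urmse above). They fit in one plot because they are linked by the law of cosines: E'² = σ_obs² + σ_mod² − 2 σ_obs σ_mod R. The sketch below checks this relation numerically with made-up values. The helper `taylor_stats` is hypothetical and not part of ModelSkill.

```python
import math

def taylor_stats(obs, mod):
    """Return (std_obs, std_mod, correlation, centered RMS difference).

    Hypothetical helper for illustration; uses population statistics (1/n).
    """
    n = len(obs)
    mean_o, mean_m = sum(obs) / n, sum(mod) / n
    do = [o - mean_o for o in obs]  # observation anomalies
    dm = [m - mean_m for m in mod]  # model anomalies
    std_o = math.sqrt(sum(d * d for d in do) / n)
    std_m = math.sqrt(sum(d * d for d in dm) / n)
    corr = sum(a * b for a, b in zip(do, dm)) / (n * std_o * std_m)
    crmsd = math.sqrt(sum((b - a) ** 2 for a, b in zip(do, dm)) / n)
    return std_o, std_m, corr, crmsd

# Made-up sample data
obs = [1.0, 2.0, 3.0, 4.0]
mod = [1.0, 2.5, 4.5, 4.0]

std_o, std_m, corr, crmsd = taylor_stats(obs, mod)
lhs = crmsd ** 2
rhs = std_o ** 2 + std_m ** 2 - 2 * std_o * std_m * corr
print(f"E'^2={lhs:.3f}  law-of-cosines value={rhs:.3f}")
```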

## Elaborate comparison

In [None]:
fn = 'data/SW/HKZN_local_2017_DutchCoast.dfsu'
mr = ms.model_result(fn, name='HKZN_local', item=0)
mr

In [None]:
o1 = ms.PointObservation('data/SW/HKNA_Hm0.dfs0', item=0, x=4.2420, y=52.6887, name="HKNA")
o2 = ms.PointObservation("data/SW/eur_Hm0.dfs0", item=0, x=3.2760, y=51.9990, name="EPL")

In [None]:
o1.plot.hist();

In [None]:
o1.plot(); 

### Overview

In [None]:
ms.plotting.spatial_overview(obs=[o1,o2], mod=mr);

In [None]:
cc = ms.match(obs=[o1,o2], mod=mr)
cc

In [None]:
cc.skill().style(precision=2)

In [None]:
cc["EPL"].skill(metrics="mean_absolute_error")

In [None]:
cc["HKNA"].plot.timeseries(figsize=(10,5));

In [None]:
cc["EPL"].plot.scatter(figsize=(8,8), show_hist=True);

In [None]:
cc["EPL"].plot.hist(bins=20);

In [None]:
cc["HKNA"].plot.scatter();