ModelSkill assignment#

You are working on a project modelling waves in the Southern North Sea. You have done 6 different calibration runs and want to choose the “best” one. You would also like to see how your best model performs compared to a third-party model provided as NetCDF.

The data:

  • SW model results: 6 dfs0 files ts_runX.dfs0, each with 4 items corresponding to the 4 stations

  • observations: 4 dfs0 files with station data for (name, longitude, latitude):

    • F16: 4.0122, 54.1167

    • HKZA: 4.0090, 52.3066

    • K14: 3.6333, 53.2667

    • L9: 4.9667, 53.6167

  • A map observations_map.png showing the model domain and observation positions

  • Third party model: 1 NetCDF file

The tasks:

  1. Calibration - find the best run

  2. Validation - compare model to third-party model

fldr = "../data/FMskill_assignment/"   # where have you put your data?
import fmskill
from fmskill import PointObservation, ModelResult, Connector

1. Calibration#

  • 1.1 Start simple: compare F16 with SW1 (the first calibration run)

  • 1.2 Define all observations and all model results

  • 1.3 Create connector, plot temporal coverage

  • 1.4 Evaluate results

  • 1.5 Which model is best?

1.1 Simple compare#

Use fmskill.compare to do a quick comparison of F16 and SW1.

What is the mean absolute error (MAE) in cm? Make a time series plot.
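
Here is a minimal sketch of how the quick comparison could look. The observation file name, the item indices, and the skill/plot method names are assumptions about the data layout and the fmskill version in use; adjust them to your setup.

o_f16 = PointObservation(fldr + "F16.dfs0", item=0, x=4.0122, y=54.1167, name="F16")   # assumed file name and item
mr_sw1 = ModelResult(fldr + "ts_run1.dfs0", item=0, name="SW1")   # check which item corresponds to F16

cmp = fmskill.compare(o_f16, mr_sw1)   # quick comparison of observation vs model
cmp.skill()              # skill table; MAE is likely in metres, multiply by 100 to get cm
cmp.plot_timeseries()    # time series plot of observation and model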

1.2 Define all observations and all model results#

  • Define 4 PointObservations o1, o2, o3, o4

  • Define 6 ModelResults mr1, mr2, … (name them “SW1”, “SW2”, …)

  • How many items do the ModelResults have?
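
A possible way to define the observations and model results is sketched below; the station coordinates come from the list above, while the observation file names and item numbers are assumptions.

# Observations: one PointObservation per station (positions from the data description)
o1 = PointObservation(fldr + "F16.dfs0",  item=0, x=4.0122, y=54.1167, name="F16")
o2 = PointObservation(fldr + "HKZA.dfs0", item=0, x=4.0090, y=52.3066, name="HKZA")
o3 = PointObservation(fldr + "K14.dfs0",  item=0, x=3.6333, y=53.2667, name="K14")
o4 = PointObservation(fldr + "L9.dfs0",   item=0, x=4.9667, y=53.6167, name="L9")

# Model results: one per calibration run; each dfs0 holds 4 items (one per station),
# so item selection can be left until the observations are connected
mr1 = ModelResult(fldr + "ts_run1.dfs0", name="SW1")
mr2 = ModelResult(fldr + "ts_run2.dfs0", name="SW2")
mr3 = ModelResult(fldr + "ts_run3.dfs0", name="SW3")
mr4 = ModelResult(fldr + "ts_run4.dfs0", name="SW4")
mr5 = ModelResult(fldr + "ts_run5.dfs0", name="SW5")
mr6 = ModelResult(fldr + "ts_run6.dfs0", name="SW6")

print(mr1)   # inspect to see how many items the model result has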

1.3 Create connector, plot temporal coverage#

  • Create empty Connector con

  • Then add the connections one observation at a time (start by matching o1 with the 6 models, then o2, …)

  • Print con to screen - which observation has the most observation points?

  • Plot the temporal coverage of observations and models

  • Save the Connector to an Excel configuration file
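
One way the Connector workflow could be written is sketched below. The methods add, plot_temporal_coverage, and to_config are used as I understand the fmskill Connector API; treat the exact signatures and the output file name as assumptions.

con = Connector()                       # start from an empty connector
for obs in [o1, o2, o3, o4]:            # one observation at a time...
    con.add(obs, [mr1, mr2, mr3, mr4, mr5, mr6])   # ...matched with all 6 runs

print(con)                              # overview, incl. number of points per observation
con.plot_temporal_coverage()            # temporal coverage of observations and models
con.to_config("SW_calibration.xlsx")    # save configuration to Excel (hypothetical file name)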

1.4 Evaluate results#

Do relevant qualitative and quantitative analysis (e.g. time series plots, scatter plots, skill tables etc) to compare the models.
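
A sketch of the kind of analysis that could follow, assuming the fmskill ComparerCollection API with extract, skill, mean_skill, and scatter:

cc = con.extract()          # match observations and models -> ComparerCollection

cc.skill()                  # skill table per observation and model (bias, RMSE, SI, ...)
cc.mean_skill()             # skill aggregated over observations, one row per model
cc.scatter(model="SW1")     # scatter plot for a single calibration run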

1.5 Find the best#

Which calibration run is best?

  • Which model performs best in terms of bias?

  • Which model has the smallest scatter index?

  • Which model has linear slope closest to 1.0 for the station HKZA?

  • Consider the last day only (Nov 19) - which model has the smallest bias for that day?

  • Weighted: Give observation F16 10 times more weight than the other observations - which model has the smallest MAE?

  • Extremes: Which model has the lowest RMSE for Hs>4.0 m (hint: df = cc.all_df[cc.all_df.obs_val>4])?
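
The questions above can be answered with different views of the skill tables. The snippet below sketches a few of them; the parameter names (metrics, start, weights), the metric name lin_slope, and the all_df column names are assumptions about the fmskill version in use, and the concrete date is hypothetical since the year is not stated above.

# Overall skill per model, aggregated over all observations
cc.mean_skill(metrics=["bias", "si", "mae", "rmse"])

# Linear slope closest to 1.0 at station HKZA (metric name may differ between versions)
cc.skill(observation="HKZA", metrics=["lin_slope"])

# Last day only (hypothetical year - use the year of your data)
cc.mean_skill(start="2017-11-19", metrics=["bias"])

# Weighted: F16 counts 10 times as much as the other stations
# (weights assumed to follow the order of the observations F16, HKZA, K14, L9)
cc.mean_skill(weights=[10, 1, 1, 1], metrics=["mae"])

# Extremes: RMSE for Hs > 4.0 m, computed directly from the matched data
df = cc.all_df[cc.all_df.obs_val > 4]
df.groupby("mod_name").apply(lambda g: ((g.mod_val - g.obs_val) ** 2).mean() ** 0.5)   # column names may differ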

2. Validation#

We will now compare our best model against the UK Met Office’s North West Shelf model stored in NWS_HM0.nc.

  • 2.1 Create a ModelResult mr_NWS, evaluate mr_NWS.ds

  • 2.2 Plot the first time step of ds (hint: use .isel(time=0); the item is called “VHM0”)

  • 2.3 Create a Connector con_NWS with the 4 observations and mr_NWS

  • 2.4 Evaluate NWS - what is the mean RMSE?

  • 2.5 Compare NWS to SW5 - which model is better? And is it so for all stations and all metrics? (hint: you can merge ComparisonCollections using the + operator)
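
Below is a sketch of how the validation could be put together. The item name “VHM0” comes from the hint above; the ModelResult arguments for NetCDF, the xarray plotting call, and the + merge of comparison collections are assumptions about the fmskill API and may need adjusting.

# 2.1 Third-party model: UK Met Office North West Shelf wave model (NetCDF)
mr_NWS = ModelResult(fldr + "NWS_HM0.nc", item="VHM0", name="NWS")
mr_NWS.ds                              # inspect the underlying xarray Dataset

# 2.2 Plot the first time step of the wave height field
mr_NWS.ds["VHM0"].isel(time=0).plot()

# 2.3-2.4 Connect the 4 observations to NWS and evaluate
con_NWS = Connector()
for obs in [o1, o2, o3, o4]:
    con_NWS.add(obs, mr_NWS)
cc_NWS = con_NWS.extract()
cc_NWS.mean_skill(metrics=["rmse"])

# 2.5 Compare NWS with the best calibration run (here assumed to be SW5)
con_SW5 = Connector()
for obs in [o1, o2, o3, o4]:
    con_SW5.add(obs, mr5)
cc_both = cc_NWS + con_SW5.extract()   # merge ComparisonCollections with the + operator
cc_both.skill()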