Is my model better than predicting the mean?

It is easy to be convinced that a model is good if it has a low error.

But it is always a good idea to compare your model to a baseline to see whether it is actually better than just predicting the mean.

This can be done easily in modelskill thanks to the DummyModelResult class.

import modelskill as ms

fn = '../data/Oresund2D.dfsu'
mr = ms.model_result(fn, item='Surface elevation')
mr
<DfsuModelResult>: Oresund2D
Time: 2018-03-04 00:00:00 - 2018-03-10 22:40:00
Quantity: Surface Elevation [m]
fn = '../data/smhi_2095_klagshamn.dfs0'
obs = ms.PointObservation(fn, x=366844.15, y=6154291.6, item=0)
obs
<PointObservation>: smhi_2095_klagshamn
Location: 366844.15, 6154291.6
Time: 2015-01-01 01:00:00 - 2020-09-28 00:00:00
Quantity: Water Level [m]
dmr = ms.DummyModelResult(data=0.0)
dmr
DummyModelResult(name='dummy', data=0.0, strategy='constant')
cmp = ms.match(obs=obs, mod=[mr, dmr]).remove_bias()
cmp.skill().round(3)
                                 n  bias  rmse  urmse   mae    cc    si    r2
model     observation
Oresund2D smhi_2095_klagshamn  167  -0.0 0.041  0.041 0.033  0.84 0.378 0.704
dummy     smhi_2095_klagshamn  167  -0.0 0.075  0.075 0.061 -0.00 0.695 0.000
cmp.skill().rmse.plot.barh(title="Better than predicting 0.0");

Above we created a DummyModelResult which always predicts 0.0.
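For intuition, here is a standalone numpy sketch (synthetic data, not a modelskill call) of why this baseline behaves the way it does: once the bias is removed, a constant prediction becomes a prediction of the observed mean, so its RMSE equals the standard deviation of the observations and its skill relative to the mean (the r2 column) is zero.

```python
import numpy as np

rng = np.random.default_rng(42)
obs = rng.normal(loc=0.1, scale=0.07, size=200)  # synthetic "water levels"

# Constant prediction 0.0; removing the bias (the mean error)
# turns it into a prediction of the observed mean.
pred = np.zeros_like(obs)
pred_unbiased = pred - (pred - obs).mean()

rmse = np.sqrt(np.mean((pred_unbiased - obs) ** 2))

# For a mean prediction, RMSE equals the (population) std of the
# observations, so 1 - MSE / var(obs) is exactly zero.
print(rmse, obs.std())
```

This is why the dummy's rmse and urmse coincide in the table above: after remove_bias() there is no bias left to separate them.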

But we can be even lazier and use DummyModelResult with the 'mean' strategy, which predicts the mean of the observed values.

dmr2 = ms.DummyModelResult(strategy='mean')
dmr2
DummyModelResult(name='dummy', data=None, strategy='mean')
cmp2 = ms.match(obs=obs, mod=[mr, dmr2]).remove_bias()
cmp2.skill().round(3)
                                 n  bias  rmse  urmse   mae    cc    si    r2
model     observation
Oresund2D smhi_2095_klagshamn  167  -0.0 0.041  0.041 0.033  0.84 0.378 0.704
dummy     smhi_2095_klagshamn  167  -0.0 0.075  0.075 0.061  0.00 0.695 0.000
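If r2 here is the Nash–Sutcliffe-style coefficient of determination (1 minus the model's MSE over the variance of the observations), it can be read directly as skill relative to the mean baseline. A quick back-of-envelope check with the rounded numbers from the table above (plain arithmetic, not a modelskill call):

```python
rmse_model = 0.041  # Oresund2D, from the table above
rmse_dummy = 0.075  # mean baseline, from the table above

# Skill relative to the mean baseline: 1 - MSE_model / MSE_baseline
skill = 1 - (rmse_model / rmse_dummy) ** 2
print(round(skill, 3))  # → 0.701
```

That is consistent (up to rounding) with the r2 of 0.704 reported for Oresund2D, while the dummy's r2 is 0 by construction: it cannot beat itself.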