Getting started#
tsod is a library for detecting outliers in timeseries data. A timeseries is always represented as a pandas Series, in some cases with a DatetimeIndex.
1. Get data in the form of a Series (see Data formats below)
2. Select one or more detectors, e.g. RangeDetector or ConstantValueDetector
3. Define parameters (e.g. min/max, max rate of change) or…
4. Fit parameters based on normal data, i.e. without outliers
5. Detect outliers in any dataset
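The fit-then-detect steps above can be illustrated with plain pandas. This is only a minimal sketch of the idea behind a range check, not tsod's implementation; the helper names `fit_range` and `detect_out_of_range` and the sample values are hypothetical.

```python
import pandas as pd


def fit_range(normal_data: pd.Series) -> tuple:
    """Learn (min, max) limits from data assumed to contain no outliers."""
    return float(normal_data.min()), float(normal_data.max())


def detect_out_of_range(data: pd.Series, min_value: float, max_value: float) -> pd.Series:
    """Return a boolean Series that is True where a value falls outside the limits."""
    return (data < min_value) | (data > max_value)


normal_data = pd.Series([0.0, 1.0, 2.0])       # hypothetical outlier-free data
min_value, max_value = fit_range(normal_data)  # "fit" step

new_data = pd.Series([0.5, 3.0, -1.0])         # hypothetical data to check
anomalies = detect_out_of_range(new_data, min_value, max_value)  # "detect" step
```

In practice the detectors in tsod take care of both steps; the sketch only shows why the fitting data should be free of outliers, since the learned limits come directly from it.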
Example#
>>> import pandas as pd
>>> from tsod import RangeDetector
>>> rd = RangeDetector(max_value=2.0)
>>> data = pd.Series([0.0, 1.0, 3.0])  # 3.0 is out of range, i.e. an anomaly
>>> anom = rd.detect(data)
>>> anom
0    False
1    False
2     True
dtype: bool
>>> data[anom]  # get anomalous data
2    3.0
dtype: float64
>>> data[~anom]  # get normal data
0    0.0
1    1.0
dtype: float64
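Because the detection result is a boolean Series with the same index as the input, it can be combined with ordinary pandas operations. A minimal sketch, with the mask written out by hand as a stand-in for the `anom` result above so that the snippet runs without tsod:

```python
import pandas as pd

data = pd.Series([0.0, 1.0, 3.0])
# Stand-in for the result of rd.detect(data) in the example above
anom = pd.Series([False, False, True])

# Keep the original index but replace anomalous values with NaN
cleaned = data.where(~anom)

# Count the flagged points
n_anomalies = int(anom.sum())
```

Replacing anomalies with NaN instead of dropping them keeps the series length and index intact, which can be convenient when the data is later aligned with other series.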
Saving and loading#
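import tsod
from tsod import CombinedDetector, ConstantValueDetector, RangeDetector

# save a configured detector
# (normal_data is a Series without outliers)
cd = CombinedDetector([ConstantValueDetector(), RangeDetector()])
cd.fit(normal_data)
cd.save("detector.joblib")

# ... and then later load it from disk
my_detector = tsod.load("detector.joblib")
my_detector.detect(some_data)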
Data formats#
Converting data to a Series
# from a CSV file
import pandas as pd
df = pd.read_csv("mydata.csv", parse_dates=True, index_col=0)
my_series = df['water_level']

# from a dfs0 file (requires mikeio)
from mikeio import Dfs0
dfs = Dfs0('simple.dfs0')
df = dfs.to_dataframe()
my_series_2 = df['rainfall']
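To check that a converted Series has a DatetimeIndex, the CSV route can be tried on a small in-memory file. This is a sketch under assumptions: the CSV content and column names below are made up and stand in for "mydata.csv".

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for "mydata.csv"
csv_text = """time,water_level
2020-01-01 00:00,1.2
2020-01-01 01:00,1.3
2020-01-01 02:00,1.1
"""

# index_col=0 uses the first column as the index;
# parse_dates=True asks pandas to parse that index as dates
df = pd.read_csv(io.StringIO(csv_text), parse_dates=True, index_col=0)
my_series = df["water_level"]

# True when the index was parsed into timestamps
has_datetime_index = isinstance(my_series.index, pd.DatetimeIndex)
```

If `has_datetime_index` turns out to be False for a real file, the first column probably was not recognised as dates and may need explicit conversion, for example with `pd.to_datetime`.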