API Reference

Generic

class tsod.RangeDetector(min_value=-inf, max_value=inf, quantiles=None)

Detect values outside range.

Parameters
  • min_value (float) – Minimum value threshold.

  • max_value (float) – Maximum value threshold.

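  • quantiles (list[2]) – Lower and upper quantiles used to infer the thresholds when fitting. Defaults to [0, 1], which is the same as the minimum and maximum of the fitted data.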

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from tsod import RangeDetector
>>> normal_data = pd.Series(np.random.normal(size=100))
>>> abnormal_data = pd.Series(np.random.normal(size=100))
>>> abnormal_data[[2, 6, 15, 57, 60, 73]] = 5
>>> normal_data_with_some_outliers = pd.Series(np.random.normal(size=100))
>>> normal_data_with_some_outliers[[12, 13, 20, 90]] = 7
>>> detector = RangeDetector(min_value=0.0, max_value=2.0)  # explicit thresholds
>>> anomalies = detector.detect(abnormal_data)
>>> detector = RangeDetector()
>>> detector.fit(normal_data)  # min, max inferred from normal data
>>> anomalies = detector.detect(abnormal_data)
>>> detector = RangeDetector(quantiles=[0.001, 0.999])  # thresholds from quantiles
>>> detector.fit(normal_data_with_some_outliers)
>>> anomalies = detector.detect(abnormal_data)
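
To illustrate what fitting with quantiles amounts to, here is a minimal pandas sketch. It is not tsod's implementation: the helper names `fit_range` and `outside_range` are hypothetical, and treating the thresholds as exclusive bounds is an assumption.

```python
import pandas as pd


def fit_range(normal_data: pd.Series, quantiles=(0.0, 1.0)):
    """Infer (min_value, max_value) from data assumed to be mostly normal."""
    lower, upper = normal_data.quantile(list(quantiles))
    return float(lower), float(upper)


def outside_range(data: pd.Series, min_value: float, max_value: float) -> pd.Series:
    """True where a value lies outside [min_value, max_value]."""
    return (data < min_value) | (data > max_value)


training = pd.Series(range(101), dtype=float)  # 0, 1, ..., 100
# Quantiles [0.05, 0.95] give thresholds of roughly 5 and 95.
min_value, max_value = fit_range(training, quantiles=(0.05, 0.95))

new_data = pd.Series([-3.0, 50.0, 120.0, 4.0, 96.0])
flags = outside_range(new_data, min_value, max_value)
print(flags.tolist())
```

With the default quantiles (0, 1) the same helper returns the plain minimum and maximum of the training data.
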
detect(data: pandas.core.series.Series) → pandas.core.series.Series

Detect anomalies

Parameters

data (pd.Series) – Time series data with possible anomalies

Returns

Time series with bools, True == anomaly

Return type

pd.Series

fit(data: pandas.core.series.Series)

Set detector parameters based on data.

Parameters

data (pd.Series) – Normal time series data.

save(path: Union[str, pathlib.Path]) → None

Save a detector for later use.

Parameters

path (str or Path) – Path of the file to save the detector to.

validate(data: Union[pandas.core.series.Series, pandas.core.frame.DataFrame]) → Union[pandas.core.series.Series, pandas.core.frame.DataFrame]

Check that the input data is in the correct format and adjust it if necessary.

class tsod.ConstantValueDetector(window_size: int = 3, threshold: float = 1e-07)

Detect constant values over a longer period.

Commonly caused by sensor failures, which get stuck at a constant level.
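
The idea of flagging a stuck sensor can be sketched with plain pandas. This is an illustration under stated assumptions, not the library's implementation: the helper `flag_constant_runs` is hypothetical, and it assumes that `window_size` is the minimum run length and `threshold` is the largest spread (max minus min) still counted as constant.

```python
import pandas as pd


def flag_constant_runs(
    data: pd.Series, window_size: int = 3, threshold: float = 1e-7
) -> pd.Series:
    """Flag every point in a run of at least `window_size` near-constant values."""
    # Spread (max - min) over each trailing window; NaN until the window is full.
    spread = data.rolling(window_size).max() - data.rolling(window_size).min()
    flat_window_end = spread < threshold
    # A flat window ending at position i covers i - window_size + 1 .. i,
    # so propagate the flag backwards over the whole window.
    flagged = flat_window_end.copy()
    for shift in range(1, window_size):
        flagged |= flat_window_end.shift(-shift, fill_value=False)
    return flagged


readings = pd.Series([1.0, 2.0, 5.0, 5.0, 5.0, 5.0, 3.0, 4.0])
flags = flag_constant_runs(readings)
print(flags.tolist())
```

In this sketch the four consecutive readings of 5.0 are flagged, while the varying readings before and after them are not.
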

detect(data: pandas.core.series.Series) → pandas.core.series.Series

Detect anomalies

Parameters

data (pd.Series) – Time series data with possible anomalies

Returns

Time series with bools, True == anomaly

Return type

pd.Series

fit(data: pandas.core.series.Series)

Set detector parameters based on data.

Parameters

data (pd.Series) – Normal time series data.

save(path: Union[str, pathlib.Path]) → None

Save a detector for later use.

Parameters

path (str or Path) – Path of the file to save the detector to.

validate(data: Union[pandas.core.series.Series, pandas.core.frame.DataFrame]) → Union[pandas.core.series.Series, pandas.core.frame.DataFrame]

Check that the input data is in the correct format and adjust it if necessary.

class tsod.ConstantGradientDetector(window_size: int = 3)

Detect constant gradients.

Typically caused by linear interpolation over a long interval.

Parameters

window_size (int) – Minimum window to consider as an anomaly, default 3
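
A constant gradient means that consecutive steps in the data are equal, which is what linear interpolation produces. The following pandas sketch illustrates the idea; it is not the library's implementation. The helper `flag_constant_gradient` and its `tol` parameter are hypothetical, and the sketch assumes equally spaced samples.

```python
import pandas as pd


def flag_constant_gradient(
    data: pd.Series, window_size: int = 3, tol: float = 1e-9
) -> pd.Series:
    """Flag points lying on a straight line of at least `window_size` points."""
    # Step-to-step change (assumes equally spaced samples).
    step = data.diff()
    # window_size points on a straight line have window_size - 1 equal steps.
    n_steps = window_size - 1
    spread = step.rolling(n_steps).max() - step.rolling(n_steps).min()
    line_end = spread < tol
    # A line ending at position i covers i - window_size + 1 .. i.
    flagged = line_end.copy()
    for shift in range(1, window_size):
        flagged |= line_end.shift(-shift, fill_value=False)
    return flagged


values = pd.Series([0.0, 3.0, 1.0, 2.0, 3.0, 4.0, 2.5, 6.0])
flags = flag_constant_gradient(values)
print(flags.tolist())
```

Here the points 1.0, 2.0, 3.0, 4.0 rise by exactly 1.0 per step and are flagged. Note that a run of constant values also has a constant (zero) gradient, so this sketch flags such runs as well.
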

detect(data: pandas.core.series.Series) → pandas.core.series.Series

Detect anomalies

Parameters

data (pd.Series) – Time series data with possible anomalies

Returns

Time series with bools, True == anomaly

Return type

pd.Series

fit(data: pandas.core.series.Series)

Set detector parameters based on data.

Parameters

data (pd.Series) – Normal time series data.

save(path: Union[str, pathlib.Path]) → None

Save a detector for later use.

Parameters

path (str or Path) – Path of the file to save the detector to.

validate(data: Union[pandas.core.series.Series, pandas.core.frame.DataFrame]) → Union[pandas.core.series.Series, pandas.core.frame.DataFrame]

Check that the input data is in the correct format and adjust it if necessary.

class tsod.GradientDetector(max_gradient=inf, direction='both')

Detect abrupt changes, measured as rate of change over time.

Parameters
  • max_gradient (float) – Maximum rate of change per second, default np.inf

  • direction (str) – 'positive', 'negative' or 'both', default 'both'
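
The rate of change per second can be sketched as the difference between consecutive values divided by the elapsed time. The snippet below is an illustration, not the library's implementation: the helper `flag_steep_gradient` is hypothetical, and it assumes the series has a `DatetimeIndex`.

```python
import pandas as pd


def flag_steep_gradient(
    data: pd.Series, max_gradient: float, direction: str = "both"
) -> pd.Series:
    """Flag points where the change per second exceeds `max_gradient`."""
    # Seconds elapsed since the previous sample (NaN for the first sample).
    dt = data.index.to_series().diff().dt.total_seconds()
    gradient = data.diff() / dt
    if direction == "positive":
        return gradient > max_gradient
    if direction == "negative":
        return gradient < -max_gradient
    return gradient.abs() > max_gradient


index = pd.to_datetime(
    [
        "2024-01-01 00:00:00",
        "2024-01-01 00:00:10",
        "2024-01-01 00:00:20",
        "2024-01-01 00:01:20",
    ]
)
series = pd.Series([0.0, 1.0, 6.0, 11.0], index=index)
flags = flag_steep_gradient(series, max_gradient=0.2)
print(flags.tolist())
```

The last two steps both change the value by 5.0, but only the one that happens within 10 seconds (0.5 per second) is flagged; the same change spread over 60 seconds stays below the limit.
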

detect(data: pandas.core.series.Series) → pandas.core.series.Series

Detect anomalies

Parameters

data (pd.Series) – Time series data with possible anomalies

Returns

Time series with bools, True == anomaly

Return type

pd.Series

fit(data: pandas.core.series.Series)

Set detector parameters based on data.

Parameters

data (pd.Series) – Normal time series data.

save(path: Union[str, pathlib.Path]) → None

Save a detector for later use.

Parameters

path (str or Path) – Path of the file to save the detector to.

validate(data: Union[pandas.core.series.Series, pandas.core.frame.DataFrame]) → Union[pandas.core.series.Series, pandas.core.frame.DataFrame]

Check that the input data is in the correct format and adjust it if necessary.

class tsod.DiffDetector(max_diff=inf, direction='both')

Detect sudden shifts in data, irrespective of the time axis.

Parameters
  • max_diff (float) – Maximum change threshold.

  • direction (str) – 'positive', 'negative' or 'both', default 'both'

See also

GradientDetector

similar functionality but considers actual time between data points
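
Ignoring the time axis means that only the difference between consecutive values matters. A minimal pandas sketch of that idea follows; it is not the library's implementation, and the helper `flag_large_diff` is hypothetical.

```python
import pandas as pd


def flag_large_diff(
    data: pd.Series, max_diff: float, direction: str = "both"
) -> pd.Series:
    """Flag points whose change from the previous point exceeds `max_diff`."""
    change = data.diff()  # NaN for the first point, which is never flagged
    if direction == "positive":
        return change > max_diff
    if direction == "negative":
        return change < -max_diff
    return change.abs() > max_diff


levels = pd.Series([10.0, 11.0, 20.0, 12.0, 13.0])
both = flag_large_diff(levels, max_diff=5.0)
positive = flag_large_diff(levels, max_diff=5.0, direction="positive")
negative = flag_large_diff(levels, max_diff=5.0, direction="negative")
print(both.tolist(), positive.tolist(), negative.tolist())
```

The jump from 11.0 to 20.0 is flagged with `direction="positive"`, the drop from 20.0 to 12.0 with `direction="negative"`, and both are flagged with `direction="both"`.
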

detect(data: pandas.core.series.Series) → pandas.core.series.Series

Detect anomalies

Parameters

data (pd.Series) – Time series data with possible anomalies

Returns

Time series with bools, True == anomaly

Return type

pd.Series

fit(data: pandas.core.series.Series)

Set detector parameters based on data.

Parameters

data (pd.Series) – Normal time series data.

save(path: Union[str, pathlib.Path]) → None

Save a detector for later use.

Parameters

path (str or Path) – Path of the file to save the detector to.

validate(data: Union[pandas.core.series.Series, pandas.core.frame.DataFrame]) → Union[pandas.core.series.Series, pandas.core.frame.DataFrame]

Check that the input data is in the correct format and adjust it if necessary.

class tsod.CombinedDetector(detectors)

Combine detectors.

It is possible to combine several anomaly detection strategies into a combined detector.

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from tsod import CombinedDetector, DiffDetector, RangeDetector
>>> normal_data = pd.Series(np.random.normal(size=100))
>>> abnormal_data = pd.Series(np.random.normal(size=100))
>>> abnormal_data[[2, 6, 15, 57, 60, 73]] = 5
>>> anomaly_detector = CombinedDetector([RangeDetector(), DiffDetector()])
>>> anomaly_detector.fit(normal_data)
>>> detected_anomalies = anomaly_detector.detect(abnormal_data)
count(value) → integer -- return number of occurrences of value

detect(data: pandas.core.series.Series) → pandas.core.series.Series

Detect anomalies

Parameters

data (pd.Series) – Time series data with possible anomalies

Returns

Time series with bools, True == anomaly

Return type

pd.Series

fit(data: pandas.core.series.Series)

Set detector parameters based on data.

Parameters

data (pd.Series) – Normal time series data.

index(value[, start[, stop]]) → integer -- return first index of value.

Raises ValueError if the value is not present.

Supporting start and stop arguments is optional, but recommended.

save(path: Union[str, pathlib.Path]) → None

Save a detector for later use.

Parameters

path (str or Path) – Path of the file to save the detector to.

validate(data: Union[pandas.core.series.Series, pandas.core.frame.DataFrame]) → Union[pandas.core.series.Series, pandas.core.frame.DataFrame]

Check that the input data is in the correct format and adjust it if necessary.

Hampel

class tsod.hampel.HampelDetector(window_size=5, threshold=3)

Hampel filter implementation that works on numpy arrays, implemented with numba.

Parameters
  • window_size (int) – The window range is [(i - window_size):(i + window_size)], so window_size is half of the window, counted in number of array elements (specifying a time span is not supported by this implementation).

  • threshold (float) – The threshold for marking an outlier, expressed as a number of standard deviations (n_sigmas), default 3.0. A lower threshold narrows the band within which values are accepted as normal, so more values are flagged as outliers.
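
To show how `window_size` and `threshold` interact, here is a minimal Hampel filter sketch in pandas and NumPy. It is not the numba implementation described above. The helper `hampel_flags` is hypothetical, and the sketch assumes a symmetric window of 2 * window_size + 1 points, shortened windows at the edges, and the usual factor 1.4826 that scales the median absolute deviation (MAD) to a standard-deviation estimate for Gaussian data.

```python
import numpy as np
import pandas as pd

MAD_TO_SIGMA = 1.4826  # scale factor for Gaussian data (assumption of this sketch)


def hampel_flags(
    data: pd.Series, window_size: int = 5, threshold: float = 3.0
) -> pd.Series:
    """Flag points far from the rolling median, measured in robust sigmas."""
    # window_size is the half-width, so the full window has 2 * window_size + 1 points.
    full_window = 2 * window_size + 1
    rolling = data.rolling(full_window, center=True, min_periods=1)
    median = rolling.median()
    # MAD: median of absolute deviations from the window median.
    mad = rolling.apply(lambda w: np.median(np.abs(w - np.median(w))), raw=True)
    return (data - median).abs() > threshold * MAD_TO_SIGMA * mad


values = pd.Series([1.0, 1.1, 0.9, 1.0, 10.0, 1.05, 0.95, 1.0, 1.1, 0.9])
flags = hampel_flags(values, window_size=2)
print(flags.tolist())
```

Only the spike of 10.0 is flagged in this sketch. Because the median and the MAD are robust statistics, the spike barely shifts the band around its neighbours, so they are not flagged.
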

detect(data: pandas.core.series.Series) → pandas.core.series.Series

Detect anomalies

Parameters

data (pd.Series) – Time series data with possible anomalies

Returns

Time series with bools, True == anomaly

Return type

pd.Series

fit(data: pandas.core.series.Series)

Set detector parameters based on data.

Parameters

data (pd.Series) – Normal time series data.

save(path: Union[str, pathlib.Path]) → None

Save a detector for later use.

Parameters

path (str or Path) – Path of the file to save the detector to.

validate(data: Union[pandas.core.series.Series, pandas.core.frame.DataFrame]) → Union[pandas.core.series.Series, pandas.core.frame.DataFrame]

Check that the input data is in the correct format and adjust it if necessary.