# Datetimes and timedeltas

Python has several ways of representing datetimes and timedeltas. This notebook shows the three most common ones and how to convert between them.

Our general advice: *use pandas whenever you can*.

In [None]:
from datetime import datetime, timedelta
import numpy as np
import pandas as pd

## Datetime/timestamp

The most common datetime representations in Python:

* [datetime.datetime](https://docs.python.org/3/library/datetime.html#datetime-objects) (Python built-in)
* [pd.Timestamp](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Timestamp.html)
* [np.datetime64](https://numpy.org/doc/stable/reference/arrays.datetime.html)

For string representations of datetimes use [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) (e.g. 2021-09-07T19:03:12Z) when possible.
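
As a minimal sketch of what this looks like in practice, the example below writes a timezone-aware datetime as an ISO 8601 string and reads ISO 8601 strings back with pandas. The variable names are illustrative only.

```python
from datetime import datetime, timezone

import pandas as pd

# A timezone-aware datetime in UTC
dt = datetime(2021, 9, 7, 19, 3, 12, tzinfo=timezone.utc)

# datetime -> ISO 8601 string (UTC is written as the offset +00:00)
iso = dt.isoformat()
print(iso)

# ISO 8601 string -> pd.Timestamp; the "Z" suffix is read as UTC
ts_from_z = pd.Timestamp("2021-09-07T19:03:12Z")
ts_from_offset = pd.Timestamp(iso)
print(ts_from_z == ts_from_offset)
```

Both spellings of UTC ("Z" and "+00:00") describe the same instant, so the two timestamps compare equal.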

See [Python Pandas For Your Grandpa - 4.2 Dates and Times](https://www.youtube.com/watch?v=2VyOsBTWLOI) for an 18-min video introduction to the three datetime representations (including time-zone handling).

### datetime.datetime

The built-in datetime representation is simple: you construct it from year, month, day and, optionally, hour, minute, second and microsecond.

In [None]:
dt_dt = datetime(2018,1,1,19,3,1)
dt_dt

### NumPy: np.datetime64

np.datetime64 is essentially an integer (np.int64) counting the time elapsed since the [epoch](https://en.wikipedia.org/wiki/Unix_time) 1970-01-01 00:00:00 in a specified **unit**, e.g. days, seconds or nanoseconds.

In [None]:
dt_np = np.datetime64('2018-01-01 19:03:01')  # implicitly [s]
dt_np

In [None]:
np.int64(dt_np)

In [None]:
np.datetime64('1970-01-01 00:00:00') + np.int64(dt_np)

In [None]:
dt_np.dtype.name

In [None]:
dt_np.astype(datetime)   # np.datetime64 -> datetime.datetime
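
To make the role of the unit concrete, here is a small sketch showing how the same kind of value is stored with different units. NumPy infers the unit from the string unless one is passed explicitly; the variable names are illustrative only.

```python
import numpy as np

d_day = np.datetime64("2018-01-01")                 # unit inferred as days [D]
d_sec = np.datetime64("2018-01-01 19:03:01")        # unit inferred as seconds [s]
d_ns = np.datetime64("2018-01-01 19:03:01", "ns")   # unit given explicitly [ns]

# The dtype carries the unit
print(d_day.dtype, d_sec.dtype, d_ns.dtype)

# The underlying integer is the count of units since 1970-01-01
print(d_day.astype("int64"))   # days since epoch
print(d_sec.astype("int64"))   # seconds since epoch
print(d_ns.astype("int64"))    # nanoseconds since epoch

# Values with different units still compare as points in time
print(d_sec == d_ns)
```

The unit therefore determines both the resolution and the meaning of the stored integer, which is worth checking before converting to or from plain integers.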

### Pandas: pd.Timestamp

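pd.Timestamp is built on np.datetime64, historically with nanosecond resolution ([ns]); pandas 2.0 and later also support second, millisecond and microsecond resolution. Pandas is good at recognizing various string representations of datetimes: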

In [None]:
dt_pd = pd.Timestamp("2018/8/1")    # equivalent to pd.to_datetime()
dt_pd

In [None]:
dt_pd.to_numpy()         # pd.Timestamp -> np.datetime64

In [None]:
dt_pd.to_pydatetime()    # pd.Timestamp -> datetime.datetime

In [None]:
pd.Timestamp(dt_np)      # np.datetime64 -> pd.Timestamp

In [None]:
pd.Timestamp(dt_dt)      # datetime.datetime -> pd.Timestamp
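
As a quick illustration of the string recognition mentioned above, the sketch below parses a few common spellings of the same date and checks that they give the same pd.Timestamp. The list of spellings is only a sample, not an exhaustive specification of what pandas accepts.

```python
import pandas as pd

# Different spellings of 1 August 2018
candidates = [
    "2018-08-01",
    "2018/8/1",
    "Aug 1, 2018",
    "2018-08-01T00:00:00",
]

parsed = [pd.Timestamp(s) for s in candidates]
for text, ts in zip(candidates, parsed):
    print(f"{text!r:>24} -> {ts}")

# All spellings resolve to the same point in time
all_equal = all(ts == parsed[0] for ts in parsed)
print(all_equal)
```

Ambiguous strings such as "01/08/2018" depend on day-first or month-first conventions, which is one more reason to prefer ISO 8601 where possible.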

## Timedeltas

We often need to represent differences between two timestamps. The most common representations are: 

* [datetime.timedelta](https://docs.python.org/3/library/datetime.html#timedelta-objects)
* [pd.Timedelta](https://pandas.pydata.org/docs/reference/api/pandas.Timedelta.html)
* [np.timedelta64](https://numpy.org/doc/stable/reference/arrays.datetime.html)

These correspond to the three datetime representations above.

### datetime.timedelta

The Python built-in way of working with differences between two datetimes.

In [None]:
del_dt = timedelta(days=6)
del_dt

In [None]:
dt_dt + del_dt    # datetime.datetime + datetime.timedelta

In [None]:
dt_dt2 = datetime(2018,2,3,11,3,1)
dt_dt2 - dt_dt    # datetime.datetime - datetime.datetime  

### NumPy: np.timedelta64

np.timedelta64 is an int64 in a specific unit e.g. seconds or nanoseconds.

In [None]:
dt_np2 = np.datetime64('2018-02-02 16:21:11')
del_np = dt_np2 - dt_np     # np.datetime64 - np.datetime64
del_np

In [None]:
dt_np + del_np

In [None]:
np.int64(del_np), del_np.dtype.name

In [None]:
del_np.astype(timedelta)    # np.timedelta64 -> datetime.timedelta
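
Because an np.timedelta64 is an integer count of a unit, a common need is to express it in another unit. The sketch below shows two ways, assuming the same two timestamps as above: dividing by a one-unit timedelta gives a float, while `astype` to a coarser unit keeps an integer and drops the remainder.

```python
import numpy as np

start = np.datetime64("2018-01-01 19:03:01")
end = np.datetime64("2018-02-02 16:21:11")
delta = end - start                      # timedelta64[s]

seconds = delta.astype("int64")          # raw integer count of seconds
hours = delta / np.timedelta64(1, "h")   # float number of hours
days = delta / np.timedelta64(1, "D")    # float number of days

# Casting to a coarser unit keeps only the whole units
whole_days = delta.astype("timedelta64[D]")

print(seconds, hours, days, whole_days)
```

Division is usually the safer choice when fractional units matter, since the cast silently discards the part smaller than one unit.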

### Pandas: pd.Timedelta

In [None]:
dt_pd2 = pd.Timestamp("2018/8/4 23:01:03")
del_pd = dt_pd2 - dt_pd     # pd.Timestamp - pd.Timestamp -> pd.Timedelta
del_pd

In [None]:
dt_pd + del_pd

In [None]:
del_pd.total_seconds()

In [None]:
print(pd.Timedelta(del_dt))     # datetime.timedelta -> pd.Timedelta
print(pd.Timedelta(del_np))     # np.timedelta64 -> pd.Timedelta

In [None]:
print(del_pd.to_pytimedelta())  # pd.Timedelta -> datetime.timedelta
print(del_pd.to_timedelta64())  # pd.Timedelta -> np.timedelta64

## Datetime ranges

Pandas is very powerful for vectors of datetimes. Use the pd.date_range() function to create a pd.DatetimeIndex:

In [None]:
dti = pd.date_range('2018', periods=8, freq='5D')
dti

In [None]:
tdi = dti - dti[0]
tdi

In [None]:
tdi.total_seconds().to_numpy()
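
pd.date_range() can be specified in more than one way. As a small sketch, the example below builds the same index once from a start and a number of periods, and once from a start and an end; the variable names are illustrative only.

```python
import pandas as pd

# start + number of periods + step
by_periods = pd.date_range("2018-01-01", periods=8, freq="5D")

# start + end + step (the end is included when it falls on a step)
by_end = pd.date_range("2018-01-01", "2018-02-05", freq="5D")

print(by_periods)
print(by_periods.equals(by_end))

# Subtracting a timestamp gives a TimedeltaIndex, here converted to days
elapsed_days = (by_periods - by_periods[0]).total_seconds() / 86400
print(elapsed_days.to_numpy())
```

Giving the number of periods is convenient when the length of the series is known, while giving the end is convenient when the time span is known.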

### Slicing with DatetimeIndex

In [None]:
df = pd.DataFrame(np.ones(8), index=dti, columns=['one'])
df

In [None]:
df.loc["2018-01-05":"2018-01-21"]  # note: the end of the slice is included

In [None]:
df.loc["2018-01-26":]

In [None]:
df.loc["2018-02"]
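
The inclusive end of label-based slicing is easy to overlook, so here is a short sketch contrasting it with position-based slicing on a frame like the one above. The frame is rebuilt so that the example is self-contained.

```python
import numpy as np
import pandas as pd

dti = pd.date_range("2018-01-01", periods=8, freq="5D")
df = pd.DataFrame({"one": np.ones(8)}, index=dti)

# Label-based slice: both endpoints are included (Jan 6, 11, 16 and 21)
by_label = df.loc["2018-01-06":"2018-01-21"]

# Position-based slice: the end position is excluded (Jan 6, 11 and 16)
by_position = df.iloc[1:4]

# Partial string indexing: all rows in February 2018
february = df.loc["2018-02"]

print(len(by_label), len(by_position), len(february))
```

The slice bounds do not need to be present in the index: on a sorted DatetimeIndex, `.loc` selects every row whose timestamp falls within the given bounds.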