Datetimes and timedeltas

Python has several ways of representing datetimes and timedeltas. This notebook shows the three most common ways and how to convert between them.

Our general advice: use pandas whenever you can.

from datetime import datetime, timedelta
import numpy as np
import pandas as pd

Datetime/timestamp

The most common datetime representations in Python are datetime.datetime (built-in), np.datetime64 (NumPy) and pd.Timestamp (pandas).

For string representations of datetimes use ISO 8601 (e.g. 2021-09-07T19:03:12Z) when possible.
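As a small illustration of the ISO 8601 advice, the sketch below formats a UTC datetime as an ISO 8601 string and parses the example string with pandas. The variable names are illustrative only; note that isoformat() writes UTC as +00:00 rather than Z.

```python
from datetime import datetime, timezone

import pandas as pd

# Format a timezone-aware datetime as an ISO 8601 string
dt = datetime(2021, 9, 7, 19, 3, 12, tzinfo=timezone.utc)
iso_str = dt.isoformat()   # UTC offset is written as +00:00

# pandas parses the trailing 'Z' (UTC) directly and returns a tz-aware Timestamp
ts = pd.Timestamp("2021-09-07T19:03:12Z")

# Both objects describe the same instant
same_instant = ts == pd.Timestamp(dt)
print(iso_str, ts, same_instant)
```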

See Python Pandas For Your Grandpa - 4.2 Dates and Times for an 18-min video introduction to the three datetime representations (including time-zone handling).

datetime.datetime

The built-in datetime representation is quite simple.

dt_dt = datetime(2018,1,1,19,3,1)
dt_dt
datetime.datetime(2018, 1, 1, 19, 3, 1)

NumPy: np.datetime64

np.datetime64 is essentially an integer (np.int64) counting the time elapsed since the epoch 1970-01-01 00:00:00 in a specified unit, e.g. days, seconds or nanoseconds.

dt_np = np.datetime64('2018-01-01 19:03:01')  # implicitly [s]
dt_np
numpy.datetime64('2018-01-01T19:03:01')
np.int64(dt_np)
1514833381
np.datetime64('1970-01-01 00:00:00') + np.int64(dt_np)
numpy.datetime64('2018-01-01T19:03:01')
dt_np.dtype.name
'datetime64[s]'
dt_np.astype(datetime)   # np.datetime64 -> datetime.datetime
datetime.datetime(2018, 1, 1, 19, 3, 1)
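Because the unit is part of the dtype, the same instant can be stored at different resolutions. The following sketch (variable names are illustrative) converts between units and shows one thing to watch for: as far as we know, converting a nanosecond-resolution value with astype(datetime) yields a plain integer rather than a datetime.datetime, since datetime.datetime only resolves microseconds.

```python
from datetime import datetime

import numpy as np

dt_s = np.datetime64("2018-01-01T19:03:01")   # unit inferred from the string: seconds
dt_day = dt_s.astype("datetime64[D]")          # truncated to whole days
dt_ns = dt_s.astype("datetime64[ns]")          # same instant, nanosecond unit

secs = dt_s.astype("int64")     # seconds since the epoch
nanos = dt_ns.astype("int64")   # nanoseconds since the epoch

as_dt = dt_s.astype(datetime)    # second resolution -> datetime.datetime
as_int = dt_ns.astype(datetime)  # nanosecond resolution -> plain int

print(dt_day, secs, nanos, type(as_dt).__name__, type(as_int).__name__)
```

If a datetime.datetime is needed, one option is to cast to a coarser unit such as datetime64[us] or datetime64[s] first.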

Pandas: pd.Timestamp

pd.Timestamp uses np.datetime64[ns] under the hood. Pandas is good at recognizing various string representations of datetimes:

dt_pd = pd.Timestamp("2018/8/1")    # equivalent to pd.to_datetime()
dt_pd
Timestamp('2018-08-01 00:00:00')
dt_pd.to_numpy()         # pd.Timestamp -> np.datetime64
numpy.datetime64('2018-08-01T00:00:00.000000000')
dt_pd.to_pydatetime()    # pd.Timestamp -> datetime.datetime
datetime.datetime(2018, 8, 1, 0, 0)
pd.Timestamp(dt_np)      # np.datetime64 -> pd.Timestamp
Timestamp('2018-01-01 19:03:01')
pd.Timestamp(dt_dt)      # datetime.datetime -> pd.Timestamp
Timestamp('2018-01-01 19:03:01')
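To illustrate the string recognition mentioned above, this sketch parses a few spellings of the same date and checks that they all give the same Timestamp. The list of spellings is our own choice and far from exhaustive; ambiguous day/month orders are deliberately avoided.

```python
import pandas as pd

# Several spellings of 1 August 2018 (illustrative, not exhaustive)
candidates = ["2018-08-01", "2018/8/1", "1 Aug 2018", "August 1, 2018", "20180801"]

parsed = [pd.Timestamp(s) for s in candidates]
reference = pd.Timestamp(2018, 8, 1)   # year, month, day

all_equal = all(ts == reference for ts in parsed)
print(all_equal)
```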

Timedeltas

We often need to represent differences between two timestamps. The most common representations are datetime.timedelta, np.timedelta64 and pd.Timedelta, which correspond to the three datetime representations above.

datetime.timedelta

The Python built-in way of working with differences between two datetimes.

del_dt = timedelta(days=6)
del_dt
datetime.timedelta(days=6)
dt_dt + del_dt    # datetime.datetime + datetime.timedelta
datetime.datetime(2018, 1, 7, 19, 3, 1)
dt_dt2 = datetime(2018,2,3,11,3,1)
dt_dt2 - dt_dt    # datetime.datetime - datetime.datetime  
datetime.timedelta(days=32, seconds=57600)
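A datetime.timedelta stores days, seconds and microseconds separately, so a single number usually has to be derived. A minimal sketch using the same two datetimes as above (variable names are illustrative):

```python
from datetime import datetime, timedelta

delta = datetime(2018, 2, 3, 11, 3, 1) - datetime(2018, 1, 1, 19, 3, 1)

total_s = delta.total_seconds()      # whole duration in seconds (float)
hours = delta / timedelta(hours=1)   # dividing two timedeltas gives a float

# The stored components: whole days plus the remaining seconds
days, remainder_s = delta.days, delta.seconds
print(total_s, hours, days, remainder_s)
```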

NumPy: np.timedelta64

np.timedelta64 is an int64 count of a specific unit, e.g. seconds or nanoseconds.

dt_np2 = np.datetime64('2018-02-02 16:21:11')
del_np = dt_np2 - dt_np     # np.datetime64 - np.datetime64
del_np
numpy.timedelta64(2755090,'s')
dt_np + del_np
numpy.datetime64('2018-02-02T16:21:11')
np.int64(del_np), np.dtype(del_np).name
(2755090, 'timedelta64[s]')
del_np.astype(timedelta)    # np.timedelta64 -> datetime.timedelta
datetime.timedelta(days=31, seconds=76690)
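To express an np.timedelta64 in another unit, one option is to divide by a unit timedelta, which gives a float; casting with astype instead keeps an integer and truncates. A short sketch with the same difference as above (variable names are illustrative):

```python
import numpy as np

del_np = np.datetime64("2018-02-02 16:21:11") - np.datetime64("2018-01-01 19:03:01")

# Division by a unit timedelta returns a float number of that unit
hours = del_np / np.timedelta64(1, "h")

# Casting to a coarser unit keeps an integer and drops the remainder
whole_minutes = del_np.astype("timedelta64[m]")
print(hours, whole_minutes)
```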

Pandas: pd.Timedelta

dt_pd2 = pd.Timestamp("2018/8/4 23:01:03")
del_pd = dt_pd2 - dt_pd     # pd.Timestamp - pd.Timestamp
del_pd
Timedelta('3 days 23:01:03')
dt_pd + del_pd
Timestamp('2018-08-04 23:01:03')
del_pd.total_seconds()
342063.0
print(pd.Timedelta(del_dt))     # datetime.timedelta -> pd.Timedelta
print(pd.Timedelta(del_np))     # np.timedelta64 -> pd.Timedelta
6 days 00:00:00
31 days 21:18:10
print(del_pd.to_pytimedelta())  # pd.Timedelta -> datetime.timedelta
print(del_pd.to_timedelta64())  # pd.Timedelta -> np.timedelta64
3 days, 23:01:03
342063000000000 nanoseconds
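A pd.Timedelta can also be built directly from a string or from keyword arguments, and split into its parts afterwards. The sketch below reuses the value from above; the variable names are illustrative.

```python
import pandas as pd

td = pd.Timedelta("3 days 23:01:03")                           # from a string
td_kw = pd.Timedelta(days=3, hours=23, minutes=1, seconds=3)   # from keywords

parts = td.components               # named tuple: days, hours, minutes, seconds, ...
hours = td / pd.Timedelta(hours=1)  # duration as a float number of hours
print(td == td_kw, parts.days, parts.hours, hours)
```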

Datetime ranges

Pandas is very powerful for vectors of datetimes. Use the pd.date_range() function to create a pd.DatetimeIndex:

dti = pd.date_range('2018', periods=8, freq='5D')
dti
DatetimeIndex(['2018-01-01', '2018-01-06', '2018-01-11', '2018-01-16',
               '2018-01-21', '2018-01-26', '2018-01-31', '2018-02-05'],
              dtype='datetime64[ns]', freq='5D')
tdi = dti - dti[0]
tdi
TimedeltaIndex([ '0 days',  '5 days', '10 days', '15 days', '20 days',
                '25 days', '30 days', '35 days'],
               dtype='timedelta64[ns]', freq=None)
tdi.total_seconds().to_numpy()
array([      0.,  432000.,  864000., 1296000., 1728000., 2160000.,
       2592000., 3024000.])
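pd.date_range() can also be given a start and an end instead of a number of periods. In the sketch below the frequency is passed as a pd.Timedelta, which is one of several accepted forms; the variable names are illustrative.

```python
import pandas as pd

# Both end points are included when they fall on the frequency grid
dti_6h = pd.date_range("2018-01-01", "2018-01-02", freq=pd.Timedelta(hours=6))

# Elapsed time since the first timestamp, as float hours
elapsed_hours = (dti_6h - dti_6h[0]) / pd.Timedelta(hours=1)
print(len(dti_6h), list(elapsed_hours))
```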

Slicing with DatetimeIndex

df = pd.DataFrame(np.ones(8), index=dti, columns=['one'])
df
            one
2018-01-01  1.0
2018-01-06  1.0
2018-01-11  1.0
2018-01-16  1.0
2018-01-21  1.0
2018-01-26  1.0
2018-01-31  1.0
2018-02-05  1.0
df.loc["2018-01-05":"2018-01-21"]  # note: unlike positional slicing, the end of the slice is included
            one
2018-01-06  1.0
2018-01-11  1.0
2018-01-16  1.0
2018-01-21  1.0
df.loc["2018-01-26":]
            one
2018-01-26  1.0
2018-01-31  1.0
2018-02-05  1.0
df.loc["2018-02"]
            one
2018-02-05  1.0
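The inclusive end point of label-based slicing is easy to overlook, so here is a small comparison with positional slicing. The boolean mask at the end is one possible way to get a half-open interval; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

dti = pd.date_range("2018", periods=8, freq="5D")
df = pd.DataFrame(np.ones(8), index=dti, columns=["one"])

# Label-based slicing includes both end points
by_label = df.loc["2018-01-06":"2018-01-21"]

# Positional slicing excludes the end, like ordinary Python slicing
by_position = df.iloc[1:4]

# A boolean mask gives explicit control, here a half-open interval [start, end)
start, end = pd.Timestamp("2018-01-06"), pd.Timestamp("2018-01-21")
half_open = df[(df.index >= start) & (df.index < end)]
print(len(by_label), len(by_position), len(half_open))
```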