Dates and times¶
The datetime
module, which is part of the standard library, is used for working with dates and times
in Python. This module provides a number of classes and functions for working with dates and times,
and is the basis for the date and time functionality in pandas.
The datetime
module provides three classes for working with dates and times:
datetime.date
- for working with dates in isolation (which look likeYYYY-MM-DD
)datetime.time
- for working with times in isolation (which look likeHH:MM:SS
)datetime.datetime
- for working with dates and times together (which look likeYYYY-MM-DD HH:MM:SS
)
The datetime
module also provides a datetime.timedelta
class for representing
durations of time.
Importing the datetime module¶
We can import the datetime
module with the following statement:
The datetime module name and this class name are the same (that's unfortunate), so we can't import the class and module name directly without causing a name conflict. However, as always we can import only the classes and functions we need from the module:
Creating date and time objects¶
We can create date
, time
and datetime
objects using the date()
, time()
and datetime()
constructors
respectively. Each of these constructors takes a number of arguments, which are used to initialise the object.
Creating date objects¶
The date()
constructor takes three arguments:
year
- the year as an integermonth
- the month as an integer (1-12)day
- the day as an integer (1-31)
Creating time objects¶
The time()
constructor takes four arguments:
hour
- the hour as an integer (0-23)minute
- the minute as an integer (0-59)second
- the second as an integer (0-59)microsecond
- the microsecond as an integer (0-999999)
Creating datetime objects¶
The datetime()
constructor takes seven arguments:
year
- the year as an integermonth
- the month as an integer (1-12)day
- the day as an integer (1-31)hour
- the hour as an integer (0-23)minute
- the minute as an integer (0-59)second
- the second as an integer (0-59)microsecond
- the microsecond as an integer (0-999999)
Note
Only the year
, month
and day
arguments are required. The other arguments default to 0
.
The datetime()
constructor also takes a tzinfo
argument, which is used to specify the time zone
of the datetime
object. This parameter is also optional, but if you don't specify a tzinfo
argument (known as naive time), then the datetime
object will be created in the local time zone. This
is usually a bad idea that can lead to many problems
We will discuss time zones in more detail in the time zones section.
Working with date and time objects¶
Once we have created a date
, time
or datetime
object, we can access the individual components
of the object using the following attributes:
year
- the year as an integermonth
- the month as an integer (1-12)day
- the day as an integer (1-31)hour
- the hour as an integer (0-23)minute
- the minute as an integer (0-59)second
- the second as an integer (0-59)microsecond
- the microsecond as an integer (0-999999)tzinfo
- the time zone as atzinfo
object
from datetime import date
d = date(2018, 1, 1)
print(d.year)
print(d.month)
print(d.day)
# Output:
# 2018
# 1
# 1
We can also use the strftime()
method to format a date
, time
or datetime
object as a string:
Parsing date and time strings¶
To do the opposite, and parse a string into a date
, time
or datetime
object, we can use the strptime()
function.
The strptime()
function takes two arguments:
date_string
- the string to parseformat
- the format of the string to parse
The format
argument is a string that specifies the format of the string to parse. The format string
uses the same directives as the strftime()
method.
from datetime import datetime
str_date = '2018-01-01 12:30:00'
dt = datetime.strptime(str_date, '%Y-%m-%d %H:%M:%S')
print(dt)
# Output: 2018-01-01 12:30:00
Difference between two dates or times¶
To calculate the difference between two date
, time
or datetime
objects, we can use the -
operator.
The result of this operation is a timedelta
object, which represents the time difference between the two objects.
from datetime import date
d1 = date(2018, 1, 1)
d2 = date(2018, 1, 2)
dt = d2 - d1
print(dt)
# Output: 1 day, 0:00:00
Timedeltas can be added to or subtracted from date
, time
or datetime
objects using the +
and -
operators.
from datetime import date, timedelta
d = date(2018, 1, 1)
dt = d + timedelta(days=1)
print(dt)
# Output: 2018-01-02
Note
However, we can't sum two different date
, time
or datetime
objects together. The sum only works
when adding a timedelta
to a date
, time
or datetime
object.
Working with time zones¶
The datetime
module provides a tzinfo
class for representing time zones. However, this class
is an abstract base class (i.e., its methods are empty), and to use it we would need to implement
it. The pytz
module (a module that needs to be installed, since it doesn't come with Python)
provides a concrete implementation of the tzinfo
class, which we can use to represent time zones.
For example, to define a datetime
object in the UTC time zone and convert it to european summer time,
we can do the following:
from datetime import datetime
import pytz
dt_1 = datetime(2018, 1, 1, 12, 30, 0, 0, tzinfo=pytz.utc)
dt_2 = dt_1.astimezone(pytz.timezone('Europe/Paris'))
print(dt_1)
print(dt_2)
# Output:
# 2018-01-01 12:30:00+00:00
# 2018-01-01 13:30:00+01:00
In this example, the astimezone()
method converts a datetime
object from one time zone to another.
Working with dates and times in pandas¶
The pandas library provides a number of data structures for working with dates and times. The most
commonly used are the Timestamp
and DatetimeIndex
classes.
The Timestamp class¶
The Timestamp
class represents a single date and time, and is very similar to the datetime
class.
To create a Timestamp
object, we can use the to_datetime()
function, which takes a string or
a number of arguments, which are used to initialise the object.
import pandas as pd
df = pd.DataFrame({'date': ['2018-01-01 12:30:00']})
df['date'] = pd.to_datetime(df['date'])
print(df['date'].dtype)
# Output: datetime64[ns]
This class provides a number of methods for working with dates and times. For example, we can use the
strftime()
method to format a Timestamp
object as a string (like the datetime
class), or we can
use the year
, month
, day
, hour
, minute
, second
and microsecond
attributes to access the
individual components of the object.
import pandas as pd
df = pd.DataFrame({'date': ['2018-01-01 12:30:00']})
df['date'] = pd.to_datetime(df['date'])
print(df['date'].dt.weekday_name)
# Output: 0 Monday
# Name: date, dtype: object
The DatetimeIndex class¶
When we have a date or datetime column in a pandas DataFrame, we can set is as the index of the DataFrame
using the set_index()
method. This will create a DatetimeIndex
object, which is used to index the
DataFrame.
Note
Using the dates as index is only possible if the dates are unique, and is only useful if we want to select rows by date. Otherwise, we can just leave the dates as a normal column in the DataFrame.
import pandas as pd
df = pd.DataFrame({'date': ['2018-01-01 12:30:00', '2018-01-02 12:30:00', '2018-01-03 12:30:00']})
df.set_index('date', inplace=True)
print(df.index)
# Output
# DatetimeIndex(['2018-01-01 12:30:00', '2018-01-02 12:30:00',
# '2018-01-03 12:30:00'],
# dtype='datetime64[ns]', name='date', freq=None)
Note
Setting the index of a DataFrame to a DatetimeIndex
object is a very common operation, so pandas
provides a parse_dates
argument for the read_csv()
function, which can be used to automatically
parse date and time columns into a DatetimeIndex
object.
We can also create a DatetimeIndex
with the pd.date_range()
function. This function takes a number
of arguments, which are used to create a range of dates. For example, to create a DatetimeIndex
with
the dates from 2018-01-01 to 2018-01-10, we can do the following:
import pandas as pd
index = pd.date_range('2018-01-01', '2018-01-20', freq='W')
print(index)
# Output:
# DatetimeIndex(['2018-01-07', '2018-01-14'], dtype='datetime64[ns]', freq='W-SUN')
Note
The freq
argument specifies the frequency of the dates. In this example, we have used W
to
specify weekly frequency, and W-SUN
to specify weekly frequency with the week ending on Sunday.