11.5 Converting a “Regular” pandas DataFrame into a Time Series

Most datasets included in this book are cross-sectional, meaning that they capture information taken at a single point in time. Time series data, by contrast, is longitudinal, meaning that its observations are indexed to particular time stamps.

Let’s import lobsterland_2020, a dataset that contains information from the park’s operations during the 2020 season.

When we import this dataset without specifying an index column, pandas labels the rows with sequential integers from 0 to 105.
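As a minimal sketch of that default behavior, the snippet below reads a small stand-in for the dataset. The real file is not reproduced here, so the in-memory CSV text, the "MM/DD" date format, and the VISITORS column are all assumptions made for illustration; only the 'DATE' column name and the 106 daily rows come from the text.

```python
import io
import pandas as pd

# Stand-in for the real file (hypothetical contents): 106 daily rows,
# with a year-less 'DATE' string and one made-up numeric column.
days = pd.date_range("2020-05-25", "2020-09-07", freq="D")
csv_text = pd.DataFrame({
    "DATE": days.strftime("%m/%d"),                   # assumed format, no year
    "VISITORS": list(range(1000, 1000 + len(days))),  # hypothetical column
}).to_csv(index=False)

# Reading without index_col gives pandas' default integer index.
lobsterland = pd.read_csv(io.StringIO(csv_text))
print(lobsterland.index)  # RangeIndex(start=0, stop=106, step=1)
```

With the real dataset, the `io.StringIO(csv_text)` argument would presumably be replaced by the path to the file.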

When we peek at the dataset with the head() method, we can see that each observation represents one day’s worth of operations. The ‘DATE’ variable is not the index yet, however; it is just listed alongside all of the other columns in the dataframe.
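A quick way to confirm this is to look at the first rows and at the list of column names. The sketch below again uses a hypothetical stand-in dataframe (the "MM/DD" format and the VISITORS column are assumptions, not the book's actual data).

```python
import pandas as pd

# Hypothetical stand-in: one row per day, 'DATE' stored as a plain column.
days = pd.date_range("2020-05-25", "2020-09-07", freq="D")
lobsterland = pd.DataFrame({
    "DATE": days.strftime("%m/%d"),                   # assumed format, no year
    "VISITORS": list(range(1000, 1000 + len(days))),  # hypothetical column
})

# head() shows the first five rows; the leftmost labels are 0-4, not dates.
print(lobsterland.head())

# 'DATE' appears among the ordinary columns.
print(lobsterland.columns.tolist())  # ['DATE', 'VISITORS']
```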

The ‘DATE’ values do not include a year, so let’s add year information first, to make each one a complete date.
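One way this step might look is sketched below. It assumes the year-less dates are stored as "MM/DD" strings, which may not match the actual file; if the real format differs, the appended text and the `format` string would need to change accordingly.

```python
import pandas as pd

# Hypothetical stand-in with year-less date strings such as "05/25".
days = pd.date_range("2020-05-25", "2020-09-07", freq="D")
lobsterland = pd.DataFrame({
    "DATE": days.strftime("%m/%d"),                   # assumed format, no year
    "VISITORS": list(range(1000, 1000 + len(days))),  # hypothetical column
})

# Append the year to each string, then parse the result as a full date.
lobsterland["DATE"] = pd.to_datetime(
    lobsterland["DATE"] + "/2020", format="%m/%d/%Y"
)
print(lobsterland["DATE"].iloc[0])  # 2020-05-25 00:00:00
```

Passing an explicit `format` keeps pandas from guessing how to read the strings, which matters for ambiguous values such as "06/07".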

Next, we instruct pandas to set the ‘DATE’ column as the index for this dataframe. When we call the info() method, we can now see that the index is a series of 106 dates, spanning from May 25, 2020 to September 7, 2020.
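A minimal sketch of this step follows. It uses a hypothetical stand-in dataframe whose 'DATE' column already holds complete dates (that is, the year has already been added); the VISITORS column is made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in: 'DATE' already converted to full datetime values.
days = pd.date_range("2020-05-25", "2020-09-07", freq="D")
lobsterland = pd.DataFrame({
    "DATE": days,
    "VISITORS": list(range(1000, 1000 + len(days))),  # hypothetical column
})

# set_index returns a new dataframe, so we assign the result back.
lobsterland = lobsterland.set_index("DATE")

# The summary now describes a DatetimeIndex rather than a RangeIndex.
lobsterland.info()
print(lobsterland.index.min(), lobsterland.index.max())
```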

Now, when we view the dataframe, note how it looks different from before: the dates appear as the row labels on the left, in place of the integers 0 to 105.
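The difference can also be checked programmatically. In the sketch below (again on a hypothetical stand-in, with a made-up VISITORS column), 'DATE' is the name of the index and no longer appears among the columns.

```python
import pandas as pd

# Hypothetical stand-in, with 'DATE' already set as the index.
days = pd.date_range("2020-05-25", "2020-09-07", freq="D")
lobsterland = pd.DataFrame({
    "DATE": days,
    "VISITORS": list(range(1000, 1000 + len(days))),  # hypothetical column
}).set_index("DATE")

# The dates are now the row labels in the printed view.
print(lobsterland.head())

print(lobsterland.index.name)         # DATE
print("DATE" in lobsterland.columns)  # False
```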

Alternatively, if we already know which column contains the date information, we can handle both steps at once, parsing the dates and setting them as the index, when we read the dataset into our environment.
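With `read_csv`, this is typically done through the `index_col` and `parse_dates` parameters. The sketch below assumes, for simplicity, a stand-in file whose 'DATE' values are already complete ISO dates such as "2020-05-25"; year-less dates like those described earlier would still need the year added before or during parsing. The CSV text and the VISITORS column are hypothetical.

```python
import io
import pandas as pd

# Hypothetical stand-in CSV with complete ISO dates (an assumption).
days = pd.date_range("2020-05-25", "2020-09-07", freq="D")
csv_text = pd.DataFrame({
    "DATE": days.strftime("%Y-%m-%d"),
    "VISITORS": list(range(1000, 1000 + len(days))),  # hypothetical column
}).to_csv(index=False)

# Parse 'DATE' as dates and use it as the index in a single call.
lobsterland = pd.read_csv(
    io.StringIO(csv_text),
    index_col="DATE",
    parse_dates=["DATE"],
)
print(type(lobsterland.index).__name__)  # DatetimeIndex
```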