11.2 Time Series Data Overview
To perform forecasting in Python, we will need to use time series data. Time series data involves measurements that contain both numeric values and a meaningful timestamp associated with each of those values. That timestamp might be a date, a week, a month, a year, or any other repeating period. By including information about not only what happened, but also about when it happened, time series data lets us better understand a particular phenomenon that we want to study across a period, such as stock price movement, weather patterns, or demographic trends.
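As a minimal sketch of what this looks like in practice, the example below stores a small time series as a pandas Series whose index holds the timestamps. The use of pandas, the variable names, and the price values are all assumptions made for illustration; they do not come from a real dataset.

```python
import pandas as pd

# Hypothetical daily closing prices; the values are invented for illustration.
dates = pd.date_range(start="2024-01-01", periods=5, freq="D")
prices = pd.Series([101.2, 102.5, 101.9, 103.4, 104.0], index=dates, name="close")

# The index records when each value was observed; the values record what was observed.
print(prices)

# Because the index is made of timestamps, we can select a range of dates directly.
# Date-based slicing includes both endpoints.
first_three_days = prices.loc["2024-01-01":"2024-01-03"]

# We can also measure how the value changed from one period to the next.
daily_change = prices.diff()
print(daily_change)
```

Keeping the timestamp in the index, rather than in an ordinary column, is one common design choice: it lets date-based selection and period-to-period calculations work without extra bookkeeping.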
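A time series is sometimes referred to as “ordered data” because the sequence of its observations carries meaning: rearranging the rows would destroy the information about how the values change over time. Most other datasets that we typically encounter, including the majority of those in this book, are cross-sectional data, which describe many subjects at a single point in time and can be reordered without losing information.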
Before we go much further into time series analysis, let’s take a moment to define some important time series-related terms:
Level: The average value of the time series.
Trend: The direction of movement in the data across time. A time series can show an uptrend, a downtrend, or no trend. Note that the term ‘trend’ has no single, precise definition.
Seasonality: A pattern of activity in a time series that is repeated at regular intervals.
Cyclicality: A pattern of activity in a time series that repeats, but at irregular intervals rather than on a fixed schedule.
Noise: The random variation in a time series that is caused by either measurement error or irregular movement in the value being measured.
Autocorrelation: A phenomenon that occurs when values in a time series are correlated with earlier values of the same series, such as the value one period before.
Stationarity: A time series is said to exhibit stationarity when its mean, variance, and autocorrelation remain constant over time.
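To make several of these terms concrete, the sketch below builds a synthetic monthly series by adding together a baseline, a trend, a seasonal pattern, and noise, and then measures its level and autocorrelation. All of the numbers (the baseline of 100, the slope, the 12-month period, the noise scale) and all of the variable names are assumptions chosen for illustration. Cyclicality is not simulated here, and comparing the two halves of the series is only a rough look at stationarity, not a formal test.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)  # fixed seed so the sketch is repeatable

n_months = 48
index = pd.date_range(start="2020-01-01", periods=n_months, freq="MS")
t = np.arange(n_months)

baseline = 100.0                                    # assumed starting value
trend = 0.5 * t                                     # steady uptrend
seasonality = 10.0 * np.sin(2 * np.pi * t / 12)     # repeats every 12 months
noise = rng.normal(loc=0.0, scale=2.0, size=n_months)  # random variation

series = pd.Series(baseline + trend + seasonality + noise, index=index)

# Level: the average value of the series.
level = series.mean()

# Autocorrelation: correlation between the series and a lagged copy of itself.
lag_1 = series.autocorr(lag=1)    # each month vs. the month before
lag_12 = series.autocorr(lag=12)  # each month vs. the same month a year earlier

# Rough stationarity check: a stationary series would have a similar mean
# in both halves. The uptrend here pushes the second-half mean higher.
first_half_mean = series.iloc[: n_months // 2].mean()
second_half_mean = series.iloc[n_months // 2 :].mean()

print(f"level: {level:.2f}")
print(f"lag-1 autocorrelation: {lag_1:.2f}")
print(f"lag-12 autocorrelation: {lag_12:.2f}")
print(f"first-half mean: {first_half_mean:.2f}, second-half mean: {second_half_mean:.2f}")
```

Because the series is built with an uptrend, its mean shifts over time, so it does not meet the definition of stationarity given above.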