Hands-On Time Series Analysis with Python Course: Time Series Data Analysis In Python [2/6]
A practical guide for time series data analysis in Python Pandas
This blog post is the second in a six-part series on hands-on time series analysis using Python. It serves as a practical guide to analyzing time series data within the Python Pandas environment.
The article delves into the fundamental concepts of correlation and autocorrelation, explaining how to measure the relationship between two time series and within a single time series across different time lags.
It then introduces various time series models, with a specific focus on Autoregressive (AR) models, which use past values to predict future ones. The discussion extends to Moving Average (MA) and combined ARMA models, providing a comprehensive overview of these foundational modeling techniques.
To solidify these concepts, the post includes a practical case study on climate change. This case study demonstrates the application of the discussed methods, including:
Converting data to a suitable time series format and initial plotting.
Testing for stationarity using the Augmented Dickey-Fuller test.
Transforming data to achieve stationarity through differencing.
Computing and interpreting Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) plots to identify model parameters.
Fitting AR, MA, and ARMA models to the climate data.
This series will consist of 6 articles:
Manipulating Time Series Data in Python Pandas [A Practical Guide]
Time Series Analysis in Python Pandas [A Practical Guide]
Visualizing Time Series Data in Python [A Practical Guide]
Time Series Forecasting with ARIMA Models In Python [Part 1]
Time Series Forecasting with ARIMA Models In Python [Part 2]
Machine Learning for Time Series Data [Regression]

Table of contents:
Correlation and Autocorrelation
Time Series Models
Autoregressive (AR) Models
Moving Average (MA) and ARMA Models
Case Study: Climate Change
1. Correlation and Autocorrelation
In this section, you’ll be introduced to the ideas of correlation and autocorrelation for time series. Correlation describes the relationship between two time series, and autocorrelation describes the relationship of a time series with its past values.
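As a rough illustration of the second idea, autocorrelation can be seen as the correlation between a series and a shifted copy of itself. The sketch below uses a synthetic random walk (invented data, not taken from this article) and compares pandas' built-in `autocorr` with a manual computation based on `shift` and `corr`.

```python
import numpy as np
import pandas as pd

# Synthetic daily series (illustrative only, not data from the article):
# a random walk, so each value stays close to the previous one.
rng = np.random.default_rng(seed=0)
index = pd.date_range("2020-01-01", periods=200, freq="D")
series = pd.Series(rng.normal(size=200).cumsum(), index=index)

# Lag-1 autocorrelation: correlation of the series with itself one step back
lag1 = series.autocorr(lag=1)

# The same quantity computed by hand: shift the series by one step, then correlate
lag1_manual = series.corr(series.shift(1))

print("Lag-1 autocorrelation:", lag1)
print("Manual computation:   ", lag1_manual)
```

Both lines should print the same number, since the shifted copy is simply the series' own past values.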
1.1. Correlation of Two Time Series
The correlation of two time series measures how closely they vary together. The correlation coefficient summarizes this relationship in a single number between -1 and 1. A correlation of one means that the two series have a perfect positive linear relationship with no deviations.
A high correlation means that the two series vary strongly together. A low correlation means that they still vary together, but the association is weak. A high negative correlation means that they vary in opposite directions, but still with a linear relationship.
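To make these cases concrete, here is a minimal sketch using small made-up series (the values and variable names are illustrative assumptions). It builds an exactly increasing line, an exactly decreasing line, and a line buried in noise, then computes each correlation with `Series.corr`.

```python
import numpy as np
import pandas as pd

# Made-up base series: 1, 2, ..., 10
x = pd.Series(np.arange(1.0, 11.0))

# Exact increasing linear function of x: correlation is 1 (up to rounding)
perfect_pos = x.corr(2 * x + 3)

# Exact decreasing linear function of x: correlation is -1 (up to rounding)
perfect_neg = x.corr(-0.5 * x + 1)

# Linear relationship hidden under large random noise: a weaker correlation
rng = np.random.default_rng(seed=1)
noise = pd.Series(rng.normal(size=len(x)))
noisy = x.corr(x + 5 * noise)

print("Perfect positive:", perfect_pos)
print("Perfect negative:", perfect_neg)
print("Noisy:           ", noisy)
```

The noisy case depends on the random draw, so its exact value is less important than the fact that it falls strictly between the two extremes.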
There is a common mistake when calculating the correlation between two trending time series. Consider two time series that are both trending. Even if the two series are totally unrelated, you could still get a very high correlation. That’s why, when you look at the correlation of, say, two stocks, you should look at the correlation of their returns, not their levels.
In the example below, the two series, stock prices and UFO sightings, both trend up over time. There is, of course, no real relationship between them, yet the correlation of their levels is 0.94. If you instead compute the correlation of their percent changes, it drops to approximately zero.
import pandas as pd
# Load the levels of both series (the file holds a 'DJI' and a 'UFO' column)
levels = pd.read_csv('DJI.csv', parse_dates=['Date'], index_col='Date')
# Compute correlation of levels
correlation1 = levels['DJI'].corr(levels['UFO'])
print("Correlation of levels: ", correlation1)
# Compute correlation of percent changes
changes = levels.pct_change()
correlation2 = changes['DJI'].corr(changes['UFO'])
print("Correlation of changes: ", correlation2)
The figure below shows that the two series are correlated when plotted with time. The reason for this, as mentioned, is that they are both trending series.
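If the DJI.csv file is not at hand, the same effect can be reproduced with synthetic data. The sketch below is an assumption-laden stand-in, not the article's dataset: it builds two unrelated series that each follow an upward linear trend plus independent noise, then compares the correlation of levels with the correlation of percent changes.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for two unrelated trending series (not the DJI/UFO data)
rng = np.random.default_rng(seed=42)
n = 500
index = pd.date_range("2000-01-03", periods=n, freq="B")
t = np.arange(n)

# Each series = upward linear trend + its own independent noise
series_a = pd.Series(100 + 0.5 * t + rng.normal(scale=5.0, size=n), index=index)
series_b = pd.Series(20 + 0.1 * t + rng.normal(scale=2.0, size=n), index=index)

# Correlation of levels: dominated by the shared upward trend
corr_levels = series_a.corr(series_b)

# Correlation of percent changes: the trend is mostly removed,
# leaving only the independent noise
corr_changes = series_a.pct_change().corr(series_b.pct_change())

print("Correlation of levels: ", corr_levels)
print("Correlation of changes:", corr_changes)
```

Under these assumptions the levels should be strongly correlated while the percent changes should be close to uncorrelated, mirroring the stock price and UFO sightings example.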
1.2. Simple Linear Regression
A simple linear regression for time series finds the slope, beta, and intercept, alpha, of a line that’s the best fit between a dependent variable, y, and an independent variable, x. The x’s and y’s can be two time series.
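One way to sketch this fit is with `np.polyfit`, which performs a least-squares fit and, for degree 1, returns the slope followed by the intercept. The data below is hypothetical: `y` is generated as `alpha + beta * x` plus noise with known values (`alpha = 0.5`, `beta = 2.0`) so that the estimates can be checked against them. Other tools, such as statsmodels, may be what the full article uses.

```python
import numpy as np
import pandas as pd

# Hypothetical data: y = 0.5 + 2.0 * x + small noise
rng = np.random.default_rng(seed=7)
index = pd.date_range("2021-01-01", periods=250, freq="D")
x = pd.Series(rng.normal(size=250), index=index)
y = 0.5 + 2.0 * x + pd.Series(rng.normal(scale=0.1, size=250), index=index)

# Least-squares line of best fit; with deg=1, polyfit returns [slope, intercept]
beta, alpha = np.polyfit(x.to_numpy(), y.to_numpy(), deg=1)

print("Estimated slope (beta):     ", beta)
print("Estimated intercept (alpha):", alpha)
```

The estimates should land close to the values used to generate the data, which is a quick sanity check that the slope and intercept are being read in the right order.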