We have an ACF plot.

In simple terms, it describes how well the present value of the series is related to its past values.

A time series can have several components: trend, seasonality, cyclic and residual.

ACF considers all these components while finding correlations, which is why it is a ‘complete auto-correlation plot’.
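To make this concrete, here is a minimal sketch of the sample autocorrelation computed by hand in plain Python. The helper name `acf` and the toy series (a linear trend plus a period-4 seasonal wiggle) are assumptions made up for illustration; a plotting library would normally do this for you.

```python
import math

def acf(series, max_lag):
    """Sample autocorrelation of `series` for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]      # deviations from the mean
    var = sum(d * d for d in dev)         # lag-0 sum of squares
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / var
        for k in range(max_lag + 1)
    ]

# Hypothetical toy series: upward trend plus a period-4 seasonal component.
series = [0.5 * t + 3 * math.sin(2 * math.pi * t / 4) for t in range(80)]
rho = acf(series, max_lag=8)
print([round(r, 2) for r in rho])
```

Because the trend is part of the series, the correlations stay high across all lags: the ACF does not separate out what earlier lags have already explained.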

PACF is a partial auto-correlation function.

Basically, instead of finding the correlation of the present value with each lag as ACF does, it finds the correlation between the next lag and the residuals, i.e. what remains after removing the effects already explained by the earlier lag(s). Hence it is ‘partial’ and not ‘complete’: we remove the variations already found before we compute the next correlation.

So if there is any hidden information in the residual which can be modeled by the next lag, we might get a good correlation and we will keep that next lag as a feature while modeling.

Remember that while modeling we don’t want to keep too many correlated features, as that can create multicollinearity issues.

Hence we need to retain only the relevant features.
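One common way to compute partial autocorrelations from ordinary autocorrelations is the Durbin-Levinson recursion. The sketch below is only an illustration: the function name `pacf_from_acf` is made up, and the input is the theoretical ACF of an assumed AR(1) process with weight 0.6 (so the autocorrelation at lag k is 0.6 to the power k).

```python
def pacf_from_acf(rho):
    """Durbin-Levinson recursion.

    rho[k] is the autocorrelation at lag k (rho[0] == 1).
    Returns the partial autocorrelations for lags 1..len(rho)-1.
    """
    pacf = []
    prev = []  # prev[j] holds the lag-(j+1) coefficient of the previous fit
    for k in range(1, len(rho)):
        # Correlation at lag k that the first k-1 lags have NOT explained
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the coefficients of the shorter lags, then append the new one
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf

# Assumed example: theoretical ACF of an AR(1) process with weight 0.6
rho = [0.6 ** k for k in range(6)]
pacf = pacf_from_acf(rho)
print("ACF :", [round(r, 3) for r in rho[1:]])
print("PACF:", [round(abs(v), 3) for v in pacf])
```

The ACF decays slowly, while the PACF should come out close to 0.6 at lag 1 and close to zero afterwards: once lag 1 is accounted for, the older lags carry no extra information.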

Now let’s see what AR and MA time series processes are.

Auto regressive (AR) process

A time series is said to be AR when its present value can be obtained using previous values of the same time series, i.e. the present value is a weighted average of its past values.

Stock prices and global temperature rise can be thought of as AR processes.

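The AR process of order p can be written as,

y’t = c + ϕ₁ y’t-₁ + ϕ₂ y’t-₂ + … + ϕp y’t-p + ϵt

where ϵt is white noise, y’t-₁ and y’t-₂ are the lags, c is a constant and ϕ₁ … ϕp are the weights on the lags.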

Order p is the lag after which the PACF plot falls inside the confidence interval for the first time, i.e. the last lag whose partial autocorrelation is still significant.

These p lags will act as our features while forecasting the AR time series.

We cannot use the ACF plot here because it will show good correlations even for the lags which are far in the past.

If we consider that many features, we will have multicollinearity issues.

This is not a problem with the PACF plot as it removes components already explained by earlier lags, so we only get the lags which are correlated with the residual, i.e. the component not explained by earlier lags.

In the below code, I have defined a simple AR process and found its order using the PACF plot.

We should expect the ACF plot of our AR process to decrease gradually, since in an AR process the present value has good correlation with the past lags.

We expect the PACF to fall sharply after the nearest lags, as these lags close to the present capture the variation so well that we don’t need older lags to predict the present.
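A minimal sketch of such an experiment in plain Python, without plotting. Everything specific here is an assumption chosen for illustration: the AR(2) weights 0.6 and 0.3, the sample size, the seed, the helper name `acf_pacf`, and the ±1.96/√n band, which is the usual large-sample approximation of the 95% confidence interval.

```python
import math
import random

def acf_pacf(series, max_lag):
    """Sample ACF (lags 0..max_lag) and PACF (lags 1..max_lag, Durbin-Levinson)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(d * d for d in dev)
    rho = [sum(dev[t] * dev[t - k] for t in range(k, n)) / var
           for k in range(max_lag + 1)]
    pacf, prev = [], []
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return rho, pacf

# Assumed AR(2) process: y_t = 0.6*y_{t-1} + 0.3*y_{t-2} + eps_t
random.seed(42)
phi1, phi2, n, burn_in = 0.6, 0.3, 3000, 200
y = [0.0, 0.0]
for _ in range(n + burn_in):
    y.append(phi1 * y[-1] + phi2 * y[-2] + random.gauss(0.0, 1.0))
y = y[-n:]  # drop the warm-up values

rho, pacf = acf_pacf(y, max_lag=10)
bound = 1.96 / math.sqrt(n)  # approximate 95% confidence band

# Order p: the last lag before the PACF first falls inside the band
p = 0
for lag, value in enumerate(pacf, start=1):
    if abs(value) <= bound:
        break
    p = lag

print("ACF :", [round(r, 2) for r in rho[1:]])
print("PACF:", [round(v, 2) for v in pacf])
print("estimated order p =", p)
```

With these assumed weights the ACF values should shrink slowly from lag to lag, while the PACF should be clearly outside the band for the first two lags and small afterwards. Since the band is only a 95% interval, an occasional later lag can still poke outside it by chance.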

Now let’s discuss the second type of process.

Moving average (MA) process

This is a process where the present value of the series is defined as a linear combination of past errors.

We assume the errors to be independent and normally distributed.

The MA process of order q is defined as,

y’t = c + ϵt + θ₁ ϵt-₁ + θ₂ ϵt-₂ + … + θq ϵt-q

Here ϵt is white noise and θ₁ … θq are the weights on the past errors.

To get an intuition for the MA process, let’s consider an MA process of order 1, which will look like,

y’t = c + ϵt + θ₁ ϵt-₁

Let’s consider y’t as the crude oil price and ϵt as the change in the oil price due to a hurricane.

Assume that c=10 (mean value of crude oil price when there is no hurricane) and θ₁=0.5.

Suppose there is a hurricane today and there was none yesterday, so y’t will be 15, assuming the change in the oil price due to the hurricane is ϵt=5.

Tomorrow there is no hurricane, so y’t will be 12.5 as ϵt=0 and ϵt-₁=5.

Suppose there is no hurricane day after tomorrow.

In that case the oil price would be 10, which means it has stabilized back to the mean after being disturbed by the hurricane.

So the effect of the hurricane only lasts for one lagged value in our case.

Hurricane in this case is an independent phenomenon.
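The arithmetic of the hurricane example can be checked in a few lines. This is just the order 1 MA formula with the values used above (c=10, θ₁=0.5, a shock of 5 today and no shocks on the other days); the variable names are made up for the sketch.

```python
c, theta1 = 10.0, 0.5

# Shocks for: yesterday, today (hurricane), tomorrow, day after tomorrow
shocks = [0.0, 5.0, 0.0, 0.0]

# y_t = c + eps_t + theta1 * eps_{t-1}, starting from today
prices = [c + shocks[t] + theta1 * shocks[t - 1] for t in range(1, len(shocks))]
print(prices)  # [15.0, 12.5, 10.0]
```

The shock shows up at full strength today, at half strength tomorrow, and is gone the day after.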

Order q of the MA process is obtained from the ACF plot: it is the lag after which the ACF falls inside the confidence interval for the first time, i.e. the last lag whose autocorrelation is still significant.

Since PACF captures correlations between the residuals and the time series lags, we might get good correlations for the nearest lags as well as for older lags.

Why would that be?

Our series is a linear combination of the residuals, and none of the time series’ own lags can directly explain its present value (since it’s not an AR process). Explaining the present through the series’ own lags is the essence of the PACF plot, as it subtracts variations already explained by earlier lags, so it’s kind of PACF losing its power here!

On the other hand, being an MA process, it doesn’t have seasonal or trend components, so the ACF plot will capture the correlations due to the residual components only.

You can also think of it as ACF, which is a complete plot (capturing trend, seasonality, cyclic and residual correlations), acting as a partial plot since we don’t have trends, seasons, etc.

In the below code, I have defined a simple MA process and found its order using the ACF plot.

We can expect the ACF plot to show good correlation with the nearest lags and then a sharp fall, as it’s not an AR process and so does not have good correlation with older lags.

Also, we would expect the PACF plot to decrease gradually since, being an MA process, the nearest lag values of the time series cannot really predict its present value, unlike in an AR process.

So we will get good correlations of residuals with further lags as well, hence the gradual decrease.
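A minimal sketch of the MA experiment in plain Python, again without plotting. The MA(2) weights 0.7 and 0.4, the mean c=10, the sample size, the seed and the helper name `acf` are all assumptions for illustration, and the ±1.96/√n band is the usual large-sample approximation of the 95% confidence interval.

```python
import math
import random

def acf(series, max_lag):
    """Sample autocorrelation of `series` for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    var = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / var
            for k in range(max_lag + 1)]

# Assumed MA(2) process: y_t = c + eps_t + 0.7*eps_{t-1} + 0.4*eps_{t-2}
random.seed(7)
c, theta1, theta2, n = 10.0, 0.7, 0.4, 3000
eps = [random.gauss(0.0, 1.0) for _ in range(n + 2)]
y = [c + eps[t] + theta1 * eps[t - 1] + theta2 * eps[t - 2]
     for t in range(2, n + 2)]

rho = acf(y, max_lag=10)
bound = 1.96 / math.sqrt(n)  # approximate 95% confidence band

# Order q: the last lag before the ACF first falls inside the band
q = 0
for lag in range(1, len(rho)):
    if abs(rho[lag]) <= bound:
        break
    q = lag

print("ACF :", [round(r, 2) for r in rho[1:]])
print("estimated order q =", q)
```

With these assumed weights the ACF should be clearly outside the band at lags 1 and 2 and small from lag 3 onwards, which is the sharp fall described above.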

To summarize,

While building a machine learning model we should avoid multicollinear features.

The same applies to time series models as well.

We find the optimum features, or order, of the AR process using the PACF plot, as it removes variations explained by earlier lags so we get only the relevant features.

We find the optimum features, or order, of the MA process using the ACF plot: being an MA process, it doesn’t have seasonal and trend components, so we get only the residual relationship with the lags of the time series in the ACF plot, with ACF acting as a partial plot.

I hope you liked the article; I have tried to keep it as simple as possible.

As this is my first article on Medium, I would really appreciate feedback and suggestions to improve my future posts.

Thanks for reading and happy learning! See ya!