Some of the methods of forecasting in business and economics are: (1) the exponential smoothing technique, (2) the single-equation regression technique, (3) the simultaneous-equation regression method, (4) autoregressive integrated moving average (ARIMA) models, and (5) the vector autoregression (VAR) method.

The lecture will demonstrate ARIMA, which is a purely univariate method of forecasting. The main philosophy here is: “Let the data speak for themselves.”

The lecture will cover both the underlying theory and its execution in R. This post mainly discusses the theoretical foundations; the practical aspects of ARIMA will be taken up in the next few posts.

ARIMA is an acronym for AutoRegressive Integrated Moving Average. So, it is necessary to know the underlying properties of the autoregressive (AR) process, the moving average (MA) process, and the order of integration.

Autoregressive (AR) Process

We start with a series \(Y_t\), which is typically non-stationary in nature (for example, the GDP of India, stock market indices, etc.).

Let \(Y_t\) be modelled as:

$$ Y_t =\delta+\alpha_1Y_{t-1}+u_t~~~~~~~~\dots(1)~~~~~~~~\implies AR(1) ~~Process $$

where \(\delta\) is any constant and \(u_t\) is white noise.

The value of \(Y\) at time \(t\) depends on its value in the previous time period and a random term. In other words, this model says that the forecast value of \(Y\) at time \(t\) is simply some proportion \((=\alpha_1)\) of its value at time \((t-1)\) plus a random shock or disturbance at time \(t\).

For the stationarity of the series, it is required that \(|\alpha_1|<1\).

Similarly, we can write the following model:

$$ Y_t =\delta+\alpha_1Y_{t-1}+\alpha_2Y_{t-2}+u_t~~~~~~~~\dots(2) ~~~~~~~~\implies AR(2) ~~Process~$$

That is, the value of \(Y\) at time \(t\) depends on its value in the previous two time periods.

Following this fashion, we can write:

$$ Y_t=\delta+\alpha_1Y_{t-1}+\alpha_2Y_{t-2}+\dots+\alpha_pY_{t-p}+u_t~~~~~~~~\dots(3) ~~~~~~~~\implies AR(p) ~~Process~$$

Properties of AR(1) Process:

The mean is given as:

$$\begin {aligned} E(Y_t)&=E( \delta+\alpha_1 Y_{t-1}+u_t ) \\ E( Y_t )&= E( \delta) +E( \alpha_1 Y_{t-1} )+E( u_t )\\E( Y_t )&=\delta+\alpha_1 E( Y_{t-1} )+0 \end {aligned} $$

Assuming that the series is stationary, \(E(Y_t)=E(Y_{t-1})=\mu ~~\text {(common mean)},\)

$$ \begin {aligned} \mu&=\delta+\alpha_1 \mu\\\implies \mu&=\frac {\delta}{1-\alpha_1} \end {aligned} $$

The variance is calculated as follows:

Since the white noise term \(u_t\) is independent of \(Y_{t-1}\):

$$\begin {aligned} Var(Y_t) &= Var (\delta)+Var(\alpha_1 Y_{t-1})+Var(u_t)\\ Var(Y_t)&=\alpha_1^2 Var(Y_{t-1})+\sigma^2_u \end {aligned} $$

By the stationarity assumption, \(Var(Y_t)=Var(Y_{t-1})\); substituting this and solving, you will get:

$$ Var(Y_t)=\frac {\sigma_u^2}{1-\alpha_1^2} $$

Since \(Var(Y_t)>0\), this requires \((1-\alpha_1^2)>0\), which again implies \(|\alpha_1|<1\).
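To see these results in action, here is a minimal R sketch (the values of \(\delta\), \(\alpha_1\) and the sample size are illustrative choices, not from the lecture) that simulates an AR(1) process and compares the sample mean and variance with their theoretical counterparts:

```r
# Simulate an AR(1) process: Y_t = delta + alpha1 * Y_{t-1} + u_t
set.seed(123)
n      <- 10000
delta  <- 2
alpha1 <- 0.5     # |alpha1| < 1, so the process is stationary
sigma  <- 1       # sd of the white noise u_t

y    <- numeric(n)
y[1] <- delta / (1 - alpha1)          # start at the theoretical mean
for (t in 2:n) {
  y[t] <- delta + alpha1 * y[t - 1] + rnorm(1, sd = sigma)
}

mean(y); delta / (1 - alpha1)         # sample vs theoretical mean (= 4)
var(y);  sigma^2 / (1 - alpha1^2)     # sample vs theoretical variance (~ 1.33)
```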

ACF, PACF and Correlogram:

The ACF at lag \(k\) is defined as:

$$ \rho_k=\frac {\text {Covariance at lag } k}{\text {Variance}}=\frac {\gamma_k}{\gamma_0} $$

Since covariance and variance are measured in the same units, \(\rho_k\) is a unit-free, or pure, number. It lies between −1 and +1, as any correlation coefficient does.
If we plot \(\rho_k\) against \(k\), the graph we obtain is known as the correlogram. It helps us assess the stationarity of the time series data.

The PACF at lag \(k\) is the partial autocorrelation, i.e., the correlation between \(Y_t\) and \(Y_{t-k}\) after removing the effect of the intermediate lags; plotting the partial autocorrelations against the lag gives the partial correlogram.
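In R, the correlogram and partial correlogram are produced by the built-in acf() and pacf() functions. A minimal sketch, using the simulated AR(1) series y from above (any ts object would do):

```r
# Correlogram: for an AR(1), the ACF decays geometrically,
# while the PACF cuts off after lag 1.
acf(y,  lag.max = 20, main = "Correlogram (ACF)")
pacf(y, lag.max = 20, main = "Partial correlogram (PACF)")
```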

Moving Average (MA) Process

Let us write \(Y_t\) as follows:

$$ Y_t= \mu+\beta_0u_t+\beta_1 u_{t-1} ~~~~~~~~\dots (4) ~~~~~~~~\implies MA(1)~~ Process $$

where \(\mu\) is a constant and \(u_t\) is a white noise term.

That is, \(Y_t\) at time \(t\) is equal to a constant \((\mu)\) plus a moving average \((\beta_0u_t+\beta_1 u_{t-1})\) of the current and past error terms.

So, the \(MA(2)\) process can be written as:

$$ Y_t= \mu+\beta_0u_t+\beta_1 u_{t-1}+\beta_2 u_{t-2} ~~~~~~~~\dots (5) ~~~~~~~~\implies MA(2)~~ Process $$

The \(MA(q)\) process can be written as:

$$ Y_t= \mu+\beta_0u_t+\beta_1 u_{t-1}+\dots+\beta_q u_{t-q} ~~~~~~~~\dots (6) ~~~~~~~~\implies MA(q)~~ Process $$

Theoretical Properties of a Time Series with an MA(1) Model

$$\begin {aligned} \text{Mean} &=E(Y_t)=\mu\\ \text{Variance}&=Var(Y_t)=\sigma_u^2 (1+\beta_1^2)\\&\text {The autocorrelation function (ACF):}\\&\rho_1=\frac {\beta_1}{1+\beta_1^2}~~~\text{and}~~~\rho_k=0~~ \text {for all}~~k\ge2 \end {aligned} $$

(These formulas take \(\beta_0=1\), the usual normalisation.)

Note that the only nonzero value in the theoretical ACF is for lag 1. All other autocorrelations are 0. Thus a sample ACF with a significant autocorrelation only at lag 1 is an indicator of a possible \(MA(1)\) model.

Theoretical Properties of a Time Series with an MA(2) Model

$$\begin {aligned} \text{Mean} &=E(Y_t)=\mu\\ \text{Variance}&=Var(Y_t)=\sigma_u^2 (1+\beta_1^2+\beta_2^2)\\&\text {The autocorrelation function (ACF):}\\\rho_1&=\frac {\beta_1(1+\beta_2)}{1+\beta_1^2+\beta_2^2}~~~\text{and}~~~\rho_2=\frac {\beta_2}{1+\beta_1^2+\beta_2^2},~~\rho_k=0~~ \text {for all}~~k\ge3 \end {aligned} $$

Note that the only nonzero values in the theoretical ACF are for lags 1 and 2. Autocorrelations for higher lags are 0. So, a sample ACF with significant autocorrelations at lags 1 and 2, but non-significant autocorrelations for higher lags indicates a possible \(MA(2)\) model.
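A quick way to see this cutoff pattern is to simulate an MA(2) process with R's arima.sim() and plot its sample ACF (the coefficients below are illustrative):

```r
# Simulate an MA(2): only the lag-1 and lag-2 autocorrelations
# should be significantly different from zero.
set.seed(123)
ma2 <- arima.sim(model = list(ma = c(0.7, 0.4)), n = 5000)
acf(ma2, lag.max = 10, main = "ACF of a simulated MA(2) process")
```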

Autoregressive and Moving Average (ARMA) Process

It is quite likely that \(Y\) has characteristics of both AR and MA processes; such a process is called ARMA. Thus, \(Y_t\) follows an \(ARMA(1, 1)\) process if it can be written as:

$$ Y_t = \theta + \alpha_1 Y_{t-1}+\beta_0 u_t+\beta_1 u_{t-1}~~~~~~~~~~\dots (7)~~~~~~\implies ARMA(1,1) $$

So, the \(ARMA(p,q)\) can be written as:

$$ Y_t = \theta + \alpha_1 Y_{t-1}+\dots+\alpha_p Y_{t-p}+\beta_0 u_t+\beta_1 u_{t-1}+\dots+\beta_q u_{t-q}~~~~~~~~~~\dots (8)~~~~~~\implies ARMA(p,q) $$
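As a sketch of how such a model is estimated in practice, the code below simulates an \(ARMA(1,1)\) series and fits it with base R's arima() function (setting the differencing order to zero gives a pure ARMA model); the parameter values are illustrative:

```r
# Simulate and estimate an ARMA(1,1) process.
set.seed(123)
x   <- arima.sim(model = list(ar = 0.6, ma = 0.3), n = 2000)
fit <- arima(x, order = c(1, 0, 1))   # order = c(p, d, q)
fit   # estimated coefficients should be close to 0.6 and 0.3
```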

Autoregressive Integrated Moving Average (ARIMA) Process

The time series models discussed so far are based on the assumption that the time series variable is stationary (at least in the weak sense).

In practice, however, most time series variables are non-stationary; they are integrated series.

This implies that you need to take either the first or second difference of a non-stationary time series to convert it into a stationary one.

As such, they may be \(I(1)\) or \(I(2)\), and so on.

Therefore, if you have to difference a time series \(d\) times to make it stationary and then apply the \(ARMA( p, q)\) model to it, it is said that

– the original time series is \(ARIMA( p, d, q)\), that is, it is an ARIMA series
where \(p\) denotes the number of autoregressive terms,
\(d\) the number of times the series has to be differenced before it becomes stationary, and
\(q\) the number of moving average terms.
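To make the idea of integration concrete, here is a small R sketch: a random walk is \(I(1)\), and one round of differencing with diff() turns it into stationary white noise:

```r
# A random walk is non-stationary; its first difference is stationary.
set.seed(123)
rw  <- cumsum(rnorm(500))   # random walk: I(1)
drw <- diff(rw)             # first difference: white noise

acf(rw,  main = "ACF of the random walk (decays very slowly)")
acf(drw, main = "ACF after first differencing (no pattern)")
```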

Box-Jenkins (BJ) Methodology

First of all, you have to identify whether the particular series is stationary and, if not, its order of integration, that is, how many times it must be differenced to become stationary. In other words, you have to identify the ARIMA structure of the series.

The BJ methodology answers this question.

The steps in the BJ methodology are as follows:

Step 1: Examine the Data

As a starting point, it is always advisable to examine the data visually before going into detailed mathematical modelling. Typically this means plotting the series and looking for trend, seasonality, outliers, and structural breaks, as in the sketch below.
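A minimal sketch, using the built-in AirPassengers dataset purely for illustration:

```r
# A first visual look at the data.
data(AirPassengers)
plot(AirPassengers,
     main = "Monthly airline passengers, 1949-1960",
     ylab = "Passengers (thousands)")
# The upward trend and the widening seasonal swings suggest the
# series is non-stationary and needs transformation/differencing.
```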

Step 2: Decompose Your Data

ARIMA models can be fitted to both seasonal and non-seasonal data. Seasonal ARIMA, however, requires a more complicated specification of the model structure, although the process of determining \((p, d, q)\) is similar to that of choosing the non-seasonal order parameters.
Therefore, decomposing the data first often gives additional insight. In this step, we are essentially asking whether the series contains a trend, a seasonal pattern, and a cyclical or irregular component, as illustrated below.
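In R, a classical decomposition is a one-liner; the sketch below uses decompose() on the same illustrative AirPassengers series (stl() would work equally well):

```r
# Split the series into trend, seasonal and irregular components.
dec <- decompose(AirPassengers)
plot(dec)
```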

Step 3: Identification

The identification step involves finding the appropriate values of \(p\), \(d\), and \(q\), chiefly with the help of the correlogram (ACF) and the partial correlogram (PACF) of the suitably differenced series.
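A minimal identification sketch in R, continuing with the illustrative AirPassengers series (the log is taken only to stabilise the variance):

```r
# Identify d by differencing, then read tentative q and p off
# the ACF and PACF of the differenced series.
d1 <- diff(log(AirPassengers))   # first difference of the logged series
acf(d1,  main = "ACF of the differenced series")   # suggests q
pacf(d1, main = "PACF of the differenced series")  # suggests p
```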

Step 4: Estimation of the ARIMA Model
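Estimation is handled by base R's arima() function. A sketch, fitting the classic "airline" specification to the logged series (the chosen orders are illustrative, not the only defensible ones):

```r
# Fit a seasonal ARIMA(0,1,1)(0,1,1)[12] to the logged series.
fit <- arima(log(AirPassengers), order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12))
fit   # coefficient estimates, standard errors and AIC
```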

Step 5: Diagnostic Checking
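If the model is adequate, its residuals should behave like white noise. A sketch of the two standard checks, the residual correlogram and the Ljung-Box test (Box.test() in base R):

```r
# Diagnostic checking on the fitted model's residuals.
res <- residuals(fit)
acf(res, main = "ACF of residuals")    # no significant spikes expected
Box.test(res, lag = 24, type = "Ljung-Box", fitdf = 2)
# fitdf = 2 because two MA coefficients were estimated; a large
# p-value means we cannot reject "residuals are white noise".
```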

Step 6: Forecasting the Future Values
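Finally, a forecasting sketch with predict(); since the model was fitted to the logged series, the forecasts are exponentiated back to the original scale:

```r
# Forecast the next 24 months and plot against the observed data.
fc <- predict(fit, n.ahead = 24)
ts.plot(AirPassengers, exp(fc$pred), lty = c(1, 2),
        main = "Observed series and ARIMA forecast (dashed)")
```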