Time Series Models Overview (1) - ARIMA | Bangda Sun

ARIMA Family.

1. Introduction

Recently I have been working on an anomaly detection problem. Our team uses an existing framework called NBM (Normal Behavior Modeling): intuitively, we fit a model on normal data, so that when abnormal data arrive the model should tell us it is seeing something different. Many models could work here: One-class SVM, AutoEncoder, clustering, etc. At the very beginning we could not get acceptable results using a basic AutoEncoder and a Denoising AutoEncoder in NBM, so we decided to look for alternative models, and my direction is classical statistical time series models. I think this is reasonable for the following reasons:

Some time series models assume the data to be stationary, which could be one of the assumptions for normal data;
Most time series models are simple (linear), which helps us understand the relationships among features and makes the models easy to interpret.

2. ARIMA Family

AR - Autoregressive Model

An autoregressive model is simply a regression model combined with a sequence model, where the output depends linearly on its own previous values. We denote an autoregressive model of order \(p\) as \(AR(p)\),

\[
Y_{t} = \psi_{0} + \sum^{p}_{i=1}\psi_{i}Y_{t-i} + \epsilon_{t},
\]

where \(\psi_{i}\) are the parameters of the model and \(\epsilon_{t}\) is noise. If we introduce the backshift operator \(B\) (also called the lag operator), defined by \(B^{i}Y_{t} = Y_{t-i}\), the model is equivalent to

\[
Y_{t} = \psi_{0} + \sum^{p}_{i=1}\psi_{i}B^{i}Y_{t} + \epsilon_{t}.
\]

Moving the sum to the left-hand side, we get

\[
\left(1 - \sum^{p}_{i=1}\psi_{i}B^{i}\right)Y_{t} = \psi_{0} + \epsilon_{t}.
\]
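The operator form above can be checked numerically. The following is a minimal sketch in plain Python, assuming Gaussian noise and pre-sample values \(Y_{t} = 0\) for \(t < 0\); the function names `simulate_ar` and `apply_ar_operator` are my own and not from any library.

```python
import random

def simulate_ar(psi0, psi, eps):
    """Generate Y_t = psi0 + sum_i psi[i-1] * Y_{t-i} + eps[t], with Y_t = 0 for t < 0."""
    y = []
    for t, e in enumerate(eps):
        ar_part = sum(c * y[t - i] for i, c in enumerate(psi, start=1) if t - i >= 0)
        y.append(psi0 + ar_part + e)
    return y

def apply_ar_operator(psi, y):
    """Apply (1 - sum_i psi[i-1] * B^i) to y, treating pre-sample values as 0."""
    out = []
    for t in range(len(y)):
        lagged = sum(c * y[t - i] for i, c in enumerate(psi, start=1) if t - i >= 0)
        out.append(y[t] - lagged)
    return out

rng = random.Random(0)                      # fixed seed, for reproducibility only
eps = [rng.gauss(0.0, 1.0) for _ in range(200)]
psi0, psi = 1.0, [0.5, 0.2]                 # illustrative AR(2) parameters
y = simulate_ar(psi0, psi, eps)

# The left-hand side of the operator form should reproduce psi0 + eps_t.
lhs = apply_ar_operator(psi, y)
max_gap = max(abs(l - (psi0 + e)) for l, e in zip(lhs, eps))
print(max_gap)  # tiny, floating-point rounding only
```

Applying the operator to the simulated series recovers the constant plus the noise, which is exactly what the last equation states.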

MA - Moving-average Model

For a moving-average model, the output depends linearly on the current and previous values of an imperfectly predictable stochastic term. A moving-average model of order \(q\) is denoted as \(MA(q)\),

\[
Y_{t} = \theta_{0} + \sum^{q}_{i=1}\theta_{i}\epsilon_{t-i} + \epsilon_{t},
\]

by applying the backshift operator \(B\) we have

\[
Y_{t} = \theta_{0} + \sum^{q}_{i=1}\theta_{i}B^{i}\epsilon_{t} + \epsilon_{t} = \theta_{0} + \left(1 + \sum^{q}_{i=1}\theta_{i}B^{i}\right)\epsilon_{t}.
\]
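One way to see what the \(MA(q)\) equation does is to feed it a single unit shock and look at the output. This is a small sketch in plain Python, assuming \(\epsilon_{t} = 0\) for \(t < 0\); `simulate_ma` is a name I made up for illustration.

```python
def simulate_ma(theta0, theta, eps):
    """Generate Y_t = theta0 + eps[t] + sum_i theta[i-1] * eps[t-i], with eps_t = 0 for t < 0."""
    y = []
    for t in range(len(eps)):
        lagged = sum(c * eps[t - i] for i, c in enumerate(theta, start=1) if t - i >= 0)
        y.append(theta0 + eps[t] + lagged)
    return y

# A single unit shock at t = 2 makes the MA(2) weights visible.
eps = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
y = simulate_ma(theta0=0.0, theta=[0.6, 0.3], eps=eps)
print(y)  # [0.0, 0.0, 1.0, 0.6, 0.3, 0.0]
```

The shock shows up with weight 1, then \(\theta_{1}\), then \(\theta_{2}\), and disappears after \(q\) steps, so an \(MA(q)\) model has a finite memory of length \(q\).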

ARMA - Autoregressive Moving-average Model

An ARMA model is denoted as \(ARMA(p, q)\) and it combines the \(AR(p)\) and \(MA(q)\) models:

\[
Y_{t} = C + \sum^{p}_{i=1}\psi_{i}Y_{t-i} + \sum^{q}_{i=1}\theta_{i}\epsilon_{t-i} + \epsilon_{t}.
\]
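As a worked step, moving the autoregressive sum to the left-hand side and applying the backshift operator \(B\) to both sums gives a compact operator form of \(ARMA(p, q)\), using the same symbols as above:

\[
\left(1 - \sum^{p}_{i=1}\psi_{i}B^{i}\right)Y_{t} = C + \left(1 + \sum^{q}_{i=1}\theta_{i}B^{i}\right)\epsilon_{t}.
\]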

ARIMA - Autoregressive Integrated Moving-average Model

The ARIMA model is based on the ARMA model; the difference is that it models the differenced data, i.e. the difference between the current value and the previous value. For example, if the data have a linear trend, taking a first difference filters out the linear trend, so that the result satisfies the stationarity assumption of the ARMA model. The model is denoted as \(ARIMA(p, d, q)\), where \(d\) denotes the number of differencing operations.

\[
\nabla^{d} Y_{t} = C + \sum^{p}_{i=1}\psi_{i}\nabla^{d} Y_{t-i} + \sum^{q}_{i=1}\theta_{i}\epsilon_{t-i} + \epsilon_{t},
\]

here \(\nabla^{d}\) is the difference operator; for example, the 1st order difference is

\[
\nabla Y_{t} = Y_{t} - Y_{t-1} = (1-B)Y_{t},
\]

the 2nd order difference is therefore

\[
\begin{aligned}
\nabla^{2}Y_{t} &= \nabla(\nabla Y_{t}) = \nabla(Y_{t} - Y_{t-1}) = \nabla Y_{t} - \nabla Y_{t-1} \\
&= (1-B)\nabla Y_{t} = (1-B)^{2}Y_{t}.
\end{aligned}
\]

In general, the \(d\)-th order difference is

\[
\nabla^{d}Y_{t} = (1-B)\nabla^{d-1}Y_{t} = (1-B)(1-B)^{d-1}Y_{t} = (1-B)^{d}Y_{t}.
\]
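The identity \(\nabla^{d} = (1-B)^{d}\) can be verified on small examples. The sketch below, in plain Python, compares repeated first differences with the binomial expansion \((1-B)^{d} = \sum_{k=0}^{d}(-1)^{k}\binom{d}{k}B^{k}\); the function names are my own.

```python
from math import comb

def difference(y, d=1):
    """Apply the first difference d times; each pass shortens the series by one."""
    for _ in range(d):
        y = [b - a for a, b in zip(y, y[1:])]
    return y

def difference_binomial(y, d):
    """Expand (1 - B)^d = sum_k (-1)^k C(d, k) B^k and apply it directly."""
    return [
        sum((-1) ** k * comb(d, k) * y[t - k] for k in range(d + 1))
        for t in range(d, len(y))
    ]

trend = [3 + 2 * t for t in range(8)]   # linear trend
quad = [t * t for t in range(8)]        # quadratic trend

print(difference(trend, 1))             # [2, 2, 2, 2, 2, 2, 2]
print(difference(quad, 2))              # [2, 2, 2, 2, 2, 2]
print(difference_binomial(quad, 2) == difference(quad, 2))  # True
```

One difference removes a linear trend and two differences remove a quadratic trend, which is why \(d\) is usually small in practice.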

ARIMAX - Autoregressive Integrated Moving-average Model with eXogenous Features

As the model becomes more complex, we denote it as \(ARIMAX(p, d, q, b)\), where \(b\) is the number of exogenous features, given by

\[
\nabla^{d} Y_{t} = C + \sum^{b}_{i=1}\beta_{i}X_{i, t} + \sum^{p}_{i=1}\psi_{i}\nabla^{d} Y_{t-i} + \sum^{q}_{i=1}\theta_{i}\epsilon_{t-i} + \epsilon_{t},
\]

here \(X_{i,t}\) is the \(i\)-th external time series. The basic version uses only the current value; we can also add previous values of the exogenous features.
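To make the remark about previous values concrete, one possible extension is a distributed-lag term. This is only a sketch: the maximum lag \(r\) and the doubly indexed coefficients \(\beta_{i,j}\) are notation I am assuming here, not part of the model above.

\[
\nabla^{d} Y_{t} = C + \sum^{b}_{i=1}\sum^{r}_{j=0}\beta_{i,j}X_{i, t-j} + \sum^{p}_{i=1}\psi_{i}\nabla^{d} Y_{t-i} + \sum^{q}_{i=1}\theta_{i}\epsilon_{t-i} + \epsilon_{t}.
\]

Setting \(r = 0\) recovers the basic version with only the current values.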

SARIMAX - Seasonal Autoregressive Integrated Moving-average Model with eXogenous Features

The SARIMAX model is denoted as \(SARIMAX(p, d, q, b)\times(P, D, Q, s)\), where \(s\) is the periodicity of the time series and \(P\), \(D\), \(Q\) are the seasonal counterparts of \(p\), \(d\), \(q\). Here the seasonal correction is applied by adding previous values with lags equal to multiples of the periodicity (autoregressive) or previous noise terms with lags equal to multiples of the periodicity (moving-average). The seasonal AR and MA components in SARIMAX are

\[
\begin{aligned}
\text{Seasonal }AR(P):&~Y_{t} = \phi_{0} + \sum^{P}_{i=1}\phi_{i} Y_{t-is} + \epsilon_{t}, \\
\text{Seasonal }MA(Q):&~Y_{t} = \lambda_{0} + \sum^{Q}_{i=1}\lambda_{i} \epsilon_{t-is} + \epsilon_{t}.
\end{aligned}
\]

For example, if we have monthly data with yearly periodicity, we should set \(s=12\); \(SARIMAX(1, 0, 1, 0)\times (1, 0, 1, 12)\) is then

\[
Y_{t} = C + \psi_{1}Y_{t-1} + \psi_{12}Y_{t-12} + \theta_{1}\epsilon_{t-1} + \theta_{12}\epsilon_{t-12} + \epsilon_{t}.
\]
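The effect of the seasonal lags in this example can be seen by pushing a single shock through the recursion. The sketch below is plain Python and follows the additive form written above, assuming zero pre-sample values; the function and parameter names are my own. Note that libraries such as statsmodels use a multiplicative seasonal form, which adds cross terms (for instance at lag \(s+1\)), so their fitted coefficients are not directly comparable with this sketch.

```python
def simulate_seasonal_arma(c, psi1, psi_s, theta1, theta_s, eps, s=12):
    """Additive recursion with lags 1 and s on both Y and eps; pre-sample values are 0."""
    y = []
    for t in range(len(eps)):
        y_lag1 = y[t - 1] if t >= 1 else 0.0
        y_lag_s = y[t - s] if t >= s else 0.0
        e_lag1 = eps[t - 1] if t >= 1 else 0.0
        e_lag_s = eps[t - s] if t >= s else 0.0
        y.append(c + psi1 * y_lag1 + psi_s * y_lag_s
                 + theta1 * e_lag1 + theta_s * e_lag_s + eps[t])
    return y

# Three years of monthly data with one unit shock in the first month.
eps = [1.0] + [0.0] * 35
# Non-seasonal coefficients are set to 0 so only the seasonal echo remains.
y = simulate_seasonal_arma(c=0.0, psi1=0.0, psi_s=0.5, theta1=0.0, theta_s=0.2, eps=eps)

# Keep only the months where the shock is still visible.
echoes = {t: v for t, v in enumerate(y) if v != 0.0}
print(sorted(echoes))  # [0, 12, 24]
```

With the non-seasonal coefficients switched off, the shock reappears only at multiples of 12 months and shrinks each year, which is the seasonal correction described above.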

The general form of \(SARIMAX(p, d, q, b)\times(P, D, Q, s)\) is

\[
\begin{aligned}
\nabla_{s}^{D}\nabla^{d}Y_{t} =&~ C + \sum^{p}_{i=1}\psi_{i}\nabla_{s}^{D}\nabla^{d}Y_{t-i} + \sum^{q}_{i=1}\theta_{i}\epsilon_{t-i} + \epsilon_{t} \\
+&~ \sum^{P}_{i=1}\psi_{is}\nabla_{s}^{D}\nabla^{d}Y_{t-is} + \sum^{Q}_{i=1}\theta_{is}\epsilon_{t-is} \\
+&~ \sum^{b}_{i=1}\beta_{i}X_{i,t},
\end{aligned}
\]

where \(\nabla_{s}^{D} = (1-B^{s})^{D}\) is the seasonal difference operator, for example \(\nabla_{s}Y_{t} = Y_{t} - Y_{t-s}\).

To be continued …