Time Series for Quants: ARIMA and GARCH

Medium·22 min read
Statistical / ML for QuantsTime SeriesARIMAGARCHVolatility Forecasting

Setup

Why Time Series Models Matter in Quant Finance

Time series methods appear in two distinct roles on a quant desk:

  1. Return prediction: Can past prices or returns predict future ones? ARIMA provides a rigorous framework for modelling serial dependence in levels and returns — and usually confirms that equity daily returns have almost none.
  2. Volatility forecasting: Volatility is serially correlated even when returns are not. GARCH models capture volatility clustering — the empirical observation that large moves tend to cluster in time — and are essential for option pricing under stochastic vol, VaR estimation, and risk-adjusted sizing.

Conventions throughout. All rates of return are continuously compounded: rt=ln(Pt/Pt1)r_t = \ln(P_t / P_{t-1}). Volatility σt\sigma_t is annualised unless stated otherwise. Daily returns assumed to have 252 trading days per year. All series assumed to be observed at equally spaced intervals.


Theory

1. Stationarity

A time series (rt)tZ(r_t)_{t \in \mathbb{Z}} is weakly stationary (covariance stationary) if:

  1. E[rt]=μ<\mathbb{E}[r_t] = \mu < \infty for all tt,
  2. Var(rt)=σ2<\text{Var}(r_t) = \sigma^2 < \infty for all tt,
  3. Cov(rt,rtk)=γ(k)\text{Cov}(r_t, r_{t-k}) = \gamma(k) depends only on lag kk, not on tt.

Why it matters. Statistical inference on time series requires stationarity: parameter estimates have no asymptotic meaning for non-stationary series. Log-prices lnPt\ln P_t are typically non-stationary (unit root); log-returns rt=ΔlnPtr_t = \Delta \ln P_t are typically stationary.

Unit root test. The Augmented Dickey-Fuller (ADF) test tests H0H_0: unit root present (non-stationary) against HAH_A: stationary. Reject H0H_0 at the 5% level if the ADF test statistic is below the critical value (approximately 2.86-2.86 for no constant, 3.43-3.43 for constant and trend).

2. ARMA Models

AR(pp) — Autoregressive. The series depends linearly on its own past pp values:

rt=c+ϕ1rt1++ϕprtp+εt,εtiid(0,σ2).r_t = c + \phi_1 r_{t-1} + \cdots + \phi_p r_{t-p} + \varepsilon_t, \qquad \varepsilon_t \overset{\text{iid}}{\sim} (0, \sigma^2).

Using the lag operator LL (Lkrt=rtkL^k r_t = r_{t-k}), write as Φ(L)rt=c+εt\Phi(L) r_t = c + \varepsilon_t where Φ(z)=1ϕ1zϕpzp\Phi(z) = 1 - \phi_1 z - \cdots - \phi_p z^p. The process is stationary iff all roots of Φ(z)=0\Phi(z) = 0 lie outside the unit circle.

MA(qq) — Moving Average. The series is a linear combination of current and past shocks:

rt=μ+εt+θ1εt1++θqεtq=μ+Θ(L)εt.r_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q} = \mu + \Theta(L)\varepsilon_t.

An MA(qq) process is always stationary. It is invertible (representable as an infinite AR) iff all roots of Θ(z)=0\Theta(z) = 0 lie outside the unit circle.

ARMA(p,qp, q): Φ(L)rt=c+Θ(L)εt.\Phi(L) r_t = c + \Theta(L) \varepsilon_t.

ARIMA(p,d,qp, d, q): Apply differencing dd times before fitting ARMA(p,qp, q). For d=1d=1: model applies to Δrt=rtrt1\Delta r_t = r_t - r_{t-1}. For daily equity returns, d=0d=0 is appropriate (returns are already stationary). For log-prices, d=1d=1 produces returns.

Model selection. Use information criteria:

  • AIC: 2(θ^)+2k-2\ell(\hat{\theta}) + 2k, where \ell is log-likelihood and kk is number of parameters.
  • BIC: 2(θ^)+klnT-2\ell(\hat{\theta}) + k\ln T.

BIC penalises complexity more heavily and is preferred when the true model is parsimonious. Inspect ACF (autocorrelation function) and PACF (partial ACF) to guide pp and qq choices.

3. GARCH: Generalised ARCH

Motivation. Equity returns rtr_t are approximately serially uncorrelated (ACF of rtr_t near zero), but rt2r_t^2 has significant positive autocorrelation at many lags. This is volatility clustering — Mandelbrot (1963) observed that "large changes tend to be followed by large changes, of either sign." ARCH/GARCH models this explicitly.

GARCH(p,qp, q) — Bollerslev (1986). Decompose the return as:

rt=μ+εt,εt=σtzt,ztiid(0,1),r_t = \mu + \varepsilon_t, \qquad \varepsilon_t = \sigma_t z_t, \qquad z_t \overset{\text{iid}}{\sim} (0,1),

where the conditional variance σt2\sigma_t^2 follows:

σt2=ω+i=1qαiεti2+j=1pβjσtj2.\sigma_t^2 = \omega + \sum_{i=1}^q \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^p \beta_j \sigma_{t-j}^2.

Parameters and constraints.

  • ω>0\omega > 0, αi0\alpha_i \geq 0, βj0\beta_j \geq 0 — ensures σt2>0\sigma_t^2 > 0 a.s.
  • Stationarity: i=1qαi+j=1pβj<1\sum_{i=1}^q \alpha_i + \sum_{j=1}^p \beta_j < 1 ensures the variance process is covariance stationary.
  • Unconditional variance: E[σt2]=σˉ2=ω/(1αiβj)\mathbb{E}[\sigma_t^2] = \bar{\sigma}^2 = \omega / (1 - \sum \alpha_i - \sum \beta_j), which exists only when the stationarity condition holds.

GARCH(1,1) is the workhorse: σt2=ω+αεt12+βσt12.\sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2.

Typical estimated values for daily equity returns: α0.08\alpha \approx 0.08, β0.90\beta \approx 0.90, giving α+β0.98\alpha + \beta \approx 0.98 — high persistence. The half-life of a variance shock is ln(0.5)/ln(α+β)34\ln(0.5) / \ln(\alpha + \beta) \approx 34 days for these parameters.

Volatility mean-reversion. Write σt2=σˉ2+(α+β)(σt12σˉ2)+α(εt12σt12)\sigma_t^2 = \bar{\sigma}^2 + (\alpha + \beta)(\sigma_{t-1}^2 - \bar{\sigma}^2) + \alpha(\varepsilon_{t-1}^2 - \sigma_{t-1}^2). The term (α+β)h(\alpha + \beta)^h governs mean-reversion speed; for α+β<1\alpha + \beta < 1 it decays geometrically.

Maximum likelihood estimation. Assume ztN(0,1)z_t \sim \mathcal{N}(0,1). The log-likelihood is:

(θ)=12t=1T[ln(2π)+lnσt2+εt2σt2].\ell(\theta) = -\frac{1}{2}\sum_{t=1}^T \left[\ln(2\pi) + \ln \sigma_t^2 + \frac{\varepsilon_t^2}{\sigma_t^2}\right].

This is maximised numerically (BFGS or Nelder-Mead). Initialise σ12=r2ˉ\sigma_1^2 = \bar{r^2} (sample variance). Student-tt innovations — adding a degrees-of-freedom parameter ν\nu — better fit the heavy tails of equity returns and are often preferred.


Implementation

"""
ARIMA and GARCH estimation for financial time series.

Assumptions:
- Input is a pandas Series of daily log-returns (continuously compounded)
- Returns in decimal form (not percent)
- GARCH innovations follow standard normal (extendable to Student-t)
"""

from __future__ import annotations

import warnings
import numpy as np
import pandas as pd
from scipy.optimize import minimize
from statsmodels.tsa.stattools import adfuller, acf, pacf
from statsmodels.tsa.arima.model import ARIMA


def adf_test(series: pd.Series, max_lags: int = 10) -> dict:
    """
    Augmented Dickey-Fuller unit root test.
    Returns test statistic, p-value, critical values, and verdict.
    """
    result = adfuller(series.dropna(), maxlag=max_lags, autolag="AIC")
    return {
        "test_statistic": result[0],
        "p_value": result[1],
        "lags_used": result[2],
        "critical_values": result[4],
        "stationary": result[1] < 0.05,
    }


def select_arima_order(
    series: pd.Series,
    p_max: int = 5,
    q_max: int = 5,
    d: int = 0,
    criterion: str = "bic",
) -> tuple[int, int, int]:
    """
    Grid search over ARIMA(p, d, q) orders, minimising AIC or BIC.
    Returns (p, d, q) with lowest criterion value.
    """
    best_score = np.inf
    best_order = (0, d, 0)

    for p in range(p_max + 1):
        for q in range(q_max + 1):
            if p == 0 and q == 0:
                continue
            try:
                with warnings.catch_warnings():
                    warnings.simplefilter("ignore")
                    model = ARIMA(series, order=(p, d, q)).fit()
                score = model.aic if criterion == "aic" else model.bic
                if score < best_score:
                    best_score = score
                    best_order = (p, d, q)
            except Exception:
                continue

    return best_order


class GARCH11:
    """
    GARCH(1,1) model estimated by maximum likelihood.

    Model:
        r_t = mu + eps_t
        eps_t = sigma_t * z_t,  z_t ~ N(0,1)
        sigma_t^2 = omega + alpha * eps_{t-1}^2 + beta * sigma_{t-1}^2

    Constraints: omega > 0, alpha >= 0, beta >= 0, alpha + beta < 1.
    """

    def __init__(self) -> None:
        self.omega: float | None = None
        self.alpha: float | None = None
        self.beta: float | None = None
        self.mu: float | None = None
        self._sigmas: np.ndarray | None = None

    def _neg_log_likelihood(self, params: np.ndarray, returns: np.ndarray) -> float:
        mu, omega, alpha, beta = params
        if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
            return 1e10

        T = len(returns)
        eps = returns - mu
        sigma2 = np.empty(T)
        sigma2[0] = np.var(returns)   # initialise at sample variance

        for t in range(1, T):
            sigma2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sigma2[t - 1]

        # Gaussian log-likelihood
        ll = -0.5 * np.sum(np.log(2 * np.pi * sigma2) + eps ** 2 / sigma2)
        return -ll

    def fit(self, returns: pd.Series) -> "GARCH11":
        """
        Fit GARCH(1,1) by MLE. Initialise from sample moments.
        """
        r = returns.values
        sample_var = np.var(r)

        # Initial parameter vector: [mu, omega, alpha, beta]
        x0 = [np.mean(r), sample_var * 0.05, 0.08, 0.88]
        bounds = [
            (None, None),   # mu: unrestricted
            (1e-8, None),   # omega > 0
            (1e-8, 0.5),    # alpha in (0, 0.5)
            (1e-8, 0.999),  # beta in (0, 1)
        ]

        result = minimize(
            self._neg_log_likelihood,
            x0,
            args=(r,),
            method="L-BFGS-B",
            bounds=bounds,
            options={"maxiter": 1000, "ftol": 1e-9},
        )

        if not result.success:
            raise RuntimeError(f"GARCH optimisation failed: {result.message}")

        self.mu, self.omega, self.alpha, self.beta = result.x
        self._sigmas = self._compute_sigmas(r)
        return self

    def _compute_sigmas(self, r: np.ndarray) -> np.ndarray:
        eps = r - self.mu
        T = len(r)
        sigma2 = np.empty(T)
        sigma2[0] = np.var(r)
        for t in range(1, T):
            sigma2[t] = self.omega + self.alpha * eps[t - 1] ** 2 + self.beta * sigma2[t - 1]
        return np.sqrt(sigma2)

    @property
    def unconditional_vol(self) -> float:
        """Annualised unconditional volatility."""
        var = self.omega / (1.0 - self.alpha - self.beta)
        return np.sqrt(var * 252)

    @property
    def persistence(self) -> float:
        """alpha + beta: speed of mean-reversion in variance."""
        return self.alpha + self.beta

    @property
    def vol_halflife_days(self) -> float:
        """Days for a variance shock to halve in magnitude."""
        return np.log(0.5) / np.log(self.persistence)

    def forecast(self, h: int) -> np.ndarray:
        """
        h-step-ahead annualised volatility forecast.
        Uses the formula: E[sigma_{T+h}^2] = sigma_bar^2 + (alpha+beta)^h * (sigma_T^2 - sigma_bar^2)
        """
        if self._sigmas is None:
            raise RuntimeError("Call fit() first.")

        sigma_bar2 = self.omega / (1.0 - self.alpha - self.beta)
        sigma_T2 = self._sigmas[-1] ** 2
        horizons = np.arange(1, h + 1)
        forecasts = sigma_bar2 + self.persistence ** horizons * (sigma_T2 - sigma_bar2)
        return np.sqrt(forecasts * 252)   # annualised

    @property
    def conditional_vols(self) -> pd.Series | None:
        """Return fitted conditional volatility series (annualised)."""
        if self._sigmas is None:
            return None
        return pd.Series(self._sigmas * np.sqrt(252))

Validation

ADF test. Apply to the S&P 500 daily log-return series (any liquid period):

  • rtr_t: ADF statistic typically 30\approx -30, p-value 0.000\approx 0.000 → reject unit root (stationary, as expected).
  • lnPt\ln P_t: ADF statistic typically 1.5-1.5 to 2.0-2.0, p-value >0.10> 0.10 → fail to reject unit root (non-stationary).

ARIMA on daily returns. ACF and PACF of daily S&P 500 returns show almost no significant lags (typical ACF(k)<0.05|ACF(k)| < 0.05). BIC-optimal order is usually ARIMA(0,0,0) — a white noise process. This confirms the near-efficiency of major equity indices at daily frequency.

GARCH(1,1). Typical parameter estimates for S&P 500 daily log-returns (1990–2022):

  • μ^0.00025\hat{\mu} \approx 0.00025 (daily), ω^2e-6\hat{\omega} \approx 2\text{e-}6, α^0.08\hat{\alpha} \approx 0.08, β^0.90\hat{\beta} \approx 0.90.
  • Persistence 0.98\approx 0.98; unconditional annualised vol 16%\approx 16\%.
  • ACF of standardised residuals z^t=ε^t/σ^t\hat{z}_t = \hat{\varepsilon}_t / \hat{\sigma}_t should be near zero — compare to ACF of raw ε^t2\hat{\varepsilon}_t^2 which shows strong autocorrelation at lags 1–20 before GARCH fitting.

Limitations

ARIMA: Near-Zero Predictability of Returns

ARIMA models are almost uniformly uninformative for daily equity returns — the efficient market hypothesis operates as a competitive force removing exploitable serial dependence. They are more useful for:

  • Spreads and yield differences (mean-reverting by construction)
  • High-frequency microstructure signals (bid-ask bounce creates negative autocorrelation)
  • Commodity prices with documented seasonal patterns

GARCH: Normal Innovations Are Too Thin

GARCH(1,1) with Gaussian innovations systematically underestimates tail probability. The kurtosis of standardised residuals z^t\hat{z}_t is typically 4–6 (excess kurtosis 1–3), compared to 0 for a Gaussian. Student-tt GARCH or GJR-GARCH (asymmetric, larger response to negative shocks) are standard improvements. For risk management (VaR/ES at 99%), the choice of innovation distribution matters materially.

GARCH: No Leverage Effect

The standard GARCH(1,1) treats positive and negative shocks symmetrically: εt12\varepsilon_{t-1}^2 does not depend on the sign of εt1\varepsilon_{t-1}. In equity markets, volatility rises more after negative returns (leverage effect: falling prices increase financial leverage). GJR-GARCH (Glosten, Jagannathan, Runkle 1993) adds an asymmetric term:

σt2=ω+(α+γ1εt1<0)εt12+βσt12.\sigma_t^2 = \omega + (\alpha + \gamma \mathbf{1}_{\varepsilon_{t-1} < 0}) \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2.

Typically γ^>0\hat{\gamma} > 0 in equity data, confirming the leverage effect.

Non-stationarity of GARCH Parameters

GARCH parameters estimated over long horizons may not be stable: the volatility regime during the 2008 crisis differs from 2013–2019 low-vol periods. Rolling estimation or regime-switching GARCH (Hamilton-Susmel) can capture structural breaks, but at the cost of additional complexity and estimation uncertainty.


Interview Angle

L1. What is volatility clustering? Write the GARCH(1,1) conditional variance equation. What constraint is required for stationarity, and what is the unconditional variance? How would you estimate GARCH parameters in practice?

L2. Derive the hh-step-ahead GARCH(1,1) variance forecast. Given α^=0.08\hat{\alpha} = 0.08, β^=0.90\hat{\beta} = 0.90, compute the half-life of a variance shock. Why is the GARCH(1,1) Gaussian log-likelihood used rather than OLS? Explain why ARIMA is almost useless for daily equity returns but may be relevant for high-frequency data.

GARCH h-step forecast. Write σt+12σˉ2=(α+β)(σt2σˉ2)+α(εt2σt2)\sigma_{t+1}^2 - \bar{\sigma}^2 = (\alpha + \beta)(\sigma_t^2 - \bar{\sigma}^2) + \alpha(\varepsilon_t^2 - \sigma_t^2). Since E[εt2σt2Ft]=0\mathbb{E}[\varepsilon_t^2 - \sigma_t^2 | \mathcal{F}_t] = 0, iterating: E[σt+h2Ft]=σˉ2+(α+β)h(σt2σˉ2)\mathbb{E}[\sigma_{t+h}^2 | \mathcal{F}_t] = \bar{\sigma}^2 + (\alpha+\beta)^h(\sigma_t^2 - \bar{\sigma}^2). For α+β=0.98\alpha+\beta = 0.98: half-life = ln(0.5)/ln(0.98)34\ln(0.5)/\ln(0.98) \approx 34 days.

L3. Compare GARCH, GJR-GARCH, and EGARCH. How does the leverage effect appear in each? How would you use a fitted GARCH model for VaR estimation (historical simulation with GARCH scaling vs. parametric GARCH-VaR)? What are the limitations of GARCH for option pricing compared to stochastic vol models (Heston)?

Read the theory? Verify your understanding.

Take the Quiz