Brownian Bridge™

Setup

Why Time Series Models Matter in Quant Finance

Time series methods appear in two distinct roles on a quant desk:

Return prediction: Can past prices or returns predict future ones? ARIMA provides a rigorous framework for modelling serial dependence in levels and returns — and usually confirms that equity daily returns have almost none.
Volatility forecasting: Volatility is serially correlated even when returns are not. GARCH models capture volatility clustering — the empirical observation that large moves tend to cluster in time — and are essential for option pricing under stochastic vol, VaR estimation, and risk-adjusted sizing.

Conventions throughout. All rates of return are continuously compounded: $r_t = \ln(P_t / P_{t-1})$ . Volatility $\sigma_t$ is annualised unless stated otherwise. Daily returns assumed to have 252 trading days per year. All series assumed to be observed at equally spaced intervals.

Theory

1. Stationarity

A time series $(r_t)_{t \in \mathbb{Z}}$ is weakly stationary (covariance stationary) if:

$\mathbb{E}[r_t] = \mu < \infty$ for all $t$ ,
$\text{Var}(r_t) = \sigma^2 < \infty$ for all $t$ ,
$\text{Cov}(r_t, r_{t-k}) = \gamma(k)$ depends only on lag $k$ , not on $t$ .

Why it matters. Statistical inference on time series requires stationarity: parameter estimates have no asymptotic meaning for non-stationary series. Log-prices $\ln P_t$ are typically non-stationary (unit root); log-returns $r_t = \Delta \ln P_t$ are typically stationary.

Unit root test. The Augmented Dickey-Fuller (ADF) test tests $H_0$ : unit root present (non-stationary) against $H_A$ : stationary. Reject $H_0$ at the 5% level if the ADF test statistic is below the critical value (approximately $-2.86$ for no constant, $-3.43$ for constant and trend).

2. ARMA Models

AR( $p$ ) — Autoregressive. The series depends linearly on its own past $p$ values:

$r_t = c + \phi_1 r_{t-1} + \cdots + \phi_p r_{t-p} + \varepsilon_t, \qquad \varepsilon_t \overset{\text{iid}}{\sim} (0, \sigma^2).$

Using the lag operator $L$ ( $L^k r_t = r_{t-k}$ ), write as $\Phi(L) r_t = c + \varepsilon_t$ where $\Phi(z) = 1 - \phi_1 z - \cdots - \phi_p z^p$ . The process is stationary iff all roots of $\Phi(z) = 0$ lie outside the unit circle.

MA( $q$ ) — Moving Average. The series is a linear combination of current and past shocks:

$r_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q} = \mu + \Theta(L)\varepsilon_t.$

An MA( $q$ ) process is always stationary. It is invertible (representable as an infinite AR) iff all roots of $\Theta(z) = 0$ lie outside the unit circle.

ARMA( $p, q$ ): $\Phi(L) r_t = c + \Theta(L) \varepsilon_t.$

ARIMA( $p, d, q$ ): Apply differencing $d$ times before fitting ARMA( $p, q$ ). For $d=1$ : model applies to $\Delta r_t = r_t - r_{t-1}$ . For daily equity returns, $d=0$ is appropriate (returns are already stationary). For log-prices, $d=1$ produces returns.

Model selection. Use information criteria:

AIC: $-2\ell(\hat{\theta}) + 2k$ , where $\ell$ is log-likelihood and $k$ is number of parameters.
BIC: $-2\ell(\hat{\theta}) + k\ln T$ .

BIC penalises complexity more heavily and is preferred when the true model is parsimonious. Inspect ACF (autocorrelation function) and PACF (partial ACF) to guide $p$ and $q$ choices.

3. GARCH: Generalised ARCH

Motivation. Equity returns $r_t$ are approximately serially uncorrelated (ACF of $r_t$ near zero), but $r_t^2$ has significant positive autocorrelation at many lags. This is volatility clustering — Mandelbrot (1963) observed that "large changes tend to be followed by large changes, of either sign." ARCH/GARCH models this explicitly.

GARCH( $p, q$ ) — Bollerslev (1986). Decompose the return as:

$r_t = \mu + \varepsilon_t, \qquad \varepsilon_t = \sigma_t z_t, \qquad z_t \overset{\text{iid}}{\sim} (0,1),$

where the conditional variance $\sigma_t^2$ follows:

$\sigma_t^2 = \omega + \sum_{i=1}^q \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^p \beta_j \sigma_{t-j}^2.$

Parameters and constraints.

$\omega > 0$ , $\alpha_i \geq 0$ , $\beta_j \geq 0$ — ensures $\sigma_t^2 > 0$ a.s.
Stationarity: $\sum_{i=1}^q \alpha_i + \sum_{j=1}^p \beta_j < 1$ ensures the variance process is covariance stationary.
Unconditional variance: $\mathbb{E}[\sigma_t^2] = \bar{\sigma}^2 = \omega / (1 - \sum \alpha_i - \sum \beta_j)$ , which exists only when the stationarity condition holds.

GARCH(1,1) is the workhorse: $\sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2.$

Typical estimated values for daily equity returns: $\alpha \approx 0.08$ , $\beta \approx 0.90$ , giving $\alpha + \beta \approx 0.98$ — high persistence. The half-life of a variance shock is $\ln(0.5) / \ln(\alpha + \beta) \approx 34$ days for these parameters.

Volatility mean-reversion. Write $\sigma_t^2 = \bar{\sigma}^2 + (\alpha + \beta)(\sigma_{t-1}^2 - \bar{\sigma}^2) + \alpha(\varepsilon_{t-1}^2 - \sigma_{t-1}^2)$ . The term $(\alpha + \beta)^h$ governs mean-reversion speed; for $\alpha + \beta < 1$ it decays geometrically.

Maximum likelihood estimation. Assume $z_t \sim \mathcal{N}(0,1)$ . The log-likelihood is:

$\ell(\theta) = -\frac{1}{2}\sum_{t=1}^T \left[\ln(2\pi) + \ln \sigma_t^2 + \frac{\varepsilon_t^2}{\sigma_t^2}\right].$

This is maximised numerically (BFGS or Nelder-Mead). Initialise $\sigma_1^2 = \bar{r^2}$ (sample variance). Student- $t$ innovations — adding a degrees-of-freedom parameter $\nu$ — better fit the heavy tails of equity returns and are often preferred.

Implementation

"""
ARIMA and GARCH estimation for financial time series.

Assumptions:
- Input is a pandas Series of daily log-returns (continuously compounded)
- Returns in decimal form (not percent)
- GARCH innovations follow standard normal (extendable to Student-t)
"""

from __future__ import annotations

import warnings
import numpy as np
import pandas as pd
from scipy.optimize import minimize
from statsmodels.tsa.stattools import adfuller, acf, pacf
from statsmodels.tsa.arima.model import ARIMA


def adf_test(series: pd.Series, max_lags: int = 10) -> dict:
    """
    Augmented Dickey-Fuller unit root test.
    Returns test statistic, p-value, critical values, and verdict.
    """
    result = adfuller(series.dropna(), maxlag=max_lags, autolag="AIC")
    return {
        "test_statistic": result[0],
        "p_value": result[1],
        "lags_used": result[2],
        "critical_values": result[4],
        "stationary": result[1] < 0.05,
    }


def select_arima_order(
    series: pd.Series,
    p_max: int = 5,
    q_max: int = 5,
    d: int = 0,
    criterion: str = "bic",
) -> tuple[int, int, int]:
    """
    Grid search over ARIMA(p, d, q) orders, minimising AIC or BIC.
    Returns (p, d, q) with lowest criterion value.
    """
    best_score = np.inf
    best_order = (0, d, 0)

    for p in range(p_max + 1):
        for q in range(q_max + 1):
            if p == 0 and q == 0:
                continue
            try:
                with warnings.catch_warnings():
                    warnings.simplefilter("ignore")
                    model = ARIMA(series, order=(p, d, q)).fit()
                score = model.aic if criterion == "aic" else model.bic
                if score < best_score:
                    best_score = score
                    best_order = (p, d, q)
            except Exception:
                continue

    return best_order


class GARCH11:
    """
    GARCH(1,1) model estimated by maximum likelihood.

    Model:
        r_t = mu + eps_t
        eps_t = sigma_t * z_t,  z_t ~ N(0,1)
        sigma_t^2 = omega + alpha * eps_{t-1}^2 + beta * sigma_{t-1}^2

    Constraints: omega > 0, alpha >= 0, beta >= 0, alpha + beta < 1.
    """

    def __init__(self) -> None:
        self.omega: float | None = None
        self.alpha: float | None = None
        self.beta: float | None = None
        self.mu: float | None = None
        self._sigmas: np.ndarray | None = None

    def _neg_log_likelihood(self, params: np.ndarray, returns: np.ndarray) -> float:
        mu, omega, alpha, beta = params
        if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
            return 1e10

        T = len(returns)
        eps = returns - mu
        sigma2 = np.empty(T)
        sigma2[0] = np.var(returns)   # initialise at sample variance

        for t in range(1, T):
            sigma2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sigma2[t - 1]

        # Gaussian log-likelihood
        ll = -0.5 * np.sum(np.log(2 * np.pi * sigma2) + eps ** 2 / sigma2)
        return -ll

    def fit(self, returns: pd.Series) -> "GARCH11":
        """
        Fit GARCH(1,1) by MLE. Initialise from sample moments.
        """
        r = returns.values
        sample_var = np.var(r)

        # Initial parameter vector: [mu, omega, alpha, beta]
        x0 = [np.mean(r), sample_var * 0.05, 0.08, 0.88]
        bounds = [
            (None, None),   # mu: unrestricted
            (1e-8, None),   # omega > 0
            (1e-8, 0.5),    # alpha in (0, 0.5)
            (1e-8, 0.999),  # beta in (0, 1)
        ]

        result = minimize(
            self._neg_log_likelihood,
            x0,
            args=(r,),
            method="L-BFGS-B",
            bounds=bounds,
            options={"maxiter": 1000, "ftol": 1e-9},
        )

        if not result.success:
            raise RuntimeError(f"GARCH optimisation failed: {result.message}")

        self.mu, self.omega, self.alpha, self.beta = result.x
        self._sigmas = self._compute_sigmas(r)
        return self

    def _compute_sigmas(self, r: np.ndarray) -> np.ndarray:
        eps = r - self.mu
        T = len(r)
        sigma2 = np.empty(T)
        sigma2[0] = np.var(r)
        for t in range(1, T):
            sigma2[t] = self.omega + self.alpha * eps[t - 1] ** 2 + self.beta * sigma2[t - 1]
        return np.sqrt(sigma2)

    @property
    def unconditional_vol(self) -> float:
        """Annualised unconditional volatility."""
        var = self.omega / (1.0 - self.alpha - self.beta)
        return np.sqrt(var * 252)

    @property
    def persistence(self) -> float:
        """alpha + beta: speed of mean-reversion in variance."""
        return self.alpha + self.beta

    @property
    def vol_halflife_days(self) -> float:
        """Days for a variance shock to halve in magnitude."""
        return np.log(0.5) / np.log(self.persistence)

    def forecast(self, h: int) -> np.ndarray:
        """
        h-step-ahead annualised volatility forecast.
        Uses the formula: E[sigma_{T+h}^2] = sigma_bar^2 + (alpha+beta)^h * (sigma_T^2 - sigma_bar^2)
        """
        if self._sigmas is None:
            raise RuntimeError("Call fit() first.")

        sigma_bar2 = self.omega / (1.0 - self.alpha - self.beta)
        sigma_T2 = self._sigmas[-1] ** 2
        horizons = np.arange(1, h + 1)
        forecasts = sigma_bar2 + self.persistence ** horizons * (sigma_T2 - sigma_bar2)
        return np.sqrt(forecasts * 252)   # annualised

    @property
    def conditional_vols(self) -> pd.Series | None:
        """Return fitted conditional volatility series (annualised)."""
        if self._sigmas is None:
            return None
        return pd.Series(self._sigmas * np.sqrt(252))

Validation

ADF test. Apply to the S&P 500 daily log-return series (any liquid period):

$r_t$ : ADF statistic typically $\approx -30$ , p-value $\approx 0.000$ → reject unit root (stationary, as expected).
$\ln P_t$ : ADF statistic typically $-1.5$ to $-2.0$ , p-value $> 0.10$ → fail to reject unit root (non-stationary).

ARIMA on daily returns. ACF and PACF of daily S&P 500 returns show almost no significant lags (typical $|ACF(k)| < 0.05$ ). BIC-optimal order is usually ARIMA(0,0,0) — a white noise process. This confirms the near-efficiency of major equity indices at daily frequency.

GARCH(1,1). Typical parameter estimates for S&P 500 daily log-returns (1990–2022):

$\hat{\mu} \approx 0.00025$ (daily), $\hat{\omega} \approx 2\text{e-}6$ , $\hat{\alpha} \approx 0.08$ , $\hat{\beta} \approx 0.90$ .
Persistence $\approx 0.98$ ; unconditional annualised vol $\approx 16\%$ .
ACF of standardised residuals $\hat{z}_t = \hat{\varepsilon}_t / \hat{\sigma}_t$ should be near zero — compare to ACF of raw $\hat{\varepsilon}_t^2$ which shows strong autocorrelation at lags 1–20 before GARCH fitting.

Limitations

ARIMA: Near-Zero Predictability of Returns

ARIMA models are almost uniformly uninformative for daily equity returns — the efficient market hypothesis operates as a competitive force removing exploitable serial dependence. They are more useful for:

Spreads and yield differences (mean-reverting by construction)
High-frequency microstructure signals (bid-ask bounce creates negative autocorrelation)
Commodity prices with documented seasonal patterns

GARCH: Normal Innovations Are Too Thin

GARCH(1,1) with Gaussian innovations systematically underestimates tail probability. The kurtosis of standardised residuals $\hat{z}_t$ is typically 4–6 (excess kurtosis 1–3), compared to 0 for a Gaussian. Student- $t$ GARCH or GJR-GARCH (asymmetric, larger response to negative shocks) are standard improvements. For risk management (VaR/ES at 99%), the choice of innovation distribution matters materially.

GARCH: No Leverage Effect

The standard GARCH(1,1) treats positive and negative shocks symmetrically: $\varepsilon_{t-1}^2$ does not depend on the sign of $\varepsilon_{t-1}$ . In equity markets, volatility rises more after negative returns (leverage effect: falling prices increase financial leverage). GJR-GARCH (Glosten, Jagannathan, Runkle 1993) adds an asymmetric term:

$\sigma_t^2 = \omega + (\alpha + \gamma \mathbf{1}_{\varepsilon_{t-1} < 0}) \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2.$

Typically $\hat{\gamma} > 0$ in equity data, confirming the leverage effect.

Non-stationarity of GARCH Parameters

GARCH parameters estimated over long horizons may not be stable: the volatility regime during the 2008 crisis differs from 2013–2019 low-vol periods. Rolling estimation or regime-switching GARCH (Hamilton-Susmel) can capture structural breaks, but at the cost of additional complexity and estimation uncertainty.

Interview Angle

L1. What is volatility clustering? Write the GARCH(1,1) conditional variance equation. What constraint is required for stationarity, and what is the unconditional variance? How would you estimate GARCH parameters in practice?

L2. Derive the $h$ -step-ahead GARCH(1,1) variance forecast. Given $\hat{\alpha} = 0.08$ , $\hat{\beta} = 0.90$ , compute the half-life of a variance shock. Why is the GARCH(1,1) Gaussian log-likelihood used rather than OLS? Explain why ARIMA is almost useless for daily equity returns but may be relevant for high-frequency data.

GARCH h-step forecast. Write $\sigma_{t+1}^2 - \bar{\sigma}^2 = (\alpha + \beta)(\sigma_t^2 - \bar{\sigma}^2) + \alpha(\varepsilon_t^2 - \sigma_t^2)$ . Since $\mathbb{E}[\varepsilon_t^2 - \sigma_t^2 | \mathcal{F}_t] = 0$ , iterating: $\mathbb{E}[\sigma_{t+h}^2 | \mathcal{F}_t] = \bar{\sigma}^2 + (\alpha+\beta)^h(\sigma_t^2 - \bar{\sigma}^2)$ . For $\alpha+\beta = 0.98$ : half-life = $\ln(0.5)/\ln(0.98) \approx 34$ days.

L3. Compare GARCH, GJR-GARCH, and EGARCH. How does the leverage effect appear in each? How would you use a fitted GARCH model for VaR estimation (historical simulation with GARCH scaling vs. parametric GARCH-VaR)? What are the limitations of GARCH for option pricing compared to stochastic vol models (Heston)?