
Almgren-Chriss Optimal Execution

Module 3 of 425 min readLevel: Hard

Setup

The Execution Problem

A portfolio manager needs to sell $X$ shares of a stock over a time horizon $[0, T]$. Selling too fast incurs large market impact costs (the trader moves the price against themselves). Selling too slowly leaves the position exposed to price risk for longer. This is the optimal execution problem: find a trading schedule that balances market impact against timing risk.

This problem arises constantly in practice: unwinding a large position after a trade decision, rebalancing a portfolio after a factor model update, or executing a delta hedge for a large options position. The Almgren-Chriss (2001) framework is the standard reference model for this problem on institutional trading desks.

Conventions and Assumptions

  • Initial position: $X > 0$ shares to sell, $X_0 = X$.
  • Time grid: discrete times $0 = t_0 < t_1 < \cdots < t_N = T$ with equal spacing $\tau = T/N$.
  • Shares sold in period $k$: $n_k = X_{k-1} - X_k \geq 0$ (non-negative, selling only).
  • Trade rate: $v_k = n_k / \tau$ (shares per unit time).
  • Mid-price dynamics in the absence of trading: $S_k^0 = S_{k-1}^0 + \sigma\sqrt{\tau}\, \xi_k$ where $\xi_k \stackrel{\text{i.i.d.}}{\sim} \mathcal{N}(0,1)$.
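  • Volatility $\sigma$: annualised, so the daily volatility is $\sigma\sqrt{1/252}$ for equities (252 trading days). Because the price dynamics are arithmetic, $\sigma$ is an absolute volatility in price units per square root of time, not a percentage of the price.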
  • Rates are zero (P&L is measured in dollar terms, no discounting).
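The conventions above can be sketched as a short simulation of the unaffected mid-price on the trading grid. The function name, the seed, and the numbers are illustrative assumptions: in particular $\sigma = 12.5$ stands for a 25% annual volatility on a hypothetical \$50 stock, expressed in price units.

```python
import numpy as np

def simulate_unaffected_mid(s0, sigma, T, n_periods, n_paths, seed=0):
    """Simulate S_k^0 = S_{k-1}^0 + sigma*sqrt(tau)*xi_k on an equally spaced grid.

    sigma is an absolute volatility (price units per sqrt(time)); T and sigma
    must use the same time unit. Returns an array of shape (n_paths, n_periods + 1).
    """
    tau = T / n_periods
    rng = np.random.default_rng(seed)
    xi = rng.standard_normal((n_paths, n_periods))       # i.i.d. N(0, 1) shocks
    increments = sigma * np.sqrt(tau) * xi
    paths = s0 + np.cumsum(increments, axis=1)
    return np.hstack([np.full((n_paths, 1), s0), paths])  # prepend S_0

# Hypothetical inputs: one trading day (T = 1/252 years), 20 periods.
paths = simulate_unaffected_mid(s0=50.0, sigma=12.5, T=1.0 / 252,
                                n_periods=20, n_paths=20000)
terminal_std = paths[:, -1].std()
print(f"sample std of S_N^0: {terminal_std:.4f}")
print(f"sigma * sqrt(T):     {12.5 * np.sqrt(1.0 / 252):.4f}")
```

The sample standard deviation of the terminal price should sit close to $\sigma\sqrt{T}$, which is a quick check that the time unit of $T$ matches the time unit of $\sigma$.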

Market Impact Model

Permanent and Temporary Impact

Almgren-Chriss decompose market impact into two components:

Permanent impact: trades shift the mid-price permanently. Every share sold in period $k$ reduces the mid-price for all future periods. The permanent mid-price change caused by selling $n_k$ shares in period $k$ is:

$$\Delta S_k^{\mathrm{perm}} = -g\!\left(\frac{n_k}{\tau}\right) \cdot \tau = -g(v_k)\cdot\tau,$$

where $g(v)$ is the permanent impact function, mapping a trading rate (shares per unit time) to a rate of price change. For the linear model $g(v) = \gamma v$, so selling at rate $v$ permanently moves the price by $-\gamma v \tau = -\gamma n_k$ per period, and the cumulative shift after $k$ periods is $-\gamma\sum_{j=1}^k n_j$.

Temporary impact: the execution price in period $k$ differs from the (already impacted) mid by an additional amount due to consuming liquidity immediately. This cost is paid once and does not affect future mid-prices:

$$P_k^{\mathrm{exec}} = \tilde{S}_k - h\!\left(\frac{n_k}{\tau}\right),$$

where $\tilde{S}_k$ is the mid-price at time $k$ (already shifted by permanent impact), and $h(v)$ is the temporary impact function. For the linear model: $h(v) = \epsilon\, \mathrm{sgn}(v) + \eta v$, where $\epsilon$ is a fixed half-spread cost and $\eta v$ is the linear temporary impact.

The Full Price Process

Let $\tilde{S}_k$ be the mid-price after permanent impact. Starting from $S_0$:

$$\tilde{S}_k = S_0 + \sigma\sqrt{\tau}\sum_{j=1}^k \xi_j - \gamma \sum_{j=1}^k n_j.$$

The total proceeds from selling $X$ shares:

$$\mathrm{Proceeds} = \sum_{k=1}^N n_k P_k^{\mathrm{exec}} = \sum_{k=1}^N n_k \!\left[\tilde{S}_k - h(n_k/\tau)\right].$$

The shortfall (cost of execution relative to the decision price $S_0$):

$$\mathrm{IS} = S_0 X - \mathrm{Proceeds} = \underbrace{\gamma \sum_{k=1}^N n_k \sum_{j=1}^k n_j}_{\mathrm{permanent\,impact}} + \underbrace{\epsilon X + \eta \sum_{k=1}^N \frac{n_k^2}{\tau}}_{\mathrm{temporary\,impact}} - \underbrace{\sigma\sqrt{\tau}\sum_{k=1}^N n_k \sum_{j=1}^k \xi_j}_{\mathrm{timing\,risk}}.$$

The first two terms are deterministic costs; the last is random with mean zero (timing risk from mid-price moves while the position is open). Its sign is immaterial because the $\xi_j$ are symmetric. The spread cost $\epsilon X$ does not depend on the schedule, so it is dropped in what follows (equivalently, $\epsilon = 0$).
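The decomposition can be checked by brute force. The following is a minimal sketch that simulates the price process defined above for a fixed schedule (with $\epsilon = 0$) and compares the sample mean of the shortfall with the deterministic impact terms. Function names, the seed, and the parameter values are illustrative assumptions; time is measured in days here, so $\sigma$, $\eta$ and $\tau$ all use days.

```python
import numpy as np

def shortfall_samples(n, s0, sigma, tau, eta, gamma, n_paths=50000, seed=1):
    """Monte Carlo samples of IS = S0*X - proceeds for a fixed schedule n (eps = 0)."""
    n = np.asarray(n, dtype=float)
    rng = np.random.default_rng(seed)
    xi = rng.standard_normal((n_paths, n.size))
    # Mid after permanent impact: S0 + sigma*sqrt(tau)*cumsum(xi) - gamma*cumsum(n)
    mid = s0 + sigma * np.sqrt(tau) * np.cumsum(xi, axis=1) - gamma * np.cumsum(n)
    exec_price = mid - eta * n / tau           # linear temporary impact h(v) = eta*v
    proceeds = exec_price @ n                  # one proceeds value per path
    return s0 * n.sum() - proceeds

def deterministic_cost(n, tau, eta, gamma):
    """Permanent plus temporary impact cost of a schedule (the non-random terms)."""
    n = np.asarray(n, dtype=float)
    permanent = gamma * np.sum(n * np.cumsum(n))
    temporary = eta * np.sum(n**2) / tau
    return permanent + temporary

# Hypothetical inputs: 1e6 shares sold evenly over 10 periods of 0.1 days each.
X, N, tau = 1.0e6, 10, 0.1
n = np.full(N, X / N)
samples = shortfall_samples(n, s0=50.0, sigma=0.95, tau=tau, eta=2.5e-6, gamma=2.5e-7)
det = deterministic_cost(n, tau, eta=2.5e-6, gamma=2.5e-7)
print(f"deterministic cost: {det:,.0f}")
print(f"sample mean of IS:  {samples.mean():,.0f}")
print(f"sample std of IS:   {samples.std():,.0f}")
```

The sample mean should agree with the deterministic cost up to Monte Carlo noise, while the sample standard deviation is the timing risk that the mean-variance objective penalises.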


Theory: Optimal Execution

Objective Function

Almgren-Chriss optimise the mean-variance objective over the implementation shortfall:

$$U = \mathbb{E}[\mathrm{IS}] + \frac{\lambda}{2} \mathrm{Var}[\mathrm{IS}],$$

where $\lambda \geq 0$ is the risk-aversion parameter. Since $\mathbb{E}[\mathrm{IS}]$ is in dollars and $\mathrm{Var}[\mathrm{IS}]$ in dollars squared, $\lambda$ has units of inverse dollars: it is the cost the trader assigns to one unit of variance. Large $\lambda$ penalises timing risk heavily, favouring fast execution; small $\lambda$ accepts more risk in return for lower market impact costs.

Expected shortfall:

$$\mathbb{E}[\mathrm{IS}] = \gamma \sum_{k=1}^N n_k \sum_{j=1}^k n_j + \frac{\eta}{\tau} \sum_{k=1}^N n_k^2 = \frac{\gamma}{2}X^2 + \left(\frac{\eta}{\tau} + \frac{\gamma}{2}\right)\sum_{k=1}^N n_k^2,$$

using the identity $\sum_k n_k \sum_{j=1}^k n_j = \frac{1}{2}\bigl[(\sum_k n_k)^2 + \sum_k n_k^2\bigr]$ and noting $\sum_k n_k = X$. The first term is a constant (independent of schedule), so minimising expected shortfall alone reduces to minimising $\sum_k n_k^2$, which is achieved by equal trades $n_k = X/N$ (TWAP). The extra $\frac{\gamma}{2}\sum_k n_k^2$ contribution is of order $\tau$ and vanishes in the continuous-time limit.

Variance of shortfall:

$$\mathrm{Var}[\mathrm{IS}] = \sigma^2 \tau \sum_{k=1}^N X_{k-1}^2,$$

where $X_k = X - \sum_{j=1}^k n_j$ is the remaining inventory after period $k$, so $X_{k-1}$ is the inventory still held when the shock $\xi_k$ arrives. This follows by regrouping the timing-risk double sum by shock: $\sum_k n_k \sum_{j=1}^k\xi_j = \sum_j \xi_j \sum_{k \geq j} n_k = \sum_j \xi_j X_{j-1}$, and the $\xi_j$ are independent with unit variance.

Continuous-Time Formulation

In continuous time with selling rate $v(t)$ (shares per unit time):

$$\mathbb{E}[\mathrm{IS}] = \frac{\gamma}{2}X^2 + \eta\int_0^T v(t)^2\, dt,$$

$$\mathrm{Var}[\mathrm{IS}] = \sigma^2 \int_0^T x(t)^2\, dt,$$

where $x(t) = X - \int_0^t v(s)\, ds$ is the remaining inventory. The objective is:

$$\min_{v(\cdot)} \left\{\eta\int_0^T v(t)^2\, dt + \frac{\lambda\sigma^2}{2}\int_0^T x(t)^2\, dt\right\},$$

subject to $x(0) = X$, $x(T) = 0$, $v(t) = -\dot{x}(t) \geq 0$.

Closed-Form Solution via Calculus of Variations

The Euler-Lagrange equation for the functional $\int_0^T \bigl[\eta v^2 + \frac{\lambda\sigma^2}{2}x^2\bigr]\, dt$ is:

$$2\eta \ddot{x}(t) = \lambda\sigma^2 x(t),$$

or equivalently $\ddot{x} = \kappa^2 x$ where $\kappa = \sqrt{\lambda\sigma^2/(2\eta)}$. This is a linear second-order ODE with solution:

$$x(t) = A\sinh(\kappa(T-t)) + B\cosh(\kappa(T-t)).$$

Applying boundary conditions $x(0) = X$, $x(T) = 0$ (which force $B = 0$ and $A = X/\sinh(\kappa T)$):

$$x^*(t) = X \cdot \frac{\sinh(\kappa(T-t))}{\sinh(\kappa T)}.$$

The optimal trading rate:

$$v^*(t) = -\dot{x}^*(t) = X\kappa \cdot \frac{\cosh(\kappa(T-t))}{\sinh(\kappa T)}.$$
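A quick numerical sanity check of the closed form is sketched here. It verifies the boundary conditions, the ODE $\ddot{x} = \kappa^2 x$ through a central finite difference, and that the trading rate integrates to $X$. The helper names and the values of $X$, $T$ and $\kappa$ are illustrative assumptions.

```python
import numpy as np

def x_star(t, X, T, kappa):
    """Optimal remaining inventory x*(t) = X sinh(kappa (T - t)) / sinh(kappa T)."""
    return X * np.sinh(kappa * (T - t)) / np.sinh(kappa * T)

def v_star(t, X, T, kappa):
    """Optimal selling rate v*(t) = X kappa cosh(kappa (T - t)) / sinh(kappa T)."""
    return X * kappa * np.cosh(kappa * (T - t)) / np.sinh(kappa * T)

X, T, kappa = 1.0e6, 1.0, 2.0            # hypothetical values, kappa * T = 2
t = np.linspace(0.0, T, 2001)
x = x_star(t, X, T, kappa)
v = v_star(t, X, T, kappa)
h = t[1] - t[0]

# Central second difference approximates x''(t) at interior grid points.
x_ddot = (x[2:] - 2.0 * x[1:-1] + x[:-2]) / h**2
ode_residual = np.max(np.abs(x_ddot - kappa**2 * x[1:-1])) / X

# Trapezoid rule: the selling rate should integrate to the full position X.
sold = np.sum(0.5 * (v[1:] + v[:-1]) * np.diff(t))

print(f"x(0) = {x[0]:.1f}, x(T) = {x[-1]:.1f}")
print(f"max relative ODE residual: {ode_residual:.2e}")
print(f"integral of v: {sold:.1f}")
```

The residual is limited only by the finite-difference step, so it shrinks as the grid is refined.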

Special Cases

TWAP ($\lambda \to 0$): As risk aversion vanishes, $\kappa \to 0$. Using $\sinh(\kappa u) \approx \kappa u$ for small $\kappa$:

$$x^*(t) \to X\left(1 - \frac{t}{T}\right), \qquad v^*(t) \to \frac{X}{T} \quad \text{(constant rate, i.e. TWAP)}.$$

TWAP is optimal when there is no timing risk penalty: trade at a constant rate, spreading market impact evenly.

Aggressive execution ($\lambda \to \infty$): As risk aversion dominates, $\kappa \to \infty$. The solution concentrates at $t = 0$: $v^*(0+) \to \infty$, $x^*(t) \to 0$ for $t > 0$. Extreme risk aversion means: sell everything immediately (market order). The market impact cost is irrelevant relative to the timing risk penalty.

Finite $\lambda$: The optimal inventory path is a ratio of hyperbolic sines, front-loaded relative to TWAP but not a single market order. More shares are sold early (when the remaining inventory $x$ is large, so the running variance penalty $\frac{\lambda\sigma^2}{2} x^2$ is large) and fewer late.
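To make the three regimes concrete, the sketch below computes the fraction of the position that has been liquidated by the midpoint of the horizon as a function of the dimensionless urgency $\kappa T$. The function name and the chosen $\kappa T$ values are illustrative assumptions.

```python
import numpy as np

def fraction_sold_by(t, T, kappa_T):
    """Fraction of X liquidated by time t under x*(t); kappa_T = kappa * T."""
    if kappa_T < 1e-8:
        return t / T                                  # TWAP limit (kappa -> 0)
    k = kappa_T / T
    return 1.0 - np.sinh(k * (T - t)) / np.sinh(kappa_T)

# Fraction sold by T/2 for increasing urgency (T normalised to 1).
halfway = {kT: fraction_sold_by(0.5, 1.0, kT) for kT in (0.0, 0.5, 2.0, 10.0)}
for kT, frac in halfway.items():
    print(f"kappa*T = {kT:5.1f}: {100 * frac:5.1f}% sold by T/2")
```

With these inputs the halfway fractions are roughly 50%, 52%, 68% and 99%, so $\kappa T$ well below 1 is close to TWAP while $\kappa T$ around 10 is close to immediate liquidation.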


Efficient Frontier

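For different values of $\lambda$, the optimal strategy traces out an efficient frontier in the (Expected Shortfall, Variance) plane. Integrating $x^*(t)^2$ and $v^*(t)^2$ over $[0, T]$ gives:

$$\mathrm{Var}^* = \sigma^2 X^2\, \frac{\sinh(2\kappa T) - 2\kappa T}{4\kappa \sinh^2(\kappa T)}, \qquad \mathbb{E}^* = \frac{\gamma X^2}{2} + \eta\kappa X^2\, \frac{\sinh(2\kappa T) + 2\kappa T}{4\sinh^2(\kappa T)}.$$

As $\kappa \to 0$ these reduce to the TWAP values $\sigma^2 X^2 T/3$ and $\frac{\gamma X^2}{2} + \frac{\eta X^2}{T}$. The combined objective evaluates to $\mathbb{E}^* + \frac{\lambda}{2}\mathrm{Var}^* = \frac{\gamma X^2}{2} + \frac{\eta\kappa X^2}{\tanh(\kappa T)}$. As $\lambda$ increases, $\kappa$ increases: expected shortfall increases (faster execution is more expensive in impact costs) but variance decreases (less timing risk). Every point on the frontier is optimal for some risk aversion level.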
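The frontier can also be traced without any closed form, by integrating $\eta\, v^*(t)^2$ and $\sigma^2 x^*(t)^2$ numerically for a few values of $\lambda$. This is a minimal sketch: the function name, the grid size, and the parameter values (time in days, $\sigma$ in price units per square root of a day) are illustrative assumptions, and the permanent impact constant $\gamma X^2/2$ is left out because it does not depend on the schedule.

```python
import numpy as np

def frontier_point(lam, X, T, sigma, eta, n_grid=4001):
    """Temporary impact cost and variance along x*(t) for one lambda > 0."""
    kappa = np.sqrt(lam * sigma**2 / (2.0 * eta))
    t = np.linspace(0.0, T, n_grid)
    x = X * np.sinh(kappa * (T - t)) / np.sinh(kappa * T)
    v = X * kappa * np.cosh(kappa * (T - t)) / np.sinh(kappa * T)
    dt = np.diff(t)

    def trapz(f):
        # Trapezoid rule on the time grid.
        return np.sum(0.5 * (f[1:] + f[:-1]) * dt)

    temp_cost = eta * trapz(v**2)          # eta * integral of v^2
    variance = sigma**2 * trapz(x**2)      # sigma^2 * integral of x^2
    return temp_cost, variance

X, T, sigma, eta = 1.0e6, 5.0, 0.95, 2.5e-6    # hypothetical 5-day liquidation
lams = [1e-12, 1e-7, 1e-6, 1e-5]               # first value is effectively TWAP
points = [frontier_point(lam, X, T, sigma, eta) for lam in lams]
for lam, (cost, var) in zip(lams, points):
    print(f"lambda = {lam:.0e}: temp cost = {cost:12,.0f}, std = {np.sqrt(var):12,.0f}")
```

Moving down the list, the temporary impact cost rises and the standard deviation falls, which is the trade-off the frontier describes. The near-zero $\lambda$ row reproduces the TWAP limits $\eta X^2/T$ and $\sigma^2 X^2 T/3$.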


Implementation

import numpy as np
from dataclasses import dataclass

@dataclass
class AlmgrenChrissParams:
    X:     float   # initial position (shares)
    T:     float   # horizon (years)
    sigma: float   # annualised volatility
    eta:   float   # temporary impact coefficient (price per shares-per-time)
    gamma: float   # permanent impact coefficient
    lam:   float   # risk-aversion parameter lambda

def kappa(p: AlmgrenChrissParams) -> float:
    """Decay parameter kappa = sqrt(lambda * sigma^2 / (2 * eta))."""
    return np.sqrt(p.lam * p.sigma**2 / (2.0 * p.eta))

def optimal_trajectory(p: AlmgrenChrissParams, n_steps: int = 100) -> tuple[np.ndarray, np.ndarray]:
    """
    Almgren-Chriss optimal continuous-time trajectory.

    Returns:
        t:    time grid, shape (n_steps+1,)
        x:    remaining inventory at each time, shape (n_steps+1,)
    """
    k  = kappa(p)
    t  = np.linspace(0.0, p.T, n_steps + 1)
    # x*(t) = X * sinh(kappa*(T-t)) / sinh(kappa*T)
    denom = np.sinh(k * p.T)
    if np.abs(denom) < 1e-10:
        # TWAP limit
        x = p.X * (1.0 - t / p.T)
    else:
        x = p.X * np.sinh(k * (p.T - t)) / denom
    return t, x

def optimal_schedule(p: AlmgrenChrissParams, n_steps: int = 20) -> tuple[np.ndarray, np.ndarray]:
    """
    Discrete optimal execution schedule.

    Returns:
        times:  endpoints of trading periods, shape (n_steps+1,)
        trades: shares traded in each period, shape (n_steps,)
    """
    t, x = optimal_trajectory(p, n_steps)
    trades = np.diff(-x)   # n_k = x_{k-1} - x_k (positive = selling)
    return t, trades

def expected_shortfall(p: AlmgrenChrissParams) -> float:
    """Expected implementation shortfall for optimal strategy (continuous-time)."""
    k  = kappa(p)
    kT = k * p.T
    if kT < 1e-4:
        return p.gamma / 2.0 * p.X**2 + p.eta * p.X**2 / p.T   # TWAP limit
    # E[IS] = gamma/2 * X^2 + eta * integral of v*(t)^2 dt
    #       = gamma/2 * X^2 + eta * kappa * X^2 * (coth(kT) + kT / sinh(kT)^2) / 2
    bracket = 1.0 / np.tanh(kT) + kT / np.sinh(kT)**2
    return p.gamma / 2.0 * p.X**2 + p.eta * k * p.X**2 * bracket / 2.0

def variance_shortfall(p: AlmgrenChrissParams) -> float:
    """Variance of implementation shortfall for optimal strategy (continuous-time)."""
    k  = kappa(p)
    kT = k * p.T
    if kT < 1e-4:
        return p.sigma**2 * p.X**2 * p.T / 3.0   # TWAP limit
    # Var = sigma^2 * integral of x*(t)^2 dt
    #     = sigma^2 * X^2 * (coth(kT) - kT / sinh(kT)^2) / (2 * kappa)
    bracket = 1.0 / np.tanh(kT) - kT / np.sinh(kT)**2
    return p.sigma**2 * p.X**2 * bracket / (2.0 * k)

def efficient_frontier(
    p_base: AlmgrenChrissParams,
    n_lambda: int = 50
) -> tuple[np.ndarray, np.ndarray]:
    """
    Compute the efficient frontier by varying risk-aversion lambda.

    Returns:
        ev:  expected shortfall values (n_lambda,)
        var: variance values (n_lambda,)
    """
    import dataclasses  # dataclasses.replace copies p_base with a new lam
    lambdas = np.logspace(-6, 2, n_lambda)
    ev_arr  = np.zeros(n_lambda)
    var_arr = np.zeros(n_lambda)
    for i, lam in enumerate(lambdas):
        p = dataclasses.replace(p_base, lam=lam)
        ev_arr[i]  = expected_shortfall(p)
        var_arr[i] = variance_shortfall(p)
    return ev_arr, var_arr

Validation

Analytic Checks

import dataclasses

p = AlmgrenChrissParams(X=1e6, T=1.0/252, sigma=0.25, eta=2.5e-7, gamma=2.5e-8, lam=0.0)

# 1. TWAP limit: lambda = 0 gives linear trajectory
t, x = optimal_trajectory(p, n_steps=10)
assert np.allclose(x, p.X * (1.0 - t / p.T), atol=1e-3), "Should be TWAP"

# 2. Constraint: total shares sold = X
_, trades = optimal_schedule(p, n_steps=20)
assert abs(trades.sum() - p.X) < 1.0, "Must liquidate all shares"

# 3. Efficient frontier: expected shortfall increases with lambda
ev, var = efficient_frontier(p, n_lambda=20)
# As lambda increases, kappa increases, faster execution -> higher E[IS]
p_hi_lam = dataclasses.replace(p, lam=1.0)
assert expected_shortfall(p_hi_lam) > expected_shortfall(p), "Higher lambda -> higher E[IS]"
assert variance_shortfall(p_hi_lam) < variance_shortfall(p), "Higher lambda -> lower Var[IS]"
print("All analytic checks passed.")

Limitations

Linear market impact. The Almgren-Chriss model uses linear temporary impact $h(v) = \eta v$. Empirically, market impact is better described by a square-root law: $h(v) \propto \sqrt{v}$ (see Module 4). The closed-form solution does not extend to power-law impact functions; numerical solutions are required.

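Constant volatility. Assuming constant $\sigma$ is restrictive. In practice, volatility is time-varying (intraday U-shape, stochastic vol). A more realistic model uses time-dependent $\sigma(t)$, which changes the optimal trading rate: the running variance penalty $\frac{\lambda}{2}\sigma(t)^2 x^2$ is largest when volatility is high, so the schedule works inventory down faster ahead of and during high-vol periods and can trade more slowly during low-vol periods.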

No alpha signal. The model assumes zero drift ($\mathbb{E}[dS] = 0$). If the trader has a short-lived alpha signal predicting price movement, the optimal schedule is modified: sell faster if the signal predicts a price decline, slower if it predicts appreciation. The Almgren (2003) extension incorporates an alpha decay model.

Discrete market. The continuous-time optimal schedule must be rounded to discrete lot sizes and clocked to the exchange's trade frequency. Discretisation introduces approximation error, particularly for very short horizons where only a few trades occur.
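One simple way to handle lot sizes is sketched below: round the cumulative amount sold to the nearest lot and difference it, so that rounding errors do not accumulate and the total is preserved. The function name, the 100-share lot, and the schedule parameters are illustrative assumptions, and this is a heuristic rather than an optimal integer schedule.

```python
import numpy as np

def round_to_lots(trades, lot=100):
    """Round a sell schedule to whole lots by rounding the cumulative amount sold."""
    cum = np.cumsum(np.asarray(trades, dtype=float))
    cum_lots = np.rint(cum / lot).astype(np.int64)    # cumulative lots sold so far
    return np.diff(cum_lots, prepend=0) * lot         # per-period shares, whole lots

# Hypothetical schedule: x*(t) sampled at N = 20 period boundaries, kappa * T = 2.
X, T, kappa, N = 1.0e6, 1.0, 2.0, 20
t = np.linspace(0.0, T, N + 1)
x = X * np.sinh(kappa * (T - t)) / np.sinh(kappa * T)
trades = -np.diff(x)                                  # n_k = x_{k-1} - x_k
lots = round_to_lots(trades)

print("first three continuous trades:", np.round(trades[:3], 1))
print("first three rounded trades:   ", lots[:3])
print("total shares after rounding:  ", lots.sum())
```

Because the cumulative total is rounded rather than each trade, every rounded trade differs from its continuous counterpart by at most one lot, and the schedule still liquidates exactly $X$ shares when $X$ is a whole number of lots.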

Participation rate constraints. In practice, trades are constrained to a maximum participation rate (e.g., 10–20% of average daily volume) to avoid signalling. The Almgren-Chriss model has no such constraint; a constrained optimisation is needed.

No dark pools or alternative venues. The model treats the market as a single venue with homogeneous liquidity. A multi-venue model (primary exchange, dark pool, crossing network) with different impact characteristics per venue is more realistic.


Interview Angle

L1. What is the implementation shortfall? A portfolio manager decides to sell 1 million shares at a mid-price of $50.00. After execution, the average fill price is $49.90. What is the implementation shortfall in dollar terms and in basis points?

Implementation shortfall (IS): the difference between the "paper portfolio" P&L (at the decision price) and the actual execution P&L:

$$\mathrm{IS} = S_0 X - \mathrm{Proceeds} = 50.00 \times 10^6 - 49.90 \times 10^6 = \$100{,}000.$$

In basis points: $\mathrm{IS} / (S_0 X) = 0.10 / 50.00 = 20\ \mathrm{bps}$. Components: market impact (prices moved down as shares were sold) plus spread cost (selling below the mid). A 20 bps IS on a \$50M trade is \$100k of cost, which is material for a single order.
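The same arithmetic as a small helper, in case it is useful for checking other fills. The function name and signature are illustrative assumptions, and the sign convention is for a sell order (a fill below the decision price is a positive shortfall).

```python
def implementation_shortfall(decision_price, avg_fill_price, shares):
    """Return (IS in dollars, IS in basis points of decision value) for a sell order."""
    dollars = (decision_price - avg_fill_price) * shares
    bps = 1e4 * dollars / (decision_price * shares)
    return dollars, bps

is_dollars, is_bps = implementation_shortfall(50.00, 49.90, 1_000_000)
print(f"IS = ${is_dollars:,.0f} ({is_bps:.1f} bps)")
```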

L2. Write down the Almgren-Chriss objective function. Why is TWAP the optimal strategy when $\lambda = 0$? For $\lambda > 0$, derive the Euler-Lagrange equation and state the closed-form trajectory $x^*(t)$.

Objective: $\min_v \{\eta\int_0^T v^2\, dt + \frac{\lambda\sigma^2}{2}\int_0^T x^2\, dt\}$. At $\lambda = 0$ the objective reduces to $\eta\int_0^T v^2\, dt$ subject to $\int_0^T v\, dt = X$. By Cauchy-Schwarz, $X^2 = \bigl(\int_0^T v\, dt\bigr)^2 \leq T\int_0^T v^2\, dt$, with equality only for constant $v = X/T$. This is TWAP.

Euler-Lagrange for $\lambda > 0$: the integrand is $L = \eta \dot{x}^2 + \frac{\lambda\sigma^2}{2}x^2$ (using $v = -\dot{x}$, so $v^2 = \dot{x}^2$). E-L: $\frac{d}{dt}\frac{\partial L}{\partial \dot{x}} = \frac{\partial L}{\partial x}$ gives $2\eta\ddot{x} = \lambda\sigma^2 x$, i.e., $\ddot{x} = \kappa^2 x$ with $\kappa = \sqrt{\lambda\sigma^2/(2\eta)}$. Solution with $x(0) = X$, $x(T) = 0$: $x^*(t) = X\sinh(\kappa(T-t))/\sinh(\kappa T)$.

L3. Construct the efficient frontier for a sell programme: $X = 10^6$ shares, $T = 1$ day, $\sigma = 25\%$ annual, $\eta = 2.5 \times 10^{-7}$, $\gamma = 2.5 \times 10^{-8}$. What is the minimum-variance strategy and what is its expected shortfall? How would you modify the model to account for an alpha signal predicting a 5 bps decline in the stock over the first 30 minutes?

Minimum variance: achieved as $\lambda \to \infty$ (equivalently $\kappa \to \infty$): sell everything immediately. On a discrete grid this means selling all $X = 10^6$ shares in the first period at rate $v_1 = X/\tau$, so the temporary impact cost is $\eta v_1^2 \tau = \eta X^2/\tau$, which grows without bound as $\tau \to 0$. The efficient frontier makes this tradeoff explicit: immediate execution has near-zero variance but very high expected IS.

Alpha decay modification: Suppose the stock has expected drift $\mu(t) < 0$ (falling) for $t \in [0, t^*]$ and $\mu(t) = 0$ afterwards. The modified objective adds a drift term: up to the schedule-independent permanent impact constant, $\mathbb{E}[\mathrm{IS}] = \eta\int_0^T v^2\, dt - \int_0^T \mu(t) x(t)\, dt$. The Euler-Lagrange equation becomes $\ddot{x} = \kappa^2 x - \mu(t)/(2\eta)$, so a negative drift adds convexity to $x(t)$. The solution front-loads selling during the period of negative expected return: sell faster while the alpha predicts a price decline to minimise the expected mark-to-market loss on the remaining inventory. After $t^*$ (alpha exhausted), revert to the AC schedule.
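The drift-adjusted schedule has no single closed form across the kink at $t^*$, but the boundary value problem is easy to solve numerically. The sketch below discretises the Euler-Lagrange condition of the drift-adjusted objective, written here as $2\eta\ddot{x} = \lambda\sigma^2 x - \mu(t)$, with central differences. Everything numeric is an illustrative assumption: time in days, a \$50 stock, a 390-minute session, and a piecewise-constant drift that delivers the 5 bps decline over the first 30 minutes.

```python
import numpy as np

def solve_schedule(X, T, kappa, eta, mu, n_grid=400):
    """Finite-difference solve of x'' = kappa^2 x - mu(t)/(2 eta), x(0)=X, x(T)=0."""
    t = np.linspace(0.0, T, n_grid + 1)
    h = t[1] - t[0]
    m = n_grid - 1                                     # number of interior nodes
    # Tridiagonal operator for (x_{i-1} - 2 x_i + x_{i+1}) / h^2 - kappa^2 x_i
    A = (np.diag(np.full(m, -2.0 / h**2 - kappa**2))
         + np.diag(np.full(m - 1, 1.0 / h**2), 1)
         + np.diag(np.full(m - 1, 1.0 / h**2), -1))
    rhs = -mu(t[1:-1]) / (2.0 * eta)
    rhs[0] -= X / h**2                                 # boundary condition x(0) = X
    x = np.empty(n_grid + 1)
    x[0], x[-1] = X, 0.0                               # x(T) = 0 adds nothing to rhs
    x[1:-1] = np.linalg.solve(A, rhs)
    return t, x

X, T, eta, sigma, lam = 1.0e6, 1.0, 2.5e-6, 0.95, 1.0e-6   # hypothetical, in days
kappa = np.sqrt(lam * sigma**2 / (2.0 * eta))
t_alpha = 30.0 / 390.0                                 # 30 minutes of a 390-minute day
mu0 = -(5e-4 * 50.0) / t_alpha                         # 5 bps of $50, per day

t, x_base = solve_schedule(X, T, kappa, eta, lambda s: np.zeros_like(s))
_, x_alpha = solve_schedule(X, T, kappa, eta,
                            lambda s: np.where(s <= t_alpha, mu0, 0.0))
extra_sold = x_base - x_alpha                          # additional shares already sold
print(f"largest inventory reduction from the signal: {extra_sold.max():,.0f} shares")
```

With zero drift the solver reproduces the closed-form trajectory, and with the negative drift the inventory is lower at every interior time, which is the front-loading described above. The size of the shift depends entirely on the assumed $\eta$ and signal strength.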