
Abstract Classes and the Clone Pattern

Module 4 of 8 · 22 min read · Level: Medium

Setup

The Polymorphic Ownership Problem

The previous module showed how virtual dispatch allows a Model* to point to either a BlackScholesModel or a DupireLocalVolatilityModel at runtime. But as soon as you introduce polymorphism, you face a question that the language does not answer for you:

How do you copy a polymorphic object?

Consider PathSimulator, which owns a Model*. When you copy a PathSimulator, you want a deep copy of the Model it points to — not a pointer alias (which would cause a double-free) and not a copy of the base class slice (which would lose the derived-class state). The copy constructor cannot call new Model(*_model) because Model is abstract and because even if it were concrete, that would slice the object.

The solution is the clone pattern: each class in the hierarchy implements a virtual clone() method that returns a heap-allocated copy of itself, preserving the fully derived type. This is sometimes called the virtual copy constructor idiom.

This module builds the pattern from scratch using the Distribution hierarchy: an abstract base representing a probability distribution, with concrete NormalDistribution and LogNormalDistribution derived classes. The same pattern appears verbatim in the production MC pricer (Model::clone(), PathSimulator::clone(), Payoff::clone()).

Conventions used throughout:

  • All class definitions use the interface/implementation split (.h / .cpp).
  • The base class destructor is virtual. Without it, deleting a derived object through a base pointer is undefined behaviour.
  • Raw owning pointers are used here to expose the mechanics. In production code, wrap in std::unique_ptr.

Theory

1. Abstract Classes and Pure Virtual Methods

A class is abstract if it declares at least one pure virtual method:

virtual double pdf(double x) const = 0;  // pure virtual

The = 0 syntax tells the compiler: this class provides no implementation for pdf; any concrete derived class must provide one. An abstract class cannot be instantiated directly — attempting to write Distribution d("normal") is a compile error.

Pure virtual methods define an interface contract. Every class that inherits from Distribution and wishes to be instantiable must override pdf. If a derived class fails to override a pure virtual method, it too becomes abstract.
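The contract can be checked at compile time. The following is a minimal sketch, assuming a hypothetical mini-hierarchy (Dist, HalfDone, Uniform01) that is not part of this module's code; it uses std::is_abstract from <type_traits> to show which classes remain abstract.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical mini-hierarchy, for illustration only.
struct Dist {
    virtual ~Dist() = default;
    virtual double pdf(double x) const = 0;   // pure virtual: Dist is abstract
};

// Does not override pdf, so it inherits the pure virtual and stays abstract.
struct HalfDone : Dist {};

// Overrides pdf, so it is concrete and can be instantiated.
struct Uniform01 : Dist {
    double pdf(double x) const override {
        return (x >= 0.0 && x <= 1.0) ? 1.0 : 0.0;
    }
};

static_assert(std::is_abstract<Dist>::value, "Dist is abstract");
static_assert(std::is_abstract<HalfDone>::value, "HalfDone is still abstract");
static_assert(!std::is_abstract<Uniform01>::value, "Uniform01 is concrete");

// Dist d;       // would not compile: cannot instantiate an abstract class
// HalfDone h;   // would not compile for the same reason

// Evaluates a concrete class through a reference to the abstract base.
inline double pdf_via_base(const Dist& d, double x) { return d.pdf(x); }

inline void demo_abstract() {
    Uniform01 u;
    assert(pdf_via_base(u, 0.5) == 1.0);
    assert(pdf_via_base(u, 2.0) == 0.0);
}
```

Calling demo_abstract() from main exercises the concrete class through the abstract interface.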

The language does not require the override keyword, but this course treats it as mandatory in derived classes:

double pdf(double x) const override;  // confirms we are overriding a base virtual

The compiler catches a silent bug: if you misspell the method name or change the signature, override produces a compile error instead of silently creating a new unrelated method.
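To make the silent bug concrete, here is a small sketch with hypothetical classes (Base, Typo, Fixed). The base pdf is deliberately non-pure so that the mistaken class still compiles and its effect can be observed at runtime.

```cpp
#include <cassert>

// Hypothetical classes, for illustration only.
struct Base {
    virtual ~Base() = default;
    virtual double pdf(double) const { return -1.0; }   // sentinel value
};

// Mistake: the trailing const is missing, so this declares a NEW function
// that hides Base::pdf instead of overriding it. It compiles without error.
struct Typo : Base {
    double pdf(double) { return 1.0; }
    // double pdf(double) override { return 1.0; }   // error: overrides nothing
};

// Correct: the signature matches, and override makes the compiler verify it.
struct Fixed : Base {
    double pdf(double) const override { return 1.0; }
};

// Calls pdf through the base interface, as polymorphic client code would.
inline double call_through_base(const Base& b) { return b.pdf(0.0); }

inline void demo_override() {
    Typo t;
    Fixed f;
    assert(call_through_base(t) == -1.0);   // base version ran: the silent bug
    assert(call_through_base(f) == 1.0);    // derived version ran
}
```

Uncommenting the override line in Typo turns the runtime surprise into a compile error, which is the point of the keyword.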

2. The Object Slicing Problem

Slicing occurs when a derived object is copied into a base object, discarding the derived parts:

NormalDistribution nd("N", 0.0, 1.0);
Distribution d = nd;   // SLICED: d is a Distribution, _mean and _variance are gone

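Slicing is silent and catastrophic in a numerical context. Distribution holds only _name, while NormalDistribution adds _mean and _variance, so the sliced copy drops both parameters. If Distribution were concrete, any subsequent pdf call on the copy would run the base implementation and compute rubbish. With pdf pure virtual, as in this module, the declaration above is rejected at compile time, but assignment through base references (*p = *q for two Distribution* pointers) still compiles and still slices.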

In a pricing context, slicing turns a calibrated stochastic vol model into an empty shell. The rule:

Polymorphic base classes should be copied only through the clone pattern — never by value.

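Enforce this by making the copy operations of the base class protected: derived classes can still call them to implement clone(), while client code can no longer copy or assign through the base type. Deleting them outright would also disable the derived copy constructors that clone() relies on.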
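A short sketch of both halves of the rule, using hypothetical classes (Model, LocalVol, SafeModel, SafeLocalVol). The first pair has a concrete base so that slicing compiles and can be observed; the second pair restricts copying by making the base copy operations protected, which is one common way to do it and an assumption of this example.

```cpp
#include <cassert>
#include <string>

// Hypothetical concrete base, so that the slicing copy compiles.
struct Model {
    virtual ~Model() = default;
    virtual std::string kind() const { return "base"; }
};

struct LocalVol : Model {
    double vol = 0.2;                         // derived-only state
    std::string kind() const override { return "local-vol"; }
};

// Copying restricted: only derived classes can reach the copy operations.
class SafeModel {
public:
    SafeModel() = default;
    virtual ~SafeModel() = default;
    virtual std::string kind() const = 0;
protected:
    SafeModel(const SafeModel&) = default;
    SafeModel& operator=(const SafeModel&) = default;
};

struct SafeLocalVol : SafeModel {
    std::string kind() const override { return "local-vol"; }
};

inline void demo_slicing() {
    LocalVol lv;
    Model sliced = lv;                        // compiles, keeps only the Model part
    assert(sliced.kind() == "base");          // derived behaviour and vol are gone
    assert(lv.kind() == "local-vol");

    SafeLocalVol a, b;
    SafeModel& ra = a;
    const SafeModel& rb = b;
    // ra = rb;                               // would not compile: operator= is protected
    assert(ra.kind() == rb.kind());
}
```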

3. The Clone Pattern

The solution is to delegate copying to a virtual method:

class Distribution {
public:
    virtual Distribution* clone() const = 0;
    // ...
};

class NormalDistribution : public Distribution {
public:
    NormalDistribution* clone() const override {
        return new NormalDistribution(*this);  // calls NormalDistribution copy ctor
    }
};

Three properties make this work:

  1. Covariant return type: NormalDistribution* is a valid override of Distribution* because NormalDistribution is derived from Distribution. The caller holding a Distribution* gets a Distribution*; the caller holding a NormalDistribution* gets a NormalDistribution*. No cast required in either case.

  2. Full derived type preserved: new NormalDistribution(*this) calls the NormalDistribution copy constructor, which copies all members including _mean and _variance. No slicing.

  3. Heap allocation: clone() always returns a heap pointer. The caller owns the returned object and is responsible for its deallocation (or wraps it in std::unique_ptr<Distribution>).
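The three properties can be checked in a few lines. This is a sketch with a hypothetical two-class hierarchy (Dist, Normal) that mirrors the snippet above; the returned raw pointers are wrapped in std::unique_ptr immediately so that nothing leaks.

```cpp
#include <cassert>
#include <memory>

// Hypothetical hierarchy, for illustration only.
struct Dist {
    virtual ~Dist() = default;
    virtual Dist* clone() const = 0;
};

struct Normal : Dist {
    double mean = 0.0;
    double variance = 1.0;
    Normal* clone() const override { return new Normal(*this); }   // covariant
};

inline void demo_clone() {
    Normal n;
    n.mean = 0.05;

    // Through the base interface: static type Dist*, dynamic type Normal.
    const Dist& base = n;
    std::unique_ptr<Dist> c1(base.clone());
    assert(dynamic_cast<Normal*>(c1.get()) != nullptr);   // derived type preserved

    // Through the concrete type: the covariant return yields Normal*, no cast.
    std::unique_ptr<Normal> c2(n.clone());
    assert(c2->mean == 0.05);                              // derived state copied
    assert(c2.get() != &n);                                // distinct heap object
}
```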

4. The Rule of Five for Polymorphic Classes

When a class manages a resource (here: owns a raw pointer to a heap-allocated Distribution), you must define or suppress all five special member functions consistently.

| Member | Purpose | Action for owning container |
|---|---|---|
| Destructor | Release owned resource | delete _distribution; |
| Copy constructor | Deep-copy owned resource | _distribution = other._distribution->clone(); |
| Copy assignment | Deep-copy, release old | delete _distribution; _distribution = other._distribution->clone(); |
| Move constructor | Transfer ownership | _distribution = other._distribution; other._distribution = nullptr; |
| Move assignment | Transfer, release old | delete _distribution; _distribution = other._distribution; other._distribution = nullptr; |

If you define a destructor that deletes the pointer but omit the copy constructor, the compiler generates a memberwise copy — both the original and the copy share the same raw pointer. When the first destructor fires, the pointer is freed. When the second destructor fires, the pointer is freed again: double-free, undefined behaviour.

The copy constructor of PathSimulator uses clone precisely to avoid this:

PathSimulator::PathSimulator(const PathSimulator& ps)
    : _time_points(ps._time_points),
      _model(ps._model->clone()),         // deep copy, not pointer copy
      _initial_value(ps._initial_value)
{}
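Putting the five members together for a single owner might look like the sketch below. Sampler is a hypothetical stand-in for an owning class such as PathSimulator, and Dist and Normal stand in for the Distribution hierarchy. One deliberate difference from a literal delete-then-clone sequence: copy assignment clones before it deletes, so a throwing clone() leaves the target unchanged.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-ins, for illustration only.
struct Dist {
    virtual ~Dist() = default;
    virtual Dist* clone() const = 0;
    virtual double mean() const = 0;
};

struct Normal : Dist {
    explicit Normal(double m) : _mean(m) {}
    Normal* clone() const override { return new Normal(*this); }
    double mean() const override { return _mean; }
private:
    double _mean;
};

class Sampler {
public:
    explicit Sampler(const Dist& d) : _distribution(d.clone()) {}
    ~Sampler() { delete _distribution; }

    Sampler(const Sampler& other)
        : _distribution(other._distribution ? other._distribution->clone() : nullptr) {}

    Sampler& operator=(const Sampler& other) {
        if (this != &other) {
            // Clone first: if clone() throws, *this is left untouched.
            Dist* fresh = other._distribution ? other._distribution->clone() : nullptr;
            delete _distribution;
            _distribution = fresh;
        }
        return *this;
    }

    Sampler(Sampler&& other) noexcept : _distribution(other._distribution) {
        other._distribution = nullptr;        // moved-from object owns nothing
    }

    Sampler& operator=(Sampler&& other) noexcept {
        if (this != &other) {
            delete _distribution;
            _distribution = other._distribution;
            other._distribution = nullptr;
        }
        return *this;
    }

    const Dist* get() const { return _distribution; }

private:
    Dist* _distribution;
};

inline void demo_rule_of_five() {
    Sampler a{Normal(1.0)};
    Sampler b(a);                             // deep copy
    assert(b.get() != a.get() && b.get()->mean() == 1.0);

    const Dist* raw = a.get();
    Sampler c(std::move(a));                  // ownership transfer, no allocation
    assert(c.get() == raw && a.get() == nullptr);

    b = c;                                    // copy assignment
    Sampler& alias = b;
    b = alias;                                // self-assignment is a no-op
    assert(b.get() != c.get() && b.get()->mean() == 1.0);
}
```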

5. Self-Assignment Guard

Copy assignment must handle the a = a case correctly:

Distribution& Distribution::operator=(const Distribution& other) {
    if (this != &other) {
        _name = other._name;
    }
    return *this;
}

Without the guard, a class that deletes its old resource and then tries to copy from other would be reading freed memory in the pathological case where this == &other. The guard costs one pointer comparison; it is always worth writing.

For move assignment, the guard matters just as much. A moved-from object is left in a valid but unspecified state, and without the guard a self-move such as x = std::move(x) on an owning class deletes the resource and leaves x holding a null pointer.


Implementation

The full Distribution hierarchy follows the pattern exactly. The header separates interface from implementation; the .cpp provides all method bodies.

Distribution.h

#pragma once
#include <string>

// Abstract base: represents any continuous probability distribution.
// Cannot be instantiated directly (pure virtual pdf).
class Distribution {
public:
    explicit Distribution(const std::string& name);
    Distribution(const Distribution& other);
    Distribution& operator=(const Distribution& other);
    virtual ~Distribution() = default;  // virtual: essential for correct delete through base ptr

    // Pure virtual interface — derived classes must implement.
    virtual double pdf(double x) const = 0;

    // Clone pattern: returns a heap-allocated copy of the fully derived object.
    virtual Distribution* clone() const = 0;

    // Non-virtual: implemented here using pdf() via numerical integration.
    double cdf(double y) const;

protected:
    std::string _name;
};

class NormalDistribution : public Distribution {
public:
    NormalDistribution(const std::string& name, double mean, double variance);
    NormalDistribution(const NormalDistribution& other);
    NormalDistribution& operator=(const NormalDistribution& other);
    ~NormalDistribution() = default;

    double pdf(double x) const override;
    NormalDistribution* clone() const override;  // covariant return type

private:
    double _mean;
    double _variance;
};

class LogNormalDistribution : public Distribution {
public:
    LogNormalDistribution(const std::string& name, double mean, double variance);
    LogNormalDistribution(const LogNormalDistribution& other);
    LogNormalDistribution& operator=(const LogNormalDistribution& other);
    ~LogNormalDistribution() = default;

    double pdf(double x) const override;
    LogNormalDistribution* clone() const override;

private:
    double _mean;
    double _variance;
};

Distribution.cpp

#include "Distribution.h"
#include <cmath>
#include <stdexcept>

// ── Distribution (base) ──────────────────────────────────────────────────────

Distribution::Distribution(const std::string& name)
    : _name(name) {}

Distribution::Distribution(const Distribution& other)
    : _name(other._name) {}

Distribution& Distribution::operator=(const Distribution& other) {
    if (this != &other)
        _name = other._name;
    return *this;
}

double Distribution::cdf(double y) const {
    // Simple midpoint-rule numerical integration of pdf from -10 to y.
    // Mass below -10 is ignored. Not high-precision; use analytic forms in production.
    const int n = 10000;
    const double lo = -10.0;
    if (y <= lo)
        return 0.0;  // empty integration range; avoids a negative step h
    const double h = (y - lo) / n;
    double sum = 0.0;
    for (int i = 0; i < n; ++i)
        sum += pdf(lo + (i + 0.5) * h);
    return sum * h;
}

// ── NormalDistribution ───────────────────────────────────────────────────────

NormalDistribution::NormalDistribution(const std::string& name, double mean, double variance)
    : Distribution(name), _mean(mean), _variance(variance) {
    if (variance <= 0.0)
        throw std::invalid_argument("NormalDistribution: variance must be positive");
}

NormalDistribution::NormalDistribution(const NormalDistribution& other)
    : Distribution(other), _mean(other._mean), _variance(other._variance) {}

NormalDistribution& NormalDistribution::operator=(const NormalDistribution& other) {
    if (this != &other) {
        Distribution::operator=(other);  // copy base portion
        _mean     = other._mean;
        _variance = other._variance;
    }
    return *this;
}

// N(x; μ, σ²) = (1 / √(2πσ²)) · exp(-(x-μ)² / (2σ²))
double NormalDistribution::pdf(double x) const {
    const double pi  = 3.14159265358979323846;
    double sigma     = std::sqrt(_variance);
    double z         = (x - _mean) / sigma;
    return std::exp(-0.5 * z * z) / (sigma * std::sqrt(2.0 * pi));
}

NormalDistribution* NormalDistribution::clone() const {
    return new NormalDistribution(*this);
}

// ── LogNormalDistribution ────────────────────────────────────────────────────

LogNormalDistribution::LogNormalDistribution(const std::string& name, double mean, double variance)
    : Distribution(name), _mean(mean), _variance(variance) {
    if (variance <= 0.0)
        throw std::invalid_argument("LogNormalDistribution: variance must be positive");
}

LogNormalDistribution::LogNormalDistribution(const LogNormalDistribution& other)
    : Distribution(other), _mean(other._mean), _variance(other._variance) {}

LogNormalDistribution& LogNormalDistribution::operator=(const LogNormalDistribution& other) {
    if (this != &other) {
        Distribution::operator=(other);
        _mean     = other._mean;
        _variance = other._variance;
    }
    return *this;
}

// LN(x; μ, σ²) = (1 / (xσ√(2π))) · exp(-(ln x - μ)² / (2σ²)),  x > 0
double LogNormalDistribution::pdf(double x) const {
    if (x <= 0.0) return 0.0;
    const double pi  = 3.14159265358979323846;
    double sigma     = std::sqrt(_variance);
    double z         = (std::log(x) - _mean) / sigma;
    return std::exp(-0.5 * z * z) / (x * sigma * std::sqrt(2.0 * pi));
}

LogNormalDistribution* LogNormalDistribution::clone() const {
    return new LogNormalDistribution(*this);
}

main.cpp — Demonstrating Polymorphic Ownership

#include "Distribution.h"
#include <iostream>
#include <memory>
#include <vector>

int main() {
    // Build a heterogeneous collection of distributions through base pointers.
    std::vector<Distribution*> pool;
    pool.push_back(new NormalDistribution("std-normal", 0.0, 1.0));
    pool.push_back(new LogNormalDistribution("stock-return", 0.0, 0.04));

    // Polymorphic dispatch: each call resolves to the correct pdf at runtime.
    for (const auto* d : pool)
        std::cout << "pdf(1.0) = " << d->pdf(1.0) << "\n";

    // Clone: deep-copy the entire collection without knowing concrete types.
    std::vector<Distribution*> clones;
    for (const auto* d : pool)
        clones.push_back(d->clone());

    // Sanity check: each clone evaluates to the same pdf value as its original.
    // (Independence is not exercised here because the classes expose no setters.)
    for (size_t i = 0; i < pool.size(); ++i)
        std::cout << "original == clone: "
                  << (pool[i]->pdf(1.0) == clones[i]->pdf(1.0) ? "yes" : "no") << "\n";

    for (auto* d : pool)   delete d;
    for (auto* d : clones) delete d;

    // Modern alternative: std::unique_ptr eliminates manual delete.
    std::vector<std::unique_ptr<Distribution>> owned;
    owned.push_back(std::make_unique<NormalDistribution>("n01", 0.0, 1.0));
    // Clone into unique_ptr: wrap the raw pointer returned by clone().
    owned.push_back(std::unique_ptr<Distribution>(owned[0]->clone()));

    return 0;
}

Validation

The following invariants are verifiable analytically:

| Distribution | $x$ | Expected $\text{pdf}(x)$ | Derivation |
|---|---|---|---|
| $\mathcal{N}(0,1)$ | 0 | $1/\sqrt{2\pi} \approx 0.39894$ | Gaussian peak |
| $\mathcal{N}(0,1)$ | 1 | $e^{-1/2}/\sqrt{2\pi} \approx 0.24197$ | One standard deviation |
| $\text{LN}(0,1)$ | 1 | $1/\sqrt{2\pi} \approx 0.39894$ | $\ln(1)=0$, so the exponential factor is 1 ($x=1$ is the median of the log-normal, not its mode) |
| $\mathcal{N}(0,1)$ | all $x$ | $\int_{-\infty}^{\infty} \text{pdf}(x)\,dx = 1$ | Normalisation |

The CDF of $\mathcal{N}(0,1)$ at $x=0$ should equal 0.5 by symmetry; use this to verify the numerical integration in cdf().
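These checks can be automated. The sketch below restates the pdf formulas as free functions so that it is self-contained; the function names and the tolerances are choices made for this example, not part of the module's code.

```cpp
#include <cassert>
#include <cmath>

// Free-function versions of the module's pdf formulas (names are illustrative).
inline double normal_pdf(double x, double mean, double variance) {
    const double pi = 3.14159265358979323846;
    const double sigma = std::sqrt(variance);
    const double z = (x - mean) / sigma;
    return std::exp(-0.5 * z * z) / (sigma * std::sqrt(2.0 * pi));
}

inline double lognormal_pdf(double x, double mean, double variance) {
    if (x <= 0.0) return 0.0;
    const double pi = 3.14159265358979323846;
    const double sigma = std::sqrt(variance);
    const double z = (std::log(x) - mean) / sigma;
    return std::exp(-0.5 * z * z) / (x * sigma * std::sqrt(2.0 * pi));
}

// Midpoint rule on [-10, y] with 10000 steps, as in Distribution::cdf.
inline double standard_normal_cdf_midpoint(double y) {
    const int n = 10000;
    const double lo = -10.0;
    const double h = (y - lo) / n;
    double sum = 0.0;
    for (int i = 0; i < n; ++i)
        sum += normal_pdf(lo + (i + 0.5) * h, 0.0, 1.0);
    return sum * h;
}

inline bool close_to(double a, double b, double tol) { return std::fabs(a - b) < tol; }

inline void validate() {
    assert(close_to(normal_pdf(0.0, 0.0, 1.0), 0.39894, 1e-5));      // Gaussian peak
    assert(close_to(normal_pdf(1.0, 0.0, 1.0), 0.24197, 1e-5));      // one std deviation
    assert(close_to(lognormal_pdf(1.0, 0.0, 1.0), 0.39894, 1e-5));   // ln(1) = 0
    assert(close_to(standard_normal_cdf_midpoint(0.0), 0.5, 1e-6));  // symmetry
}
```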

The clone invariant: for any Distribution* d, the following must hold:

std::unique_ptr<Distribution> c(d->clone());
assert(c->pdf(1.0) == d->pdf(1.0));
assert(c.get() != d);          // distinct object on the heap

Limitations

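Raw pointer ownership is error-prone. The implementation above uses raw pointers to expose mechanics clearly. In production code, clone() should return std::unique_ptr<Distribution>. Covariant return types apply only to raw pointers and references, not to smart pointers, so every override then declares the same return type, std::unique_ptr<Distribution>, and relies on the implicit conversion from std::unique_ptr<NormalDistribution>: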

std::unique_ptr<Distribution> clone() const override {
    return std::make_unique<NormalDistribution>(*this);
}
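A fuller sketch of the smart-pointer variant follows, using a hypothetical hierarchy (Dist, Normal) in place of the module's classes. The base and the override declare the same return type, and the caller never writes delete.

```cpp
#include <cassert>
#include <memory>

// Hypothetical hierarchy, for illustration only.
struct Dist {
    virtual ~Dist() = default;
    virtual std::unique_ptr<Dist> clone() const = 0;
    virtual double variance() const = 0;
};

struct Normal : Dist {
    explicit Normal(double variance) : _variance(variance) {}

    // Same return type as the base. The unique_ptr<Normal> produced by
    // make_unique converts implicitly to unique_ptr<Dist> on return.
    std::unique_ptr<Dist> clone() const override {
        return std::make_unique<Normal>(*this);
    }

    double variance() const override { return _variance; }

private:
    double _variance;
};

inline void demo_unique_clone() {
    std::unique_ptr<Dist> original = std::make_unique<Normal>(0.04);
    std::unique_ptr<Dist> copy = original->clone();
    assert(copy.get() != original.get());                    // distinct object
    assert(copy->variance() == 0.04);                        // state copied
    assert(dynamic_cast<Normal*>(copy.get()) != nullptr);    // derived type preserved
}   // both objects are destroyed here automatically
```

The trade-off is that a caller holding a Normal receives a std::unique_ptr<Dist> and needs a cast to get the concrete type back.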

The CDF is approximate. The cdf() method uses a fixed-step midpoint rule over $[-10, y]$. This is adequate for illustration but is only $O(h^2)$ accurate, with $h = (y-(-10))/10000$. For $\mathcal{N}(0,1)$, the analytic CDF via the error function is exact: $\Phi(x) = \frac{1}{2}\left[1 + \text{erf}(x/\sqrt{2})\right]$.
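The closed form is available through std::erf in <cmath>. The sketch below compares it with a free-function copy of the midpoint rule; the function names and the 1e-6 tolerance are assumptions of this example.

```cpp
#include <cassert>
#include <cmath>

inline double standard_normal_pdf(double x) {
    const double pi = 3.14159265358979323846;
    return std::exp(-0.5 * x * x) / std::sqrt(2.0 * pi);
}

// Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))
inline double standard_normal_cdf_erf(double x) {
    return 0.5 * (1.0 + std::erf(x / std::sqrt(2.0)));
}

// Midpoint rule on [-10, y] with 10000 steps, mirroring Distribution::cdf.
inline double standard_normal_cdf_midpoint(double y) {
    const int n = 10000;
    const double lo = -10.0;
    const double h = (y - lo) / n;
    double sum = 0.0;
    for (int i = 0; i < n; ++i)
        sum += standard_normal_pdf(lo + (i + 0.5) * h);
    return sum * h;
}

// Absolute difference between the numerical and the analytic CDF at y.
inline double cdf_gap(double y) {
    return std::fabs(standard_normal_cdf_midpoint(y) - standard_normal_cdf_erf(y));
}

inline void demo_cdf() {
    assert(standard_normal_cdf_erf(0.0) == 0.5);                         // erf(0) = 0
    assert(std::fabs(standard_normal_cdf_erf(1.0) - 0.841345) < 1e-6);
    assert(cdf_gap(1.0) < 1e-6);
}
```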

No thread safety. Concurrent calls to const members such as pdf() and clone() are safe only while no other thread modifies the object. operator= is a write and is not protected by a mutex; mixing it with concurrent reads requires external synchronisation or immutable objects.

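Virtual destructor overhead. Every delete through a base pointer makes an indirect call through the vtable, and every clone() performs a heap allocation. This is negligible for distribution objects but meaningful if you are cloning millions of objects per second in a Monte Carlo loop. At that scale, consider value types built on std::variant, which avoid the allocation when the set of types is known at compile time; std::function also gives value semantics but may still allocate.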
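As a rough illustration of the value-type alternative, the sketch below stores distribution parameters in a std::variant (C++17). The struct names, the AnyDistribution alias, and evaluate_pdf are hypothetical; the approach only works when the full set of types is known at compile time.

```cpp
#include <cassert>
#include <cmath>
#include <variant>
#include <vector>

// Hypothetical value types: plain structs, no base class, no virtual functions.
struct NormalParams    { double mean; double variance; };
struct LogNormalParams { double mean; double variance; };

using AnyDistribution = std::variant<NormalParams, LogNormalParams>;

inline double pdf(const NormalParams& p, double x) {
    const double pi = 3.14159265358979323846;
    const double sigma = std::sqrt(p.variance);
    const double z = (x - p.mean) / sigma;
    return std::exp(-0.5 * z * z) / (sigma * std::sqrt(2.0 * pi));
}

inline double pdf(const LogNormalParams& p, double x) {
    if (x <= 0.0) return 0.0;
    const double pi = 3.14159265358979323846;
    const double sigma = std::sqrt(p.variance);
    const double z = (std::log(x) - p.mean) / sigma;
    return std::exp(-0.5 * z * z) / (x * sigma * std::sqrt(2.0 * pi));
}

// std::visit selects the overload matching the alternative currently held.
inline double evaluate_pdf(const AnyDistribution& d, double x) {
    return std::visit([x](const auto& p) { return pdf(p, x); }, d);
}

inline void demo_variant() {
    std::vector<AnyDistribution> pool{NormalParams{0.0, 1.0},
                                      LogNormalParams{0.0, 0.04}};
    std::vector<AnyDistribution> copies = pool;   // plain value copy, no clone()
    assert(std::fabs(evaluate_pdf(copies[0], 0.0) - 0.39894) < 1e-5);
    assert(evaluate_pdf(copies[1], -1.0) == 0.0);
    assert(std::holds_alternative<LogNormalParams>(copies[1]));
}
```

Copying the vector copies the parameters directly, with no per-element heap allocation and no slicing risk.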


Interview Angle

Junior (L1): Explain what a pure virtual function is and why you cannot instantiate an abstract class. Describe what happens if you delete a derived object through a base pointer without a virtual destructor.

The most common wrong answer: "it just deletes the base part". The correct answer: it is undefined behaviour. In practice the call is bound statically to the base destructor rather than dispatched through the vtable, so the derived destructor typically does not run and resources managed by the derived class leak; the standard, however, guarantees nothing about the outcome.
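The well-defined case is easy to demonstrate. In this sketch (hypothetical Base and Derived classes, with a counter added purely for observation) the destructor is virtual, so delete through the base pointer runs both destructors. The non-virtual case is undefined behaviour and is intentionally not executed.

```cpp
#include <cassert>

// Counter recording which destructors ran (illustration only).
inline int& destroyed() {
    static int count = 0;
    return count;
}

struct Base {
    virtual ~Base() { destroyed() += 1; }         // virtual: delete dispatches dynamically
};

struct Derived : Base {
    ~Derived() override { destroyed() += 10; }
};

inline void demo_virtual_destructor() {
    destroyed() = 0;
    Base* p = new Derived;
    delete p;                                     // runs ~Derived, then ~Base
    assert(destroyed() == 11);
}
```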

Senior (L2): Why does clone() return a raw pointer rather than a reference or a value? Why is the return type covariant?

References cannot be null and do not transfer ownership — returning by reference would imply the caller borrows an existing object, not receives a newly allocated one. Returning by value would require slicing or a value-semantic hierarchy. Covariance (NormalDistribution* overriding Distribution*) allows callers holding a concrete type pointer to receive a concrete pointer without a cast.

Researcher (L3): Design the Distribution hierarchy to be fully value-semantic (no raw pointers in the public API) while still supporting runtime polymorphism across an arbitrary number of distribution types not known at compile time.

The solution is type erasure via a custom wrapper: a Distribution value type holds a std::unique_ptr<DistributionConcept> and forwards calls through the pointer (std::any also erases the stored type, but exposes no interface through which to forward pdf()). The wrapper's copy constructor calls clone() internally. This is the technique used in std::function and the basis of the concept-model idiom (Lakos 2021, Niebler 2020).
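One possible shape for such a wrapper is sketched below. Everything here is illustrative: the nested Concept and Model names follow common usage for this idiom, and Normal and Uniform01 are hypothetical payload types that only need a pdf(double) const member.

```cpp
#include <cassert>
#include <cmath>
#include <memory>
#include <utility>
#include <vector>

// Value type with runtime polymorphism hidden inside.
class Distribution {
public:
    // Accepts any copyable T that provides: double pdf(double) const.
    template <typename T>
    Distribution(T impl) : _self(std::make_unique<Model<T>>(std::move(impl))) {}

    Distribution(const Distribution& other) : _self(other._self->clone()) {}
    Distribution(Distribution&&) noexcept = default;
    Distribution& operator=(Distribution other) noexcept {    // copy-and-swap
        _self.swap(other._self);
        return *this;
    }

    // Precondition: *this has not been moved from.
    double pdf(double x) const { return _self->pdf(x); }

private:
    struct Concept {
        virtual ~Concept() = default;
        virtual double pdf(double x) const = 0;
        virtual std::unique_ptr<Concept> clone() const = 0;
    };

    template <typename T>
    struct Model final : Concept {
        explicit Model(T v) : value(std::move(v)) {}
        double pdf(double x) const override { return value.pdf(x); }
        std::unique_ptr<Concept> clone() const override {
            return std::make_unique<Model>(*this);
        }
        T value;
    };

    std::unique_ptr<Concept> _self;
};

// Hypothetical payload types: no common base class required.
struct Normal {
    double mean;
    double variance;
    double pdf(double x) const {
        const double pi = 3.14159265358979323846;
        const double sigma = std::sqrt(variance);
        const double z = (x - mean) / sigma;
        return std::exp(-0.5 * z * z) / (sigma * std::sqrt(2.0 * pi));
    }
};

struct Uniform01 {
    double pdf(double x) const { return (x >= 0.0 && x <= 1.0) ? 1.0 : 0.0; }
};

inline void demo_type_erasure() {
    std::vector<Distribution> pool;
    pool.push_back(Normal{0.0, 1.0});
    pool.push_back(Uniform01{});

    std::vector<Distribution> copies = pool;     // deep copies via the internal clone()
    assert(std::fabs(copies[0].pdf(0.0) - 0.39894) < 1e-5);
    assert(copies[1].pdf(0.5) == 1.0);
}
```

New distribution types can be added without touching Distribution, since the wrapper only requires the pdf member.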