cppXplorers: Explorations in C++ for applied sciences

cppXplorers is a collection of experiments and examples in modern C++, with a focus on applied sciences and engineering. The project is organized as a mono-repository, where each crate is independent and self-contained. All the sources can be found in the crates folder of the repository.

The purpose of this repository is to provide a space for exploring numerical methods, algorithms, and patterns in C++ in a practical and modular way. It is not intended as a tutorial or comprehensive learning resource, but rather as a set of working examples and references.

Contributions are welcome. If you spot an issue, find an area that could be improved, or want to add your own example, feel free to open an issue or submit a pull request.

Organisation and layout

This project has the following layout:

├── book
│   └── src
└── crates
    ├── kf_linear
    │   ├── include
    │   ├── src
    │   └── tests
    ├── another_example
    │   ├── include
    │   ├── src
    │   └── tests
    ├── ...
    │   ├── include
    │   ├── src
    │   └── tests
    └── simple_optimizers
        ├── include
        ├── src
        └── tests
  • The crates folder contains all the examples. (Apologies for borrowing the term "crate", I've been doing some Rust lately: rustineers.)
  • Each example has its own headers, sources, and tests.
  • The book folder simply contains the sources for generating this book with mdBook.
  • Each chapter in the book will explain what's implemented in each example.

Model formulation

We consider the following general state space model, made of a state equation and an observation equation:

$$x_k = A\,x_{k-1} + w_{k-1}, \qquad w_{k-1} \sim \mathcal{N}(0, Q)$$
$$z_k = H\,x_k + v_k, \qquad v_k \sim \mathcal{N}(0, R)$$

where:

  • $x_k$ is the state vector at time step $k$,
  • $z_k$ is the observation/measurement vector.

We also have the following hyperparameters, assumed to be known:

  • $A$ is the state transition matrix,
  • $H$ is the observation matrix,
  • $Q$ is the process noise covariance,
  • $R$ is the observation noise covariance.

The aim is to estimate the state $x_k$ given the noisy observations $z_{1:k}$.

Kalman filtering

We start with an initial state $x_0 \sim \mathcal{N}(\hat{x}_{0}, P_{0})$ and we want to determine the conditional distribution of the next state given the first observation. Using Bayes' rule, one has

$$p(x_1 \mid z_1) \propto p(z_1 \mid x_1)\, p(x_1).$$

The likelihood $p(z_1 \mid x_1)$ can easily be determined thanks to the chosen observation equation. The marginal distribution $p(x_1)$ can be obtained by marginalizing out $x_0$. This marginal distribution can be seen as our prior knowledge before seeing $z_1$. It can easily be shown that $p(x_1) = \mathcal{N}(\hat{x}_{1|0}, P_{1|0})$ with

$$\hat{x}_{1|0} = A\,\hat{x}_{0}, \qquad P_{1|0} = A P_{0} A^\top + Q.$$

This first estimate is what we will call the prediction step. Using this result, we can determine $p(x_1 \mid z_1) = \mathcal{N}(\hat{x}_{1|1}, P_{1|1})$:

$$\hat{x}_{1|1} = \hat{x}_{1|0} + K_1\,(z_1 - H \hat{x}_{1|0}), \qquad P_{1|1} = (I - K_1 H)\,P_{1|0},$$

with

$$K_1 = P_{1|0} H^\top \left(H P_{1|0} H^\top + R\right)^{-1},$$

which gives us our estimate of the next state given the observation $z_1$. This is what we call the update step.

General equations

For an arbitrary time step $k$, the prediction step yields

$$\hat{x}_{k|k-1} = A\,\hat{x}_{k-1|k-1}, \qquad P_{k|k-1} = A P_{k-1|k-1} A^\top + Q,$$

and the update step is given by

$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k\,(z_k - H \hat{x}_{k|k-1}), \qquad P_{k|k} = (I - K_k H)\,P_{k|k-1},$$

where the Kalman gain is defined as

$$K_k = P_{k|k-1} H^\top \left(H P_{k|k-1} H^\top + R\right)^{-1}.$$

Implementing the Kalman filter boils down to implementing these few equations!

C++ implementation

The following code provides a generic templated class KFLinear supporting both fixed-size and dynamic-size state and measurement vectors, using the Eigen linear algebra library.

Click here to view the full implementation: include/kf_linear.hpp. We break it down in the rest of this section.
#pragma once

#include <Eigen/Dense>
#include <iostream>
#include <optional>
#include <stdexcept>
#include <vector>

namespace cppx::kf_linear {

/**
 * @brief Generic linear Kalman filter (templated, no control term).
 *
 * State-space model:
 *   x_k = A x_{k-1} + w_{k-1},   w ~ N(0, Q)
 *   z_k = H x_k     + v_k,       v ~ N(0, R)
 *
 * Template parameters:
 *   Nx = state dimension      (int or Eigen::Dynamic)
 *   Ny = measurement dimension (int or Eigen::Dynamic)
 */
template <int Nx, int Ny> class KFLinear {
  public:
    using StateVec = Eigen::Matrix<double, Nx, 1>;
    using StateMat = Eigen::Matrix<double, Nx, Nx>;
    using MeasVec = Eigen::Matrix<double, Ny, 1>;
    using MeasMat = Eigen::Matrix<double, Ny, Ny>;
    using ObsMat = Eigen::Matrix<double, Ny, Nx>;

    /// Construct filter with initial condition and model matrices.
    KFLinear(const StateVec &initial_state, const StateMat &initial_covariance,
             const StateMat &transition_matrix, const ObsMat &observation_matrix,
             const StateMat &process_covariance, const MeasMat &measurement_covariance)
        : x_(initial_state), P_(initial_covariance), A_(transition_matrix), H_(observation_matrix),
          Q_(process_covariance), R_(measurement_covariance) {
        std::cout << "DEBUG" << std::endl;
        const auto n = x_.rows();
        if (A_.rows() != n || A_.cols() != n)
            throw std::invalid_argument("A must be n×n and match x dimension");
        if (P_.rows() != n || P_.cols() != n)
            throw std::invalid_argument("P must be n×n");
        if (Q_.rows() != n || Q_.cols() != n)
            throw std::invalid_argument("Q must be n×n");
        if (H_.cols() != n)
            throw std::invalid_argument("H must have n columns");
        const auto m = H_.rows();
        if (R_.rows() != m || R_.cols() != m)
            throw std::invalid_argument("R must be m×m with m = H.rows()");
    }

    /// Predict step (no control).
    void predict() {
        x_ = A_ * x_;
        P_ = A_ * P_ * A_.transpose() + Q_;
    }

    /// Update step with a measurement z.
    void update(const MeasVec &z) {
        // Innovation
        MeasVec nu = z - H_ * x_;

        // Innovation covariance
        MeasMat S = H_ * P_ * H_.transpose() + R_;

        // Solve for K without forming S^{-1}
        Eigen::LDLT<MeasMat> ldlt(S);
        if (ldlt.info() != Eigen::Success) {
            throw std::runtime_error("KFLinear::update: LDLT failed (S not SPD?)");
        }

        // K = P H^T S^{-1}   via solve: S * (K^T) = (P H^T)^T
        const auto PHt = P_ * H_.transpose();                                      // (Nx × Ny)
        Eigen::Matrix<double, Nx, Ny> K = ldlt.solve(PHt.transpose()).transpose(); // (Nx × Ny)

        // State update
        x_ += K * nu;

        // Joseph form (use runtime-sized Identity for Dynamic Nx)
        StateMat I = StateMat::Identity(P_.rows(), P_.cols());
        P_ = (I - K * H_) * P_ * (I - K * H_).transpose() + K * R_ * K.transpose();

        // Re-symmetrize for numerical hygiene
        P_ = 0.5 * (P_ + P_.transpose());
    }

    /// One full step: predict then (optionally) update.
    void step(const std::optional<MeasVec> &measurement) {
        predict();
        if (measurement) {
            update(*measurement);
        }
    }

    /// Run over a sequence of (optional) measurements.
    std::vector<StateVec> filter(const std::vector<std::optional<MeasVec>> &measurements) {
        std::vector<StateVec> out;
        out.reserve(measurements.size());
        for (const auto &z : measurements) {
            step(z);
            out.push_back(x_);
        }
        return out;
    }

    // Accessors
    [[nodiscard]] const StateVec &state() const { return x_; }
    [[nodiscard]] const StateMat &covariance() const { return P_; }

    // (Optional) setters if you want to tweak model online
    void set_transition(const StateMat &A) { A_ = A; }
    void set_observation(const ObsMat &H) { H_ = H; }
    void set_process_noise(const StateMat &Q) { Q_ = Q; }
    void set_measurement_noise(const MeasMat &R) { R_ = R; }

  private:
    // Model
    StateMat A_; ///< State transition
    ObsMat H_;   ///< Observation
    StateMat Q_; ///< Process noise covariance
    MeasMat R_;  ///< Measurement noise covariance

    // Estimates
    StateVec x_; ///< State mean
    StateMat P_; ///< State covariance
};

} // namespace cppx::kf_linear

Here's the same header with the implementations of the main methods stripped out, to highlight the class interface.

#pragma once

#include <Eigen/Dense>
#include <iostream>
#include <optional>
#include <stdexcept>
#include <vector>

/**
 * @brief Generic linear Kalman filter (templated, no control term).
 *
 * State-space model:
 *   x_k = A x_{k-1} + w_{k-1},   w ~ N(0, Q)
 *   z_k = H x_k     + v_k,       v ~ N(0, R)
 *
 * Template parameters:
 *   Nx = state dimension      (int or Eigen::Dynamic)
 *   Ny = measurement dimension (int or Eigen::Dynamic)
 */
template <int Nx, int Ny> class KFLinear {
  public:
    using StateVec = Eigen::Matrix<double, Nx, 1>;
    using StateMat = Eigen::Matrix<double, Nx, Nx>;
    using MeasVec  = Eigen::Matrix<double, Ny, 1>;
    using MeasMat  = Eigen::Matrix<double, Ny, Ny>;
    using ObsMat   = Eigen::Matrix<double, Ny, Nx>;

    KFLinear(const StateVec &initial_state, const StateMat &initial_covariance,
             const StateMat &transition_matrix, const ObsMat &observation_matrix,
             const StateMat &process_covariance, const MeasMat &measurement_covariance);

    void predict();
    void update(const MeasVec &z);
    void step(const std::optional<MeasVec> &measurement);
    std::vector<StateVec> filter(const std::vector<std::optional<MeasVec>> &measurements);

    [[nodiscard]] const StateVec &state() const { return x_; }
    [[nodiscard]] const StateMat &covariance() const { return P_; }

    void set_transition(const StateMat &A)      { A_ = A; }
    void set_observation(const ObsMat &H)       { H_ = H; }
    void set_process_noise(const StateMat &Q)   { Q_ = Q; }
    void set_measurement_noise(const MeasMat &R){ R_ = R; }

  private:
    StateMat A_, Q_, P_;
    ObsMat   H_;
    MeasMat  R_;
    StateVec x_;
};

A few comments:

  • Predict step: The method predict() propagates the state and covariance using the transition matrix A and process noise covariance Q.

  • Update step: The method update(z) corrects the prediction using a new measurement z. It computes the Kalman gain K efficiently by solving a linear system with LDLT factorization instead of forming the matrix inverse explicitly. The covariance update uses the Joseph form to ensure numerical stability and positive semi-definiteness.

  • Step and filter: The step() method combines prediction with an optional update (useful when some measurements are missing). The filter() method processes an entire sequence of measurements, returning the sequence of estimated states.

  • Flexibility:

    • Works with both fixed-size and dynamic-size Eigen matrices.
    • Provides setters to update system matrices online (e.g. if the model changes over time).
    • Uses std::optional to naturally handle missing observations.
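
To make the API concrete, here is a minimal usage sketch for a small constant-velocity tracking problem. The include path and the numerical values are illustrative assumptions, not taken from the crate:

#include "kf_linear.hpp" // assumed include path; adjust to the crate's layout

#include <Eigen/Dense>
#include <iostream>
#include <optional>
#include <vector>

int main() {
    using KF = cppx::kf_linear::KFLinear<2, 1>; // state = (position, velocity), scalar measurement

    const double dt = 1.0;

    KF::StateVec x0;  x0 << 0.0, 1.0;                    // initial state mean
    KF::StateMat P0 = KF::StateMat::Identity();          // initial covariance
    KF::StateMat A;   A << 1.0, dt,
                           0.0, 1.0;                     // constant-velocity transition
    KF::ObsMat H;     H << 1.0, 0.0;                     // we observe the position only
    KF::StateMat Q = 0.01 * KF::StateMat::Identity();    // process noise covariance
    KF::MeasMat R;    R << 0.25;                         // measurement noise covariance

    KF kf(x0, P0, A, H, Q, R);

    // std::nullopt marks a missing observation: the filter only predicts on that step.
    std::vector<std::optional<KF::MeasVec>> zs;
    for (double z : {0.9, 2.1, 2.8}) {
        KF::MeasVec m; m << z;
        zs.push_back(m);
    }
    zs.insert(zs.begin() + 1, std::nullopt);

    for (const auto &x : kf.filter(zs)) {
        std::cout << x.transpose() << "\n"; // estimated (position, velocity) per step
    }
}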

Optimizers

This chapter documents the small optimization module used in the project: a minimal runtime‑polymorphic interface Optimizer with two concrete implementations, Gradient Descent and Momentum. It is designed for clarity and easy swapping of algorithms in training loops.

Problem setting

Given parameters $w \in \mathbb{R}^d$ and a loss $L(w)$, an optimizer updates the weights using the gradient $g = \nabla_w L(w)$. Each algorithm defines an update rule $w \leftarrow w - \Delta(g;\,\theta)$ with hyper‑parameters $\theta$ (e.g., learning rate, momentum).
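
For a concrete picture before looking at the interface, here is a tiny hand-rolled step of plain gradient descent on the toy loss $L(w) = \tfrac{1}{2}\lVert w \rVert^2$, whose gradient is simply $g = w$ (an illustrative sketch, independent of the header below):

#include <cstddef>
#include <iostream>
#include <vector>

int main() {
    const double lr = 0.1;                 // learning rate
    std::vector<double> w = {1.0, -2.0};   // parameters
    std::vector<double> g = w;             // gradient of 0.5 * ||w||^2 is w itself

    for (std::size_t i = 0; i < w.size(); ++i) {
        w[i] -= lr * g[i];                 // w becomes {0.9, -1.8}
    }

    for (double wi : w) std::cout << wi << " ";
    std::cout << "\n";
}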

API overview

Click here to view the full implementation: include/cppx/opt/optimizers.hpp. We break it down in the rest of this section.
#pragma once
#include <algorithm>
#include <cmath>
#include <stdexcept>
#include <vector>

namespace cppx::opt {

class Optimizer {
  public:
    virtual ~Optimizer() = default;

    virtual void step(std::vector<double> &weights, const std::vector<double> &grads) = 0;
};

// --------------------------- GradientDescent ---------------------------
class GradientDescent final : public Optimizer {
  public:
    explicit GradientDescent(double learning_rate) : lr_(learning_rate) {}

    [[nodiscard]] double learning_rate() const noexcept { return lr_; }

    void step(std::vector<double> &weights, const std::vector<double> &grads) override {
        if (weights.size() != grads.size()) {
            throw std::invalid_argument("weights and grads size mismatch");
        }
        for (std::size_t i = 0; i < weights.size(); ++i) {
            weights[i] -= lr_ * grads[i];
        }
    }

  private:
    double lr_{};
};

// ------------------------------- Momentum ------------------------------
class Momentum final : public Optimizer {
  public:
    // struct Params {
    //     double learning_rate;
    //     double momentum;
    //     std::size_t dim;
    // };

    explicit Momentum(double learning_rate, double momentum, std::size_t dim)
        : lr_(learning_rate), mu_(momentum), v_(dim, 0.0) {}

    [[nodiscard]] double learning_rate() const noexcept { return lr_; }
    [[nodiscard]] double momentum() const noexcept { return mu_; }
    [[nodiscard]] const std::vector<double> &velocity() const noexcept { return v_; }

    void step(std::vector<double> &weights, const std::vector<double> &grads) override {
        if (weights.size() != grads.size()) {
            throw std::invalid_argument("weights and grads size mismatch");
        }
        if (v_.size() != weights.size()) {
            throw std::invalid_argument("velocity size mismatch");
        }

        for (std::size_t i = 0; i < weights.size(); ++i) {
            v_[i] = mu_ * v_[i] + lr_ * grads[i]; // v ← μ v + η g
            weights[i] -= v_[i];                  // w ← w − v
        }
    }

  private:
    double lr_{};
    double mu_{};
    std::vector<double> v_;
};

} // namespace cppx::opt

Design choices

  • A small virtual interface to enable swapping algorithms at runtime.
  • std::unique_ptr<Optimizer> for owning polymorphism; borrowing functions accept Optimizer&.
  • Exceptions (std::invalid_argument) signal size mismatches.

Gradient descent

Update rule: $w_i \leftarrow w_i - \eta\, g_i$, with learning rate $\eta > 0$.

Implementation

void GradientDescent::step(std::vector<double>& w,
                           const std::vector<double>& g) {
  if (w.size() != g.size()) throw std::invalid_argument("size mismatch");
  for (std::size_t i = 0; i < w.size(); ++i) {
    w[i] -= lr_ * g[i];
  }
}

Momentum-based gradient descent

Update rule: $v \leftarrow \mu v + \eta g$ and $w \leftarrow w - v$, with momentum $\mu$ and learning rate $\eta$.

Implementation

Momentum::Momentum(double learning_rate, double momentum, std::size_t dim)
  : lr_(learning_rate), mu_(momentum), v_(dim, 0.0) {}

void Momentum::step(std::vector<double>& w, const std::vector<double>& g) {
  if (w.size() != g.size()) throw std::invalid_argument("size mismatch");
  if (v_.size() != w.size()) throw std::invalid_argument("velocity size mismatch");

  for (std::size_t i = 0; i < w.size(); ++i) {
    v_[i] = mu_ * v_[i] + lr_ * g[i];
    w[i] -= v_[i];
  }
}

Using the optimizers

Owning an optimizer (runtime polymorphism)

#include <memory>
#include "cppx/opt/optimizers.hpp"

using namespace cppx::opt;

const std::size_t d = 10;            // parameter dimension (example value)
std::vector<double> w(d, 0.0), g(d, 0.0);

// Choose an algorithm at runtime:
std::unique_ptr<Optimizer> opt =
    std::make_unique<Momentum>(/*lr=*/0.1, /*mu=*/0.9, /*dim=*/w.size());

for (int epoch = 0; epoch < 100; ++epoch) {
  // ... compute gradients into g ...
  opt->step(w, g);           // updates w in place
}

Borrowing an optimizer (no ownership transfer)

void train_one_epoch(Optimizer& opt,
                     std::vector<double>& w,
                     std::vector<double>& g) {
  // ... fill g ...
  opt.step(w, g);
}

API variations (optional)

If C++20 is available, std::span can make the interface container‑agnostic:

// virtual void step(std::span<double> w, std::span<const double> g) = 0;
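
As a sketch of that variation (not part of the current header, and requiring C++20), the interface and a span-based GradientDescent could look like this:

#include <cstddef>
#include <span>
#include <stdexcept>

namespace cppx::opt {

// Hypothetical C++20 variant: parameters and gradients are passed as views,
// so any contiguous container (std::vector, std::array, raw buffers) works.
class OptimizerSpan {
  public:
    virtual ~OptimizerSpan() = default;
    virtual void step(std::span<double> weights, std::span<const double> grads) = 0;
};

class GradientDescentSpan final : public OptimizerSpan {
  public:
    explicit GradientDescentSpan(double learning_rate) : lr_(learning_rate) {}

    void step(std::span<double> weights, std::span<const double> grads) override {
        if (weights.size() != grads.size()) {
            throw std::invalid_argument("weights and grads size mismatch");
        }
        for (std::size_t i = 0; i < weights.size(); ++i) {
            weights[i] -= lr_ * grads[i];
        }
    }

  private:
    double lr_{};
};

} // namespace cppx::opt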

Autoregressive models

This chapter documents a header-only implementation of an Autoregressive model AR(p) in modern C++.
It explains the class template, the OLS and Yule–Walker estimators, and some small-but-important C++ details.

AR(p) refresher

An AR(p) process is

$$X_t = c + \sum_{i=1}^{p} \phi_i X_{t-i} + \varepsilon_t,$$

with intercept $c$, coefficients $\phi_1, \dots, \phi_p$, and i.i.d. noise $\varepsilon_t \sim \mathcal{N}(0, \sigma^2)$.

Header overview

#pragma once
#include <Eigen/Dense>
#include <array>
#include <cmath>
#include <numeric>
#include <stdexcept>
#include <vector>

namespace cppx::ar_models {

template <int order> class ARModel {
  public:
    using Vector = Eigen::Matrix<double, order, 1>;

    ARModel() = default;
    ARModel(double intercept, double noise_variance) : c_(intercept), sigma2_(noise_variance) {}

    [[nodiscard]] double intercept() const noexcept { return c_; }
    [[nodiscard]] double noise() const noexcept { return sigma2_; }
    [[nodiscard]] const Vector &coefficients() const noexcept { return phi_; }

    void set_coefficients(const Vector &phi) { phi_ = phi; }
    void set_intercept(double c) { c_ = c; }
    void set_noise(double noise) { sigma2_ = noise; }

    double forecast_one_step(const std::vector<double> &hist) const {
        if (static_cast<int>(hist.size()) < order) {
            throw std::invalid_argument("History shorter than model order");
        }
        double y = c_;
        for (int i = 0; i < order; ++i) {
            y += phi_(i) * hist[i];
        }
        return y;
    }

  private:
    Vector phi_;
    double c_ = 0.0;
    double sigma2_ = 1.0;
};

Notes

  • using Vector = Eigen::Matrix<double, order, 1>; is the canonical Eigen alias for a fixed-size column vector (the Eigen::Vector<double, N> alias template is only available in Eigen 3.4 and later).
  • Defaulted constructor + in-class member initializers (C++11) keep initialization simple.
  • [[nodiscard]] marks return values that shouldn’t be ignored.
  • static_cast<int> is used because std::vector::size() returns size_t (unsigned), while Eigen commonly uses int sizes.

Forecasting (one-step)

double forecast_one_step(const std::vector<double> &hist) const {
    if (static_cast<int>(hist.size()) < order) {
        throw std::invalid_argument("History shorter than model order");
    }
    double y = c_;
    for (int i = 0; i < order; ++i) {
        y += phi_(i) * hist[i];            // hist[0]=X_T, hist[1]=X_{T-1}, ...
    }
    return y;
}

The one-step-ahead plug‑in forecast equals $\hat{X}_{T+1} = c + \sum_{i=1}^{p} \phi_i X_{T+1-i}$.

OLS estimator (header-only)

Mathematically, we fit

$$X_t = c + \phi_1 X_{t-1} + \dots + \phi_p X_{t-p} + \varepsilon_t, \qquad t = p+1, \dots, T.$$

Define:

  • $Y = (X_{p+1}, \dots, X_T)^\top \in \mathbb{R}^{n}$, where $n = T - p$.
  • $X \in \mathbb{R}^{n \times (p+1)}$ the design matrix:

$$X = \begin{pmatrix} 1 & X_p & X_{p-1} & \cdots & X_1 \\ 1 & X_{p+1} & X_p & \cdots & X_2 \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & X_{T-1} & X_{T-2} & \cdots & X_{T-p} \end{pmatrix}.$$

The regression model is

$$Y = X\beta + \varepsilon, \qquad \beta = (c, \phi_1, \dots, \phi_p)^\top.$$

The OLS estimator is

$$\hat{\beta} = (X^\top X)^{-1} X^\top Y.$$

Residual variance estimate:

$$\hat{\sigma}^2 = \frac{\lVert Y - X\hat{\beta} \rVert^2}{n - (p + 1)}.$$

In code, we solve this with Eigen’s QR decomposition:

Eigen::VectorXd beta = X.colPivHouseholderQr().solve(Y);

The full estimator then reads:

template <int order>
ARModel<order> fit_ar_ols(const std::vector<double> &x) {
    if (static_cast<int>(x.size()) <= order) {
        throw std::invalid_argument("Time series too short for AR(order)");
    }
    const int T = static_cast<int>(x.size());
    const int n = T - order;

    Eigen::MatrixXd X(n, order + 1);
    Eigen::VectorXd Y(n);

    for (int t = 0; t < n; ++t) {
        Y(t) = x[order + t];
        X(t, 0) = 1.0;                          // intercept column
        for (int j = 0; j < order; ++j) {
            X(t, j + 1) = x[order + t - 1 - j]; // lagged regressors (most-recent-first)
        }
    }

    Eigen::VectorXd beta = X.colPivHouseholderQr().solve(Y);
    Eigen::VectorXd resid = Y - X * beta;
    const double sigma2 = resid.squaredNorm() / static_cast<double>(n - (order + 1));

    typename ARModel<order>::Vector phi;
    for (int j = 0; j < order; ++j) phi(j) = beta(j + 1);

    ARModel<order> model;
    model.set_coefficients(phi);
    model.set_intercept(beta(0));   // beta(0) is the intercept
    model.set_noise(sigma2);
    return model;
}

Yule–Walker (Levinson–Durbin)

The AR($p$) autocovariance equations are:

$$\gamma(k) = \sum_{j=1}^{p} \phi_j\, \gamma(k - j), \qquad k = 1, \dots, p,$$

where $\gamma(k) = \operatorname{Cov}(X_t, X_{t-k})$.

This leads to the Yule–Walker system:

$$\begin{pmatrix} \gamma(0) & \gamma(1) & \cdots & \gamma(p-1) \\ \gamma(1) & \gamma(0) & \cdots & \gamma(p-2) \\ \vdots & \vdots & \ddots & \vdots \\ \gamma(p-1) & \gamma(p-2) & \cdots & \gamma(0) \end{pmatrix} \begin{pmatrix} \phi_1 \\ \phi_2 \\ \vdots \\ \phi_p \end{pmatrix} = \begin{pmatrix} \gamma(1) \\ \gamma(2) \\ \vdots \\ \gamma(p) \end{pmatrix}.$$

We estimate the autocovariances by

$$\hat{\gamma}(k) = \frac{1}{T} \sum_{t=k+1}^{T} (X_t - \bar{X})(X_{t-k} - \bar{X}).$$

Levinson–Durbin recursion

Efficiently solves the Toeplitz system in $O(p^2)$ time. At each step $m = 1, \dots, p$:

  • Update the reflection coefficient $\kappa_m = \left(\hat{\gamma}(m) - \sum_{j=1}^{m-1} \phi^{(m-1)}_j \hat{\gamma}(m-j)\right) / E_{m-1}$,
  • Update the AR coefficients: $\phi^{(m)}_m = \kappa_m$ and $\phi^{(m)}_j = \phi^{(m-1)}_j - \kappa_m\, \phi^{(m-1)}_{m-j}$ for $j < m$,
  • Update the innovation variance $E_m = (1 - \kappa_m^2)\, E_{m-1}$, starting from $E_0 = \hat{\gamma}(0)$.

The final variance $E_p$ is the residual variance estimate $\hat{\sigma}^2$. The sample mean and autocovariance helpers are:

inline double _sample_mean(const std::vector<double> &x) {
    return std::accumulate(x.begin(), x.end(), 0.0) / x.size();
}
inline double _sample_autocov(const std::vector<double> &x, int k) {
    const int T = static_cast<int>(x.size());
    if (k >= T) throw std::invalid_argument("lag too large");
    const double mu = _sample_mean(x);
    double acc = 0.0;
    for (int t = k; t < T; ++t) acc += (x[t]-mu) * (x[t-k]-mu);
    return acc / static_cast<double>(T);
}
  • std::vector has no .mean(); we compute it with std::accumulate (from <numeric>).
  • For compile-time sizes (since order is a template parameter) we can use std::array<double, order+1> to hold autocovariances.

The full Yule–Walker estimator, built on the Levinson–Durbin recursion:

template <int order>
ARModel<order> fit_ar_yule_walker(const std::vector<double> &x) {
    static_assert(order >= 1, "Yule–Walker needs order >= 1");
    if (static_cast<int>(x.size()) <= order) {
        throw std::invalid_argument("Time series too short for AR(order)");
    }

    std::array<double, order + 1> r{};
    for (int k = 0; k <= order; ++k) r[k] = _sample_autocov(x, k);

    typename ARModel<order>::Vector a; a.setZero();
    double E = r[0];
    if (std::abs(E) < 1e-15) throw std::runtime_error("Zero variance");

    for (int m = 1; m <= order; ++m) {
        double acc = r[m];
        for (int j = 1; j < m; ++j) acc -= a(j - 1) * r[m - j];
        const double kappa = acc / E;

        typename ARModel<order>::Vector a_new = a;
        a_new(m - 1) = kappa;
        for (int j = 1; j < m; ++j) a_new(j - 1) = a(j - 1) - kappa * a(m - j - 1);
        a = a_new;

        E *= (1.0 - kappa * kappa);
        if (E <= 0) throw std::runtime_error("Non-positive innovation variance in recursion");
    }

    const double xbar = _sample_mean(x);
    const double c = (1.0 - a.sum()) * xbar;   // intercept so that mean(model) == sample mean

    ARModel<order> model;
    model.set_coefficients(a);
    model.set_intercept(c);
    model.set_noise(E);
    return model;
}

Small questions I asked myself while implementing this

  • The class holds parameters + forecasting but the algorithms live outside. This way, I can add/replace estimators without modifying the class.

  • typename ARModel<order>::Vector: why the typename? Inside templates, dependent names might be types or values. typename tells the compiler it’s a type (see the small sketch after this list).

  • std::array vs std::vector? std::array<T,N> is fixed-size (size known at compile time) and stack-allocated while std::vector<T> is dynamic-size (runtime) and heap-allocated.

  • Why static_cast<int>(hist.size())? .size() returns size_t (unsigned). Converting explicitly avoids signed/unsigned warnings and matches Eigen’s int-based indices.
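
As a tiny illustration of the typename point, here is a hypothetical free helper (not part of the header) that returns a zero coefficient vector:

template <int order>
typename ARModel<order>::Vector zero_coefficients() {
    // 'Vector' depends on the template parameter 'order', so the compiler needs
    // 'typename' to know the name refers to a type and not a static member.
    typename ARModel<order>::Vector v;
    v.setZero();
    return v;
}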

Example of usage

#include "ar.hpp"
#include <iostream>
#include <vector>

int main() {
    using namespace cppx::ar_models; // the AR model and estimators live in this namespace
    std::vector<double> x = {0.1, 0.3, 0.7, 0.8, 1.2, 1.0, 0.9};

    auto m = fit_ar_ols<2>(x);
    std::cout << "c=" << m.intercept() << ", sigma^2=" << m.noise()
              << ", phi=" << m.coefficients().transpose() << "\n";

    std::vector<double> hist = {x.back(), x[x.size()-2]}; // [X_T, X_{T-1}]
    std::cout << "one-step forecast: " << m.forecast_one_step(hist) << "\n";
}
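
The Yule–Walker estimator can be called the same way; a minimal addition inside main, assuming the fit_ar_yule_walker function shown earlier:

    auto m_yw = fit_ar_yule_walker<2>(x);
    std::cout << "Yule-Walker: c=" << m_yw.intercept()
              << ", sigma^2=" << m_yw.noise()
              << ", phi=" << m_yw.coefficients().transpose() << "\n";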