Introduction

In this book, we explore mid-price prediction in financial markets through the combined lens of statistical filtering and machine learning. The mid-price—halfway between the best bid and best ask—captures the evolving consensus of market participants and serves as a natural target for short-term price forecasting.
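As a minimal sketch of the definition above, the mid-price can be computed directly from the top-of-book quotes. The function name `mid_price` and the example quotes are assumptions made for illustration; they are not taken from the book's code.

```python
def mid_price(best_bid: float, best_ask: float) -> float:
    """Return the midpoint between the best bid and best ask quotes."""
    # A best ask below the best bid indicates a crossed (inconsistent) book.
    if best_ask < best_bid:
        raise ValueError("crossed book: best ask is below best bid")
    return (best_bid + best_ask) / 2.0


# Hypothetical top-of-book quotes, for illustration only.
print(mid_price(100.0, 100.5))  # prints 100.25
```

The crossed-book check is a design choice for this sketch; a real data pipeline might instead drop or flag such snapshots.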

We begin by implementing a Kalman Filter as a statistical baseline for sequential state estimation. From there, we train and evaluate a range of machine learning models to assess how modern approaches compare with classical inference methods.
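To give a feel for what sequential state estimation looks like, here is a minimal sketch of a scalar Kalman filter. It assumes a local-level model, in which the latent price follows a random walk and each observed mid-price is that state plus Gaussian noise. The model choice, the function name `kalman_local_level`, the parameter values, and the sample data are all assumptions for illustration; the filter developed later in the book may use a different state-space model.

```python
from typing import Iterable, List, Tuple


def kalman_local_level(
    observations: Iterable[float],
    process_var: float,
    obs_var: float,
    init_mean: float,
    init_var: float,
) -> List[Tuple[float, float]]:
    """Filter a scalar random-walk state observed with Gaussian noise.

    Returns one (posterior mean, posterior variance) pair per observation.
    """
    mean, var = init_mean, init_var
    estimates = []
    for z in observations:
        # Predict: a random walk keeps the mean and inflates the variance.
        pred_mean = mean
        pred_var = var + process_var
        # Update: blend prediction and observation using the Kalman gain.
        gain = pred_var / (pred_var + obs_var)
        mean = pred_mean + gain * (z - pred_mean)
        var = (1.0 - gain) * pred_var
        estimates.append((mean, var))
    return estimates


# Hypothetical noisy mid-price observations, for illustration only.
observed_mids = [100.25, 100.30, 100.20, 100.35, 100.40]
for mean, var in kalman_local_level(
    observed_mids,
    process_var=1e-3,
    obs_var=1e-2,
    init_mean=observed_mids[0],
    init_var=1.0,
):
    print(f"{mean:.4f} {var:.6f}")
```

The ratio of `process_var` to `obs_var` controls how quickly the estimate tracks new observations: a larger ratio gives a larger gain, so the filter trusts each new quote more.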

Our goals are two-fold:

Implement classical inference algorithms such as the Kalman Filter in C++ for efficiency and precision, with Python bindings for experimentation.
Compare these algorithms against machine learning models in terms of predictive accuracy, robustness, and computational performance.

The comparison will be carried out on established mid-price forecasting datasets, including:

FI-2010: a publicly available benchmark dataset for mid-price forecasting from limit order book data
LOBSTER: a real limit order book dataset with millisecond-level resolution

By the end, we should have a practical understanding of how statistical filters and machine learning can be applied to mid-price prediction.
