Putting things together

To wrap up our 1D Ridge regression example, let's see how all the parts fit together into a real Rust crate.

Project layout

Here’s the directory structure for our ridge_1d_fn crate:

crates/ridge_1d_fn/
├── Cargo.toml
└── src
    ├── estimator.rs           # Closed-form solution of the Ridge estimator
    ├── gradient_descent.rs    # Gradient descent solution
    ├── lib.rs                 # Main entry point for the library
    └── loss_functions.rs      # Loss function implementations

All the functions discussed in the previous sections are implemented in estimator.rs, loss_functions.rs, and gradient_descent.rs. You can inspect each of these files below.

Click to view estimator.rs
/// Computes the one-dimensional Ridge regression estimator using centered data.
///
/// This version centers the input data `x` and `y` before applying the closed-form formula.
///
/// # Arguments
///
/// * `x` - A slice of input features.
/// * `y` - A slice of target values (same length as `x`).
/// * `lambda2` - The regularization parameter.
///
/// # Returns
///
/// * `f64` - The estimated Ridge regression coefficient.
///
/// # Panics
///
/// Panics if `x` and `y` do not have the same length.
// ANCHOR: ridge_estimator
pub fn ridge_estimator(x: &[f64], y: &[f64], lambda2: f64) -> f64 {
    let n: usize = x.len();
    assert_eq!(n, y.len(), "x and y must have the same length");

    let x_mean: f64 = x.iter().sum::<f64>() / n as f64;
    let y_mean: f64 = y.iter().sum::<f64>() / n as f64;

    let num: f64 = x
        .iter()
        .zip(y)
        .map(|(xi, yi)| (xi - x_mean) * (yi - y_mean))
        .sum::<f64>();

    let denom: f64 = x.iter().map(|xi| (xi - x_mean).powi(2)).sum::<f64>() + lambda2 * (n as f64);

    num / denom
}
// ANCHOR_END: ridge_estimator

// ANCHOR: tests
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_ridge_estimator() {
        let x: Vec<f64> = vec![1.0, 2.0];
        let y: Vec<f64> = vec![0.1, 0.2];
        let true_beta: f64 = 0.1;
        let lambda2: f64 = 0.0;

        let beta_estimate: f64 = ridge_estimator(&x, &y, lambda2);
        assert!(
            (true_beta - beta_estimate).abs() < 1e-6,
            "Estimate {} not close enough to true solution {}",
            beta_estimate,
            true_beta
        );
    }
}
// ANCHOR_END: tests
Click to view gradient_descent.rs
/// Dot product between two vectors.
///
/// # Arguments
/// * `a` - First input vector
/// * `b` - Second input vector
///
/// # Returns
///
/// The float value of the dot product.
///
/// # Panics
///
/// Panics if `a` and `b` do not have the same length.
// ANCHOR: dot
pub fn dot(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len(), "Input vectors must have the same length");
    a.iter().zip(b.iter()).map(|(xi, yi)| xi * yi).sum()
}
// ANCHOR_END: dot

/// Computes the gradient of the Ridge regression loss function (naive version).
///
/// This implementation first explicitly computes the residuals and then performs
/// a dot product between the residuals and the inputs.
///
/// # Arguments
///
/// * `x` - Slice of input features
/// * `y` - Slice of target outputs
/// * `beta` - Coefficient of the regression model
/// * `lambda2` - L2 regularization strength
///
/// # Returns
///
/// The gradient of the loss with respect to `beta`.
///
/// # Panics
///
/// Panics if `x` and `y` do not have the same length.
// ANCHOR: grad_loss_function_naive
pub fn grad_loss_function_naive(x: &[f64], y: &[f64], beta: f64, lambda2: f64) -> f64 {
    assert_eq!(x.len(), y.len(), "x and y must have the same length");

    let n: usize = x.len();
    let residuals: Vec<f64> = x
        .iter()
        .zip(y.iter())
        .map(|(xi, yi)| yi - beta * xi)
        .collect();
    let residuals_dot_x = dot(&residuals, x);

    -2.0 * residuals_dot_x / (n as f64) + 2.0 * lambda2 * beta
}
// ANCHOR_END: grad_loss_function_naive

/// Computes the gradient of the Ridge regression loss function (inlined version).
///
/// This version fuses the residual and gradient computation into a single pass
/// using iterators, minimizing allocations and improving efficiency.
///
/// # Arguments
///
/// * `x` - Slice of input features
/// * `y` - Slice of target outputs
/// * `beta` - Coefficient of the regression model
/// * `lambda2` - L2 regularization strength
///
/// # Returns
///
/// The gradient of the loss with respect to `beta`.
///
/// # Panics
///
/// Panics if `x` and `y` do not have the same length.
// ANCHOR: grad_loss_function_inline
pub fn grad_loss_function_inline(x: &[f64], y: &[f64], beta: f64, lambda2: f64) -> f64 {
    assert_eq!(x.len(), y.len(), "x and y must have the same length");

    let n: usize = x.len();
    let grad_mse: f64 = x
        .iter()
        .zip(y.iter())
        .map(|(xi, yi)| 2.0 * (yi - beta * xi) * xi)
        .sum::<f64>()
        / (n as f64);

    -grad_mse + 2.0 * lambda2 * beta
}
// ANCHOR_END: grad_loss_function_inline

/// Performs gradient descent to minimize the Ridge regression loss function.
///
/// # Arguments
///
/// * `grad_fn` - A function that computes the gradient of the Ridge loss
/// * `x` - Input features as a slice (`&[f64]`)
/// * `y` - Target values as a slice (`&[f64]`)
/// * `lambda2` - Regularization parameter
/// * `lr` - Learning rate
/// * `n_iters` - Number of gradient descent iterations
/// * `init_beta` - Initial value of the regression coefficient
///
/// # Returns
///
/// The optimized regression coefficient `beta` after `n_iters` updates
// ANCHOR: gradient_descent_estimator
pub fn ridge_estimator(
    grad_fn: impl Fn(&[f64], &[f64], f64, f64) -> f64,
    x: &[f64],
    y: &[f64],
    lambda2: f64,
    lr: f64,
    n_iters: usize,
    init_beta: f64,
) -> f64 {
    let mut beta = init_beta;

    for _ in 0..n_iters {
        let grad = grad_fn(x, y, beta, lambda2);
        beta -= lr * grad;
    }

    beta
}
// ANCHOR_END: gradient_descent_estimator

// ANCHOR: tests
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_grad_naive() {
        let x: Vec<f64> = vec![1.0, 2.0];
        let y: Vec<f64> = vec![0.1, 0.2];
        let beta: f64 = 0.1;
        let lambda2: f64 = 1.0;

        let grad = grad_loss_function_naive(&x, &y, beta, lambda2);
        let expected_grad = 0.2;
        let tol = 1e-6;
        assert!(
            (grad - expected_grad).abs() < tol,
            "Expected {}, got {}",
            expected_grad,
            grad
        );
    }

    #[test]
    fn test_grad_inline() {
        let x: Vec<f64> = vec![1.0, 2.0];
        let y: Vec<f64> = vec![0.1, 0.2];
        let beta: f64 = 0.1;
        let lambda2: f64 = 1.0;

        let grad = grad_loss_function_inline(&x, &y, beta, lambda2);
        let expected_grad = 0.2;
        let tol = 1e-6;
        assert!(
            (grad - expected_grad).abs() < tol,
            "Expected {}, got {}",
            expected_grad,
            grad
        );
    }

    #[test]
    fn test_naive_vs_inline() {
        let x: Vec<f64> = vec![1.0, 2.0];
        let y: Vec<f64> = vec![0.1, 0.2];
        let beta: f64 = 0.1;
        let lambda2: f64 = 1.0;

        let grad1 = grad_loss_function_inline(&x, &y, beta, lambda2);
        let grad2 = grad_loss_function_naive(&x, &y, beta, lambda2);
        assert_eq!(grad1, grad2);
    }
}
// ANCHOR_END: tests
Click to view loss_functions.rs
/// Multiplies a vector by a scalar.
///
/// # Arguments
///
/// * `scalar` - A scalar multiplier
/// * `vector` - A slice of f64 values
///
/// # Returns
///
/// A new vector containing the result of element-wise multiplication
///
/// # Why `&[f64]` instead of `Vec<f64>`?
///
/// We use a slice (`&[f64]`) because:
/// - It's more general: works with both arrays and vectors
/// - It avoids unnecessary ownership
/// - It's idiomatic and Clippy-compliant
// ANCHOR: mul_scalar_vec
pub fn mul_scalar_vec(scalar: f64, vector: &[f64]) -> Vec<f64> {
    vector.iter().map(|x| x * scalar).collect()
}
// ANCHOR_END: mul_scalar_vec

/// Subtracts two vectors element-wise.
///
/// # Arguments
///
/// * `a` - First input slice
/// * `b` - Second input slice
///
/// # Returns
///
/// A new `Vec<f64>` containing the element-wise difference `a[i] - b[i]`.
///
/// # Panics
///
/// Panics if `a` and `b` do not have the same length.
// ANCHOR: subtract_vectors
pub fn subtract_vectors(a: &[f64], b: &[f64]) -> Vec<f64> {
    assert_eq!(a.len(), b.len(), "Input vectors must have the same length");
    a.iter().zip(b.iter()).map(|(x, y)| x - y).collect()
}
// ANCHOR_END: subtract_vectors

/// Computes the loss function for Ridge regression (naive version).
///
/// It implements the computation in a straightforward way: the predictions and
/// residuals are built explicitly, and the mean squared error is computed in
/// multiple steps before the penalty term is added.
///
/// This function calculates the following expression:
///
/// $$
/// \mathcal{L}(\beta) = \frac{1}{n} \sum_i (y_i - \beta x_i)^2 + \lambda \beta^2
/// $$
///
/// where:
/// - `x` and `y` are the input/output observations,
/// - `beta` is the linear coefficient,
/// - `lambda2` is the regularization strength.
///
/// # Arguments
///
/// * `x` - Input features as a slice (`&[f64]`)
/// * `y` - Target values as a slice (`&[f64]`)
/// * `beta` - Coefficient of the regression model
/// * `lambda2` - L2 regularization strength
///
/// # Returns
///
/// The Ridge regression loss value as `f64`.
///
/// # Panics
///
/// Panics if `x` and `y` do not have the same length.
// ANCHOR: loss_function_naive
pub fn loss_function_naive(x: &[f64], y: &[f64], beta: f64, lambda2: f64) -> f64 {
    assert_eq!(x.len(), y.len(), "x and y must have the same length");

    let n: usize = x.len();
    let y_hat: Vec<f64> = mul_scalar_vec(beta, x);
    let residuals: Vec<f64> = subtract_vectors(y, &y_hat);
    let mse: f64 = residuals.iter().map(|r| r * r).sum::<f64>() / (n as f64);
    mse + lambda2 * beta * beta
}
// ANCHOR_END: loss_function_naive

/// Computes the loss function for Ridge regression (inlined version).
///
/// It implements it as a one-liner by computing the mean squared error in a single expression.
///
/// # Arguments
///
/// * `x` - The array of input observations
/// * `y` - The array of output observations
/// * `beta` - The coefficients of the linear regression
/// * `lambda2` - The regularization parameter
///
/// # Returns
///
/// The value of the loss function
// ANCHOR: loss_function_line
pub fn loss_function_inline(x: &[f64], y: &[f64], beta: f64, lambda2: f64) -> f64 {
    let n: usize = y.len();
    let factor = n as f64;
    let mean_squared_error = x
        .iter()
        .zip(y.iter())
        .map(|(xi, yi)| {
            let residual = yi - beta * xi;
            residual * residual
        })
        .sum::<f64>()
        / factor;
    mean_squared_error + lambda2 * beta * beta
}
// ANCHOR_END: loss_function_line

// ANCHOR: tests
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_loss_function_naive() {
        let x: Vec<f64> = vec![1.0, 2.0];
        let y: Vec<f64> = vec![0.1, 0.2];
        let beta: f64 = 0.1;
        let lambda2: f64 = 1.0;

        let val: f64 = loss_function_naive(&x, &y, beta, lambda2);
        assert!(val > 0.0);
    }

    #[test]
    fn test_loss_function_line() {
        let x: Vec<f64> = vec![1.0, 2.0];
        let y: Vec<f64> = vec![0.1, 0.2];
        let beta: f64 = 0.1;
        let lambda2: f64 = 1.0;

        let val: f64 = loss_function_inline(&x, &y, beta, lambda2);
        assert!(val > 0.0);
    }

    #[test]
    fn test_naive_vs_inline() {
        let x: Vec<f64> = vec![1.0, 2.0];
        let y: Vec<f64> = vec![0.1, 0.2];
        let beta: f64 = 0.1;
        let lambda2: f64 = 1.0;

        let val1 = loss_function_naive(&x, &y, beta, lambda2);
        let val2 = loss_function_inline(&x, &y, beta, lambda2);
        assert_eq!(val1, val2);
    }
}
// ANCHOR_END: tests
Click to view lib.rs
// ANCHOR: lib_rs
pub mod estimator;
pub mod gradient_descent;
pub mod loss_functions;

pub use estimator::ridge_estimator;

/// Fits a Ridge regression model.
///
/// # Arguments
///
/// * `x` - Input features (`&[f64]`)
/// * `y` - Target values (`&[f64]`)
/// * `lambda2` - Regularization strength
///
/// # Returns
///
/// The optimized coefficient `beta` as `f64`.
pub fn fit(x: &[f64], y: &[f64], lambda2: f64) -> f64 {
    ridge_estimator(x, y, lambda2)
}

/// Predicts output values using a trained Ridge regression coefficient.
///
/// # Arguments
///
/// * `x` - Input features (`&[f64]`)
/// * `beta` - Trained coefficient
///
/// # Returns
///
/// A `Vec<f64>` with predicted values.
pub fn predict(x: &[f64], beta: f64) -> Vec<f64> {
    x.iter().map(|xi| xi * beta).collect()
}
// ANCHOR_END: lib_rs

// ANCHOR: run_demo
pub fn run_demo() {
    println!("-----------------------------------------------------");
    println!("Running ridge_1d_fn::run_demo");
    let x: Vec<f64> = vec![1.0, 2.0];
    let y: Vec<f64> = vec![0.1, 0.2];
    let lambda2 = 0.001;

    let beta = fit(&x, &y, lambda2);
    let preds = predict(&x, beta);

    println!("Learned beta: {beta}, true solution: 0.1!");
    println!("Predictions: {preds:?}");
    println!("-----------------------------------------------------");
}
// ANCHOR_END: run_demo

Note that the layout can become more complicated once modules and submodules are introduced. This will be covered in the next chapter, where we implement a struct-oriented version of the 1D Ridge regression.

What's lib.rs?

The lib.rs file is the entry point for the crate as a library. This is where we declare which modules (i.e., other .rs files) are exposed to the outside world.

pub mod estimator;
pub mod gradient_descent;
pub mod loss_functions;

pub use estimator::ridge_estimator;

Each line tells Rust:

“There is a file called X.rs that defines a module X. Please include it in the crate.”

By default, items inside a module are private. That’s where pub comes in.

We will dive deeper into lib.rs in the 2.1.5 Exposing API chapter.

Why pub?

If you want to use a function from another module or crate, you must declare it pub (public). For example:

// In gradient_descent.rs
pub fn dot(a: &[f64], b: &[f64]) -> f64 { /* ... */ }

If dot is not marked as pub, you can't use it outside gradient_descent.rs, not even from estimator.rs.
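
If you try anyway, the compiler rejects the import. Here is a hypothetical sketch of what that looks like, assuming dot were declared without pub:

// In gradient_descent.rs (hypothetical: dot declared without pub)
fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b.iter()).map(|(xi, yi)| xi * yi).sum()
}

// In estimator.rs
use crate::gradient_descent::dot; // error[E0603]: function `dot` is private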

Importing between modules

Rust requires explicit imports between modules. For example, let's say we want to use the dot function defined in gradient_descent.rs from another module, such as estimator.rs. We can import it as follows:

use crate::gradient_descent::dot;

Here, crate refers to the root of this library crate, i.e., lib.rs.
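
As an illustration, the same function can be reached through a crate:: path from inside the library, or through the crate name from outside it. This sketch assumes gradient_descent stays a public module:

// Inside the library, e.g. from estimator.rs:
use crate::gradient_descent::dot;

// From an external crate that depends on ridge_1d_fn:
use ridge_1d_fn::gradient_descent::dot;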

Example of usage

Now let's see how you could use the library from a binary crate:

// In the binary crate's src/main.rs
use ridge_1d_fn::ridge_estimator;

fn main() {
    let x: Vec<f64> = vec![1.0, 2.0];
    let y: Vec<f64> = vec![0.1, 0.2];
    let lambda2 = 0.001;

    let beta = ridge_estimator(&x, &y, lambda2);
    println!("Estimated beta: {beta}");
}