Putting things together
To wrap up our 1D Ridge Regression example, let's see how all the parts fit together into a real Rust crate.
Project layout
Here’s the directory structure for our ridge_regression_1d crate:
```
crates/ridge_regression_1d/
├── Cargo.toml
└── src
    ├── analytical.rs      # Closed-form solution of the Ridge estimator
    ├── grad_functions.rs  # Gradient of the loss
    ├── lib.rs             # Main entry point for the library
    ├── loss_functions.rs  # Loss function implementations
    ├── main.rs            # Binary entry point
    ├── optimizer.rs       # Gradient descent
    └── utils.rs           # Utility functions (e.g., dot product)
```
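Since the crate relies only on the standard library, its manifest stays minimal. As a rough sketch (the version and edition below are illustrative assumptions, not taken from the actual crate):

```toml
[package]
name = "ridge_regression_1d"
version = "0.1.0"
edition = "2021"

# No external dependencies: everything in this chapter uses the standard library.
[dependencies]
```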
All the functions discussed in the previous sections are implemented in analytical.rs, loss_functions.rs, grad_functions.rs, utils.rs, and optimizer.rs. You can inspect each of these files below.
Click to view analytical.rs
```rust
/// Computes the one-dimensional Ridge regression estimator using centered data.
///
/// This version centers the input data `x` and `y` before applying the closed-form formula.
///
/// # Arguments
///
/// * `x` - A slice of input features.
/// * `y` - A slice of target values (same length as `x`).
/// * `lambda2` - The regularization parameter.
///
/// # Returns
///
/// * `f64` - The estimated Ridge regression coefficient.
///
/// # Panics
///
/// Panics if `x` and `y` do not have the same length.
pub fn ridge_estimator(x: &[f64], y: &[f64], lambda2: f64) -> f64 {
    let n: usize = x.len();
    assert_eq!(n, y.len(), "x and y must have the same length");

    let x_mean: f64 = x.iter().sum::<f64>() / n as f64;
    let y_mean: f64 = y.iter().sum::<f64>() / n as f64;

    let num: f64 = x
        .iter()
        .zip(y)
        .map(|(xi, yi)| (xi - x_mean) * (yi - y_mean))
        .sum::<f64>();
    let denom: f64 = x.iter().map(|xi| (xi - x_mean).powi(2)).sum::<f64>() + lambda2 * (n as f64);

    num / denom
}
```
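For reference, the closed-form expression implemented by `ridge_estimator` above is

$$
\hat{\beta} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2 + n\lambda},
$$

where $\bar{x}$ and $\bar{y}$ are the sample means and $\lambda$ corresponds to `lambda2`.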
Click to view loss_functions.rs
```rust
use crate::utils::{mul_scalar_vec, subtract_vectors};

/// Computes the Ridge regression loss function (naive version).
///
/// It implements the computation in a simple fashion, evaluating the mean squared
/// error in multiple explicit steps:
///
/// $$
/// \mathcal{L}(\beta) = \frac{1}{n} \sum_i (y_i - \beta x_i)^2 + \lambda \beta^2
/// $$
///
/// where:
/// - `x` and `y` are the input/output observations,
/// - `beta` is the linear coefficient,
/// - `lambda2` is the regularization strength.
///
/// # Arguments
///
/// * `x` - Input features as a slice (`&[f64]`)
/// * `y` - Target values as a slice (`&[f64]`)
/// * `beta` - Coefficient of the regression model
/// * `lambda2` - L2 regularization strength
///
/// # Returns
///
/// The Ridge regression loss value as `f64`.
///
/// # Panics
///
/// Panics if `x` and `y` do not have the same length.
// ANCHOR: loss_function_naive
pub fn loss_function_naive(x: &[f64], y: &[f64], beta: f64, lambda2: f64) -> f64 {
    assert_eq!(x.len(), y.len(), "x and y must have the same length");
    let n: usize = x.len();
    let y_hat: Vec<f64> = mul_scalar_vec(beta, x);
    let residuals: Vec<f64> = subtract_vectors(y, &y_hat);
    let mse: f64 = residuals.iter().map(|x| x * x).sum::<f64>() / (n as f64);
    mse + lambda2 * beta * beta
}
// ANCHOR_END: loss_function_naive

/// Computes the Ridge regression loss function (inlined version).
///
/// It computes the same quantity as `loss_function_naive`, but evaluates the
/// mean squared error in a single iterator expression.
///
/// # Arguments
///
/// * `x` - Input features as a slice (`&[f64]`)
/// * `y` - Target values as a slice (`&[f64]`)
/// * `beta` - Coefficient of the regression model
/// * `lambda2` - L2 regularization strength
///
/// # Returns
///
/// The Ridge regression loss value as `f64`.
// ANCHOR: loss_function_line
pub fn loss_function_inline(x: &[f64], y: &[f64], beta: f64, lambda2: f64) -> f64 {
    let n: usize = y.len();
    let factor = n as f64;

    let mean_squared_error = x
        .iter()
        .zip(y.iter())
        .map(|(xi, yi)| {
            let residual = yi - beta * xi;
            residual * residual
        })
        .sum::<f64>()
        / factor;

    mean_squared_error + lambda2 * beta * beta
}
// ANCHOR_END: loss_function_line
```
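Both functions compute exactly the same quantity, which is easy to verify with a small standalone check. The snippet below is not part of the crate, and the data values are made up for illustration:

```rust
use ridge_regression_1d::loss_functions::{loss_function_inline, loss_function_naive};

fn main() {
    // Toy data, chosen arbitrarily.
    let x = [1.0, 2.0, 3.0, 4.0];
    let y = [2.1, 3.9, 6.2, 8.0];
    let (beta, lambda2) = (1.5, 0.1);

    let naive = loss_function_naive(&x, &y, beta, lambda2);
    let inline = loss_function_inline(&x, &y, beta, lambda2);

    // Up to floating-point rounding, both versions agree.
    assert!((naive - inline).abs() < 1e-12);
    println!("loss = {naive}");
}
```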
Click to view grad_functions.rs
```rust
use crate::utils::dot;

/// Computes the gradient of the Ridge regression loss function (naive version).
///
/// This implementation first explicitly computes the residuals and then performs
/// a dot product between the residuals and the inputs.
///
/// # Arguments
///
/// * `x` - Slice of input features
/// * `y` - Slice of target outputs
/// * `beta` - Coefficient of the regression model
/// * `lambda2` - L2 regularization strength
///
/// # Returns
///
/// The gradient of the loss with respect to `beta`.
///
/// # Panics
///
/// Panics if `x` and `y` do not have the same length.
// ANCHOR: grad_loss_function_naive
pub fn grad_loss_function_naive(x: &[f64], y: &[f64], beta: f64, lambda2: f64) -> f64 {
    assert_eq!(x.len(), y.len(), "x and y must have the same length");
    let n: usize = x.len();
    let residuals: Vec<f64> = x
        .iter()
        .zip(y.iter())
        .map(|(xi, yi)| yi - beta * xi)
        .collect();
    let residuals_dot_x = dot(&residuals, x);
    -2.0 * residuals_dot_x / (n as f64) + 2.0 * lambda2 * beta
}
// ANCHOR_END: grad_loss_function_naive

/// Computes the gradient of the Ridge regression loss function (inlined version).
///
/// This version fuses the residual and gradient computation into a single pass
/// using iterators, minimizing allocations and improving efficiency.
///
/// # Arguments
///
/// * `x` - Slice of input features
/// * `y` - Slice of target outputs
/// * `beta` - Coefficient of the regression model
/// * `lambda2` - L2 regularization strength
///
/// # Returns
///
/// The gradient of the loss with respect to `beta`.
///
/// # Panics
///
/// Panics if `x` and `y` do not have the same length.
// ANCHOR: grad_loss_function_inline
pub fn grad_loss_function_inline(x: &[f64], y: &[f64], beta: f64, lambda2: f64) -> f64 {
    assert_eq!(x.len(), y.len(), "x and y must have the same length");
    let n: usize = x.len();
    let grad_mse: f64 = x
        .iter()
        .zip(y.iter())
        .map(|(xi, yi)| 2.0 * (yi - beta * xi) * xi)
        .sum::<f64>()
        / (n as f64);
    -grad_mse + 2.0 * lambda2 * beta
}
// ANCHOR_END: grad_loss_function_inline
```
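A convenient way to sanity-check these gradients is to compare them against a central finite-difference approximation of the loss. The snippet below is only a sketch with made-up data, not part of the crate:

```rust
use ridge_regression_1d::grad_functions::grad_loss_function_inline;
use ridge_regression_1d::loss_functions::loss_function_inline;

fn main() {
    let x = [1.0, 2.0, 3.0, 4.0];
    let y = [2.1, 3.9, 6.2, 8.0];
    let (beta, lambda2, eps) = (0.7, 0.1, 1e-6);

    // Analytical gradient at `beta`.
    let grad = grad_loss_function_inline(&x, &y, beta, lambda2);

    // Central difference: (L(beta + eps) - L(beta - eps)) / (2 * eps).
    let numeric = (loss_function_inline(&x, &y, beta + eps, lambda2)
        - loss_function_inline(&x, &y, beta - eps, lambda2))
        / (2.0 * eps);

    assert!((grad - numeric).abs() < 1e-5);
    println!("analytical = {grad}, numerical = {numeric}");
}
```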
Click to view optimizer.rs
```rust
/// Performs gradient descent to minimize the Ridge regression loss function.
///
/// # Arguments
///
/// * `grad_fn` - A function that computes the gradient of the Ridge loss
/// * `x` - Input features as a slice (`&[f64]`)
/// * `y` - Target values as a slice (`&[f64]`)
/// * `lambda2` - Regularization parameter
/// * `lr` - Learning rate
/// * `n_iters` - Number of gradient descent iterations
/// * `init_beta` - Initial value of the regression coefficient
///
/// # Returns
///
/// The optimized regression coefficient `beta` after `n_iters` updates
// ANCHOR: gradient_descent
pub fn gradient_descent(
    grad_fn: impl Fn(&[f64], &[f64], f64, f64) -> f64,
    x: &[f64],
    y: &[f64],
    lambda2: f64,
    lr: f64,
    n_iters: usize,
    init_beta: f64,
) -> f64 {
    let mut beta = init_beta;
    for _ in 0..n_iters {
        let grad = grad_fn(x, y, beta, lambda2);
        beta -= lr * grad;
    }
    beta
}
// ANCHOR_END: gradient_descent
```
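Because gradient_descent receives the gradient as an `impl Fn(...)` argument, either gradient implementation can be plugged in without modifying the optimizer. A hypothetical call, with illustrative data and hyperparameters:

```rust
use ridge_regression_1d::grad_functions::{grad_loss_function_inline, grad_loss_function_naive};
use ridge_regression_1d::optimizer::gradient_descent;

fn main() {
    let x = [1.0, 2.0, 3.0, 4.0];
    let y = [2.1, 3.9, 6.2, 8.0];

    // Same optimizer, two interchangeable gradient implementations.
    let beta_naive = gradient_descent(grad_loss_function_naive, &x, &y, 0.1, 0.01, 1_000, 0.0);
    let beta_inline = gradient_descent(grad_loss_function_inline, &x, &y, 0.1, 0.01, 1_000, 0.0);

    assert!((beta_naive - beta_inline).abs() < 1e-12);
    println!("beta = {beta_inline}");
}
```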
Click to view utils.rs
```rust
/// Multiplies a vector by a scalar.
///
/// # Arguments
///
/// * `scalar` - A scalar multiplier
/// * `vector` - A slice of f64 values
///
/// # Returns
///
/// A new vector containing the result of element-wise multiplication
///
/// # Why `&[f64]` instead of `Vec<f64>`?
///
/// We use a slice (`&[f64]`) because:
/// - It's more general: works with both arrays and vectors
/// - It avoids unnecessary ownership
/// - It's idiomatic and Clippy-compliant
// ANCHOR: mul_scalar_vec
pub fn mul_scalar_vec(scalar: f64, vector: &[f64]) -> Vec<f64> {
    vector.iter().map(|x| x * scalar).collect()
}
// ANCHOR_END: mul_scalar_vec

/// Subtracts two vectors element-wise.
///
/// # Arguments
///
/// * `a` - First input slice
/// * `b` - Second input slice
///
/// # Returns
///
/// A new `Vec<f64>` containing the element-wise difference `a[i] - b[i]`.
///
/// # Panics
///
/// Panics if `a` and `b` do not have the same length.
// ANCHOR: subtract_vectors
pub fn subtract_vectors(a: &[f64], b: &[f64]) -> Vec<f64> {
    assert_eq!(a.len(), b.len(), "Input vectors must have the same length");
    a.iter().zip(b.iter()).map(|(x, y)| x - y).collect()
}
// ANCHOR_END: subtract_vectors

/// Dot product between two vectors.
///
/// # Arguments
///
/// * `a` - First input vector
/// * `b` - Second input vector
///
/// # Returns
///
/// The float value of the dot product.
///
/// # Panics
///
/// Panics if `a` and `b` do not have the same length.
// ANCHOR: dot
pub fn dot(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len(), "Input vectors must have the same length");
    a.iter().zip(b.iter()).map(|(xi, yi)| xi * yi).sum()
}
// ANCHOR_END: dot
```
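As a quick illustration of dot (again, not part of the crate):

```rust
use ridge_regression_1d::utils::dot;

fn main() {
    // 1*4 + 2*5 + 3*6 = 32
    assert_eq!(dot(&[1.0, 2.0, 3.0], &[4.0, 5.0, 6.0]), 32.0);
}
```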
Note that the layout can grow more elaborate as modules and submodules are introduced. This will be covered in the next chapter, where we implement a struct-oriented version of the 1D Ridge regression.
What's lib.rs?
The lib.rs file is the entry point for the crate as a library. This is where we declare which modules (i.e., other .rs files) are exposed to the outside world.
```rust
pub mod grad_functions;
pub mod loss_functions;
pub mod optimizer;
pub mod utils;
```
Each line tells Rust: “There is a file called X.rs that defines a module X. Please include it in the crate.”
By default, items inside a module are private. That’s where pub comes in.
We will dive deeper into lib.rs in the 2.1.5 Exposing API chapter.
Why pub?
If you want to use a function from another module or crate, you must declare it pub (public). For example:
```rust
// In utils.rs
pub fn dot(a: &[f64], b: &[f64]) -> f64 { ... }
```
If dot is not marked as pub, you can’t use it outside utils.rs, even from optimizer.rs.
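For example, here is a hypothetical sketch of what happens if pub is dropped from dot: the import in optimizer.rs no longer compiles, because the function stays private to the utils module.

```rust
// In utils.rs: without `pub`, `dot` is only visible inside the `utils` module.
fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b.iter()).map(|(xi, yi)| xi * yi).sum()
}

// In optimizer.rs: this import is now rejected with a privacy error
// (error[E0603]: function `dot` is private).
use crate::utils::dot;
```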
Importing between modules
Rust requires explicit imports between modules. For example, in optimizer.rs, we want to use the dot function from utils.rs:
```rust
use crate::utils::dot;
```
Here, crate refers to the root of this library crate (i.e., lib.rs). If you check out one of the modules again (e.g., loss_functions.rs), you'll notice that's exactly what we are doing to import functions from other modules.
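For instance, loss_functions.rs pulls in two helpers with a single grouped use, and you can always skip the import and spell out the full path at the call site (the example function below is purely illustrative):

```rust
// In loss_functions.rs: grouped import of two items from the same module.
use crate::utils::{mul_scalar_vec, subtract_vectors};

// Alternatively, a fully qualified path works without any `use` statement.
fn example(a: &[f64], b: &[f64]) -> f64 {
    crate::utils::dot(a, b)
}
```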
Example of usage in main.rs
Now let's see how you could use the library from a binary crate:
```rust
fn main() {
    ridge_regression_1d::run_demo();
}
```
You can run this with cargo run.
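The run_demo function itself is not among the files listed above. As a sketch only, one plausible implementation inside lib.rs could look like this (data and hyperparameters are made up):

```rust
// In lib.rs (sketch): a small end-to-end demo. This is not the book's actual
// `run_demo`; it only shows one way the pieces above could be wired together.
pub fn run_demo() {
    use crate::grad_functions::grad_loss_function_inline;
    use crate::loss_functions::loss_function_inline;
    use crate::optimizer::gradient_descent;

    // Toy data and hyperparameters, chosen arbitrarily for the demo.
    let x = [1.0, 2.0, 3.0, 4.0];
    let y = [2.1, 3.9, 6.2, 8.0];
    let lambda2 = 0.1;

    let beta = gradient_descent(grad_loss_function_inline, &x, &y, lambda2, 0.01, 10_000, 0.0);
    let loss = loss_function_inline(&x, &y, beta, lambda2);
    println!("beta = {beta:.4}, loss = {loss:.4}");
}
```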
Summary
This chapter demonstrated how to:
- Implement 1D Ridge regression in simple ways, relying only on the Rust standard library
- Organize a crate into multiple source files (modules)
- Use pub to expose functions
- Import functions from other modules
- Call everything together from a main.rs
This is an idiomatic Rust structure, and it prepares you to scale beyond toy examples while staying modular and readable.