
Introduction

In an era where markets are inundated with data and traditional analysis is continuously challenged by evolving market dynamics, this article was conceived as an attempt to harness machine learning (ML) to capture and exploit momentum signals in financial time series. The central idea was to blend simple technical indicators with the predictive power of advanced ML algorithms to generate actionable trading signals.

The project addresses the inherent challenge of distinguishing persistent momentum from mere noise. While momentum strategies have been known to work over certain periods, their effectiveness can be diminished by market reversals or sudden volatility spikes. By integrating machine learning techniques, this article sought to dynamically identify market conditions favourable for momentum-based trades while managing risk through systematic trade sizing and drawdown controls.

The approach begins with meticulous data preprocessing, where historical price data (including open, high, low, close, and volume) are transformed into a set of engineered features. The models used—Logistic Regression, Random Forest, XGBoost, and Support Vector Machines—are tested on their ability to generate profitable signals, with risk-adjusted returns assessed through Sharpe ratios. To evaluate performance, we apply the strategy across different holding periods and asset return thresholds, optimizing parameters through grid search. Additionally, we analyze the impact of hyperparameter sensitivity and dataset limitations, identifying areas for improvement in model robustness and feature selection. By systematically testing various configurations, we aim to explore the effectiveness and reliability of machine learning-driven trading strategies in real-market conditions.

Models and Their Mathematical Foundations

As noted above, we used four models, each of which captures patterns in a different way; indeed, contrasting models that capture linear relationships with those that capture non-linear ones is part of the purpose of this article. Logistic regression is fundamentally a linear model, capturing only linear relationships. Support vector machines, on the other hand, have the ability to capture non-linear relationships when equipped with kernels such as the radial basis function or polynomial kernels. Random forests naturally handle non-linear interactions by constructing an ensemble of decision trees that split the data based on feature values, allowing for a hierarchical and flexible representation of complex patterns. XGBoost, which builds decision trees sequentially in a boosting framework, refines its predictions by focusing on correcting the errors of previous trees, thereby also capturing non-linear relationships.

Logistic Regression

Logistic regression is a foundational model for binary classification. It estimates the probability of a positive momentum event given a set of predictors through the use of the logistic function:

 P(y=1 \mid \mathbf{x}) = \frac{1}{1 + e^{-(\mathbf{w}^\top \mathbf{x} + b)}}

Here, \mathbf{w} is the vector of coefficients fitted to maximize the likelihood of the observed outcomes, while b is the intercept. The simplicity of the logistic model allows for a straightforward interpretation of the influence of individual features on the predicted probability.
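As a minimal sketch of how such a model might be fitted in practice (the data here is synthetic placeholder data, not the article's engineered feature set), scikit-learn's LogisticRegression estimates both \mathbf{w} and b and produces the probabilities above:

```python
# Minimal sketch: fitting a logistic model to binary labels.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))                          # placeholder features
y = (X @ np.array([0.8, -0.5, 0.3]) > 0).astype(int)   # linearly separable labels

model = LogisticRegression().fit(X, y)
print(model.coef_, model.intercept_)                   # fitted w and b
probs = model.predict_proba(X)[:, 1]                   # P(y = 1 | x)
```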

Support Vector Machines (SVM)

SVMs classify data by finding the hyperplane that maximizes the margin between classes. The decision function can be written as:

 f(\mathbf{x}) = \operatorname{sign}(\mathbf{w}^\top \mathbf{x} + b)

where the weight vector \mathbf{w} and bias b are determined by solving the following optimization problem:

 \begin{aligned} \min_{\mathbf{w},\,b,\,\boldsymbol{\xi}} \quad & \frac{1}{2} \|\mathbf{w}\|^2 + C \sum_{i=1}^{N} \xi_i \\ \text{subject to} \quad & y_i \left( \mathbf{w}^\top \mathbf{x}_i + b \right) \ge 1 - \xi_i, \quad \xi_i \ge 0,\quad i = 1, \dots, N. \end{aligned}

The slack variables \xi_i allow for some misclassification, while the parameter C controls the trade-off between maximizing the margin and minimizing the classification error.

The non-linearity in Support Vector Machines derives from the “Kernel Trick”, which allows us to implicitly define a non-linear high-dimensional transformation of the features over which the actual classification through SVMs is done.
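To illustrate the kernel trick, here is a minimal sketch on synthetic XOR-like data that no linear classifier can separate; the data and parameter values are placeholders, not the project's tuned configuration:

```python
# Minimal sketch: the RBF kernel captures a non-linear boundary.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)   # XOR-like, non-linear labels

# C penalizes the slack variables; the RBF kernel supplies the implicit
# high-dimensional feature map over which the linear separation is done.
svm = SVC(kernel="rbf", C=1.0).fit(X, y)
print(svm.score(X, y))                    # high accuracy despite non-linearity
```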

Random Forest

Random Forest algorithms operate by constructing an ensemble of decision trees and averaging their predictions. Mathematically, if \hat{y}^{(t)} is the prediction from tree t, then the aggregated prediction over T trees is given by:

 \hat{y} = \frac{1}{T} \sum_{t=1}^{T} \hat{y}^{(t)}

This ensemble method reduces variance compared to individual decision trees and is particularly robust to overfitting when the trees are grown on bootstrapped samples of the data.
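A minimal sketch of this averaging behaviour using scikit-learn's RandomForestClassifier (synthetic placeholder data; n_estimators plays the role of T in the formula above):

```python
# Minimal sketch: averaging an ensemble of bootstrapped trees.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = (X[:, 0] + X[:, 1] ** 2 > 0.5).astype(int)   # non-linear target

# Each tree is grown on a bootstrap sample of the data;
# predict_proba averages the per-tree class predictions.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
probs = rf.predict_proba(X)[:, 1]
```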

XGBoost

XGBoost (Extreme Gradient Boosting) builds an ensemble of trees sequentially, where each new tree attempts to correct the errors of the combined ensemble so far. The model is formulated as:

 \hat{y}_i = \sum_{k=1}^{K} f_k(\mathbf{x}_i), \quad f_k \in \mathcal{F}

where \mathcal{F} is the space of regression trees. The objective function that XGBoost minimizes is:

 \mathcal{L} = \sum_{i=1}^{n} \ell(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)

with the regularization term defined as:

 \Omega(f) = \gamma T + \frac{1}{2} \lambda \sum_{j=1}^{T} w_j^2

where T is the number of leaves in the tree, w_j represents the leaf weights, and \gamma and \lambda are regularization parameters. This formulation ensures that the model not only fits the data well but also maintains simplicity to prevent overfitting.
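XGBoost's Python API exposes these regularization terms directly; a minimal sketch on placeholder data (the parameter values are illustrative, not the tuned ones used in the project):

```python
# Minimal sketch: gamma penalizes the number of leaves T, and
# reg_lambda is the L2 penalty lambda on the leaf weights w_j.
import numpy as np
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = (X[:, 0] * X[:, 2] > 0).astype(int)

xgb = XGBClassifier(n_estimators=100, max_depth=3, learning_rate=0.1,
                    gamma=0.1, reg_lambda=1.0)
xgb.fit(X, y)
probs = xgb.predict_proba(X)[:, 1]
```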

Data Analysis and Feature Engineering

At the heart of the strategy lies the transformation of raw price data into meaningful features. The models are fitted over a time series of momentum and drawdown values computed over different time windows (e.g., 21, 42, 63, 126, and 252 trading days).

The project primarily uses historical asset price data, which undergoes a series of preprocessing steps:

Momentum Computation: The momentum feature is computed over different time windows. A simple momentum metric can be calculated as:

 \text{Momentum}_t = \frac{P_t}{P_{t-n}} - 1

where P_t is the asset price at time t and n denotes the look-back period.

Drawdown Computation: As with momentum, drawdown is also computed over different time windows.

 \text{Drawdown}_t = \frac{P_t - \text{rolling max}}{\text{rolling max}}

where the rolling max is the maximum price encountered over the selected window.

To allow logistic regression to better capture non-linear behaviour in the data, we introduced polynomial features, on which we fitted the logistic regression.
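Putting the above together, a minimal sketch of the feature construction (the random-walk price series and the polynomial degree of 2 are placeholder assumptions, not the article's exact pipeline):

```python
# Minimal sketch: momentum and drawdown features over several windows.
import numpy as np
import pandas as pd
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
prices = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 2000))))

features = pd.DataFrame(index=prices.index)
for n in [21, 42, 63, 126, 252]:
    features[f"mom_{n}"] = prices / prices.shift(n) - 1    # Momentum_t
    roll_max = prices.rolling(n).max()
    features[f"dd_{n}"] = (prices - roll_max) / roll_max   # Drawdown_t

features = features.dropna()
# Polynomial expansion fed to the logistic regression (degree assumed).
X_poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(features)
```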

Models Application and Tuning

The models were trained using a rolling window approach to ensure adaptability to changing market conditions while preserving the integrity of the out-of-sample evaluation. Each model was initially trained on a two-year period, using historical market indicators and price data as input features. Following this, a one-year cross-validation phase was conducted to fine-tune hyperparameters through grid search, optimizing for accuracy. Once the optimal parameters were selected, the model was applied to the subsequent year, which served as an out-of-sample test set, evaluating its predictive ability in real trading conditions. After each iteration, the training window rolled forward by one year, incorporating new market data while discarding the oldest year, thereby maintaining a consistent training horizon. This iterative process ensured that the models continuously learned from the most recent data while also avoiding overfitting to past market conditions that may no longer be relevant.
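One possible implementation of this walk-forward scheme is sketched below; the exact split bookkeeping and the use of PredefinedSplit are our assumptions, with logistic regression standing in for any of the four models:

```python
# Minimal sketch: 2y train, 1y validation (grid search), 1y test,
# then roll the whole window forward by one year.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, PredefinedSplit

DAYS = 252  # trading days per year

def walk_forward(X, y, param_grid):
    """X, y are numpy arrays ordered in time; returns out-of-sample scores."""
    test_scores, start = [], 0
    while start + 4 * DAYS <= len(X):
        tr = slice(start, start + 2 * DAYS)             # training window
        va = slice(start + 2 * DAYS, start + 3 * DAYS)  # validation window
        te = slice(start + 3 * DAYS, start + 4 * DAYS)  # out-of-sample window

        # -1 marks rows used only for fitting; 0 marks the validation fold.
        fold = np.r_[np.full(2 * DAYS, -1), np.zeros(DAYS, dtype=int)]
        gs = GridSearchCV(LogisticRegression(max_iter=1000), param_grid,
                          cv=PredefinedSplit(fold), scoring="accuracy")
        gs.fit(np.vstack([X[tr], X[va]]), np.concatenate([y[tr], y[va]]))

        test_scores.append(gs.score(X[te], y[te]))      # true out-of-sample
        start += DAYS                                   # roll forward one year
    return test_scores
```

For the logistic model, `walk_forward(X, y, {"C": [0.1, 1.0, 10.0]})` would, for example, tune the regularization strength on each validation year.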

Signal Generation

The backbone of the trading strategy is the transformation of model predictions into actionable signals and the subsequent execution of trades. The models produce a series of probabilities that estimate the likelihood of the asset’s return exceeding a predefined threshold \tau over the following 3, 5, 8, or 10 trading days. \tau is dynamically determined relative to the asset’s daily volatility \sigma and is computed as \tau = c \cdot \sigma, where c is selected from the range [0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3]. These probabilities, which serve as implied forecasts, form the foundation for subsequent decision-making and signal generation.

Once the probabilities are generated, they are transformed into actionable trading signals. This transformation is achieved by again applying a threshold rule: if the predicted probability exceeds 50% (\delta = 0.5), the signal indicates a decision to take a long position; otherwise, the signal suggests remaining out of the market and the capital is held as cash:

 y_i = \begin{cases} 1, & \text{if } p_i \geq \delta \\ 0, & \text{if } p_i < \delta \end{cases}

The different holding periods represent the time frames over which the models forecast the asset’s profitability or lack thereof.
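A minimal sketch of the labeling and signal rules described above (the 21-day volatility estimation window and the function names are our assumptions):

```python
# Minimal sketch: volatility-scaled labels and probability-threshold signals.
import numpy as np
import pandas as pd

def make_labels(prices: pd.Series, h: int = 5, c: float = 1.0) -> pd.Series:
    fwd_ret = prices.shift(-h) / prices - 1         # forward h-day return
    sigma = prices.pct_change().rolling(21).std()   # daily volatility estimate
    return (fwd_ret > c * sigma).astype(int)        # 1 if return exceeds tau = c * sigma

def make_signals(probs, delta: float = 0.5):
    # Long (1) when the predicted probability reaches delta; cash (0) otherwise.
    return (np.asarray(probs) >= delta).astype(int)
```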

Results

We conducted a backtest of the strategy over the period from 2000 to 2025. To assess risk-adjusted performance, we used the Sharpe ratio as our main performance metric, computed as:

 \text{Sharpe Ratio} = \frac{E[R_t - R_f]}{\sigma(R_t)}
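A minimal sketch of this computation on a daily return series (annualizing by the square root of 252 and a zero risk-free rate are our assumptions):

```python
# Minimal sketch: annualized Sharpe ratio of a per-period return series.
import numpy as np

def sharpe_ratio(returns, rf: float = 0.0, periods: int = 252) -> float:
    excess = np.asarray(returns) - rf   # per-period excess returns
    return np.sqrt(periods) * excess.mean() / excess.std()
```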

Despite being unable, for computational reasons, to test different values of c when working with SVM, the results we obtained highlight the significant variability in strategy performance across different models, parameter choices, and holding periods. While certain combinations of c and time horizons yield seemingly strong cumulative returns, the overall ability of the strategy to consistently generate risk-adjusted returns superior to those of the broader market appears limited. The Sharpe ratios remain modest in most cases and often do not indicate a substantial improvement over standard market benchmarks.

A closer examination reveals that even within the same model framework, results fluctuate widely depending on parameter choices. For example, Logistic Regression exhibits cumulative returns as high as +296% but also delivers much weaker returns at different parameter settings, reflecting the sensitivity of the approach to volatility threshold selection.

[Figure: cumulative returns under different parameter settings]

The inconsistencies observed across the models make it challenging to assess which one is actually the best. Each model demonstrates sensitivity to different aspects of the strategy’s parameters. XGBoost, for instance, appears to be particularly sensitive to holding period selection, excelling on average at longer horizons but performing inconsistently at shorter ones. Random Forest, on the other hand, is highly reactive to c values, demonstrating large swings in cumulative returns with small changes in the parameter. Logistic Regression, by contrast, appears less sensitive to changes in either the holding period or c. To better illustrate these differences, the following graphs provide a visual representation of the cumulative returns and Sharpe ratio for a fixed value of c, highlighting how each model behaves under this specific condition:

[Figure: cumulative returns and Sharpe ratios of each model]

Conclusions

In conclusion, our analysis suggests that the models, in their current form, are too simplistic to achieve sustained efficiency in real-world trading conditions. While they exhibit some predictive capability, their sensitivity to parameter selection and inconsistent risk-adjusted returns indicate that they may not be capturing the full complexity of market dynamics. A key limitation is the narrow set of features used for training, which likely restricts the models’ ability to generalize across different market environments. Additionally, the size of the training dataset remains a crucial bottleneck, as longer historical periods and larger datasets are necessary to capture meaningful patterns and reduce overfitting to short-term trends. Addressing these issues will be essential in refining our approach, and in future work, we will focus on incorporating a broader range of features and expanding the dataset to enhance model performance. These improvements will be the subject of our next article on the topic, where we will explore more advanced methodologies to build a stronger and more reliable trading framework.


