ML Models Need Better Training Data: The GenAI Solution

  • June 1, 2025
  • Roubens Andy King

Our understanding of financial markets is inherently constrained by historical experience — a single realized timeline among countless possibilities that could have unfolded. Each market cycle, geopolitical event, or policy decision represents just one manifestation of potential outcomes.

This limitation becomes particularly acute when training machine learning (ML) models, which can inadvertently learn from historical artifacts rather than underlying market dynamics. As complex ML models become more prevalent in investment management, their tendency to overfit to specific historical conditions poses a growing risk to investment outcomes.

Generative AI-based synthetic data (GenAI synthetic data) is emerging as a potential solution to this challenge. While GenAI has gained attention primarily for natural language processing, its ability to generate sophisticated synthetic data may prove even more valuable for quantitative investment processes. By creating data that effectively represents “parallel timelines,” this approach can be designed and engineered to provide richer training datasets that preserve crucial market relationships while exploring counterfactual scenarios.

The Challenge: Moving Beyond Single Timeline Training

Traditional quantitative models face an inherent limitation: they learn from a single historical sequence of events that led to the present conditions. This creates what we term “empirical bias.” The challenge becomes more pronounced with complex machine learning models, whose capacity to learn intricate patterns makes them particularly vulnerable to overfitting on limited historical data. An alternative approach is to consider counterfactual scenarios: those that might have unfolded if certain, perhaps arbitrary, events, decisions, or shocks had played out differently.

To illustrate these concepts, consider active international equities portfolios benchmarked to MSCI EAFE. Figure 1 shows the performance characteristics of multiple portfolios — upside capture, downside capture, and overall relative returns — over the past five years ending January 31, 2025.

Figure 1: Empirical Data. EAFE-Benchmarked Portfolios, five-year performance characteristics to January 31, 2025.
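
For readers unfamiliar with the capture statistics in Figure 1, the sketch below shows one common way to compute them. It assumes capture is the ratio of the portfolio's average return to the benchmark's average return over periods when the benchmark rose (upside) or fell (downside). Other definitions, such as compounded returns, exist, and the article does not say which was used. The function name and the sample returns are hypothetical.

```python
def capture_ratios(portfolio_returns, benchmark_returns):
    """Return (upside_capture, downside_capture) from paired periodic returns.

    Assumed definition: average portfolio return divided by average
    benchmark return, taken over benchmark-up or benchmark-down periods.
    """
    pairs = list(zip(portfolio_returns, benchmark_returns))
    up = [(p, b) for p, b in pairs if b > 0]
    down = [(p, b) for p, b in pairs if b < 0]

    def ratio(subset):
        if not subset:
            return float("nan")  # no such periods in the sample
        # Ratio of averages equals ratio of sums because the counts match.
        return sum(p for p, _ in subset) / sum(b for _, b in subset)

    return ratio(up), ratio(down)


# Hypothetical monthly returns, for illustration only.
portfolio = [0.03, -0.01, 0.02, -0.04]
benchmark = [0.02, -0.02, 0.02, -0.04]
upside, downside = capture_ratios(portfolio, benchmark)
print(round(upside, 2), round(downside, 2))  # roughly 1.25 and 0.83
```

An upside capture above 1 with a downside capture below 1 is the combination most active managers aim for.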

This empirical dataset represents just a small sample of possible portfolios, and an even smaller sample of potential outcomes had events unfolded differently. Traditional approaches to expanding this dataset have significant limitations.

Figure 2: Instance-based approaches: K-nearest neighbors (left), SMOTE (right).

Traditional Synthetic Data: Understanding the Limitations

Conventional methods of synthetic data generation attempt to address data limitations but often fall short of capturing the complex dynamics of financial markets. Using our EAFE portfolio example, we can examine how different approaches perform:

Instance-based methods like K-NN and SMOTE extend existing data patterns through local sampling but remain fundamentally constrained by observed data relationships. They cannot generate scenarios much beyond their training examples, limiting their utility for understanding potential future market conditions. 
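
The constraint described above can be seen in a minimal sketch of the interpolation step at the heart of SMOTE-style methods. This is not the code behind Figure 2. The function name, the parameters, and the (upside capture, downside capture) pairs are hypothetical, and full SMOTE targets class imbalance rather than this simplified setting.

```python
import random


def smote_like_samples(points, n_new, k=3, seed=0):
    """Interpolate between a random point and one of its k nearest neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(points))
        base = points[i]
        # Rank the other points by squared Euclidean distance to the base point.
        others = sorted(
            (p for j, p in enumerate(points) if j != i),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )
        neighbour = rng.choice(others[:k])
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Hypothetical (upside capture, downside capture) observations.
observed = [(1.05, 0.95), (0.98, 0.90), (1.10, 1.02), (0.92, 0.88), (1.01, 0.99)]
synthetic = smote_like_samples(observed, n_new=50, k=3, seed=42)
```

Because every synthetic point lies on a line segment between two observed points, none can fall outside the range the data already covers.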

Figure 3: More flexible approaches generally improve outcomes but struggle to capture complex market relationships: GMM (left), KDE (right).

 

Traditional synthetic data generation approaches, whether through instance-based methods or density estimation, face fundamental limitations. While these approaches can extend patterns incrementally, they cannot generate realistic market scenarios that preserve complex inter-relationships while exploring genuinely different market conditions. This limitation becomes particularly clear when we examine density estimation approaches.

Density estimation approaches like GMM and KDE offer more flexibility in extending data patterns, but still struggle to capture the complex, interconnected dynamics of financial markets. These methods particularly falter during regime changes, when historical relationships may evolve.
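
As a rough illustration of the density estimation idea, drawing from a Gaussian KDE amounts to picking an observed point at random and adding Gaussian noise scaled by the bandwidth. The sketch below assumes an isotropic kernel and a hand-picked bandwidth. The function name and data are hypothetical, and it is not the method used to produce Figure 3.

```python
import random


def kde_samples(points, n_new, bandwidth=0.02, seed=0):
    """Sample from a Gaussian KDE: random observed point plus kernel noise."""
    rng = random.Random(seed)
    return [
        # Noise is drawn independently per dimension (isotropic kernel).
        tuple(x + rng.gauss(0.0, bandwidth) for x in rng.choice(points))
        for _ in range(n_new)
    ]


# Hypothetical (upside capture, downside capture) observations.
observed = [(1.05, 0.95), (0.98, 0.90), (1.10, 1.02), (0.92, 0.88), (1.01, 0.99)]
synthetic = kde_samples(observed, n_new=200, bandwidth=0.02, seed=7)
```

Samples can now land slightly outside the observed range, but only by an amount governed by the bandwidth, and the noise ignores any relationship between the two dimensions. That is one way to see why such methods struggle when market relationships shift.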

GenAI Synthetic Data: More Powerful Training

Recent research at City St George's and the University of Warwick, presented at the NYU ACM International Conference on AI in Finance (ICAIF), demonstrates how GenAI can potentially better approximate the underlying data generating function of markets. Through neural network architectures, this approach aims to learn conditional distributions while preserving persistent market relationships.

The Research and Policy Center (RPC) will soon publish a report that defines synthetic data and outlines generative AI approaches that can be used to create it. The report will highlight best methods for evaluating the quality of synthetic data and use references to existing academic literature to highlight potential use cases.

Figure 4: Illustration of GenAI synthetic data expanding the space of realistic possible outcomes while maintaining key relationships.

This approach to synthetic data generation can be expanded to offer several potential advantages:

  • Expanded Training Sets: Realistic augmentation of limited financial datasets
  • Scenario Exploration: Generation of plausible market conditions while maintaining persistent relationships
  • Tail Event Analysis: Creation of varied but realistic stress scenarios

As illustrated in Figure 4, GenAI synthetic data approaches aim to expand the space of possible portfolio performance characteristics while respecting fundamental market relationships and realistic bounds. This provides a richer training environment for machine learning models, potentially reducing their vulnerability to historical artifacts and improving their ability to generalize across market conditions.
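
One simple way to check whether a synthetic dataset respects relationships and bounds is to compare a summary statistic, such as the correlation between two characteristics, across the real and synthetic samples and to count points outside plausible limits. The sketch below is an assumption-laden illustration, not the evaluation method from the research cited above. The function names, the bounds, and both datasets are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def fidelity_report(real, synthetic, bounds):
    """Compare correlations and count synthetic rows outside the given bounds."""
    real_corr = pearson(*zip(*real))
    synth_corr = pearson(*zip(*synthetic))
    out_of_bounds = sum(
        1
        for row in synthetic
        if any(not (lo <= v <= hi) for v, (lo, hi) in zip(row, bounds))
    )
    return {
        "real_corr": real_corr,
        "synthetic_corr": synth_corr,
        "corr_gap": abs(real_corr - synth_corr),
        "out_of_bounds": out_of_bounds,
    }


# Hypothetical (upside capture, downside capture) pairs.
real = [(1.05, 0.95), (0.98, 0.90), (1.10, 1.02), (0.92, 0.88), (1.01, 0.99)]
generated = [(1.03, 0.96), (0.95, 0.89), (1.12, 1.04), (1.60, 1.10)]
report = fidelity_report(real, generated, bounds=[(0.5, 1.5), (0.5, 1.5)])
```

A small correlation gap and few out-of-bounds rows are necessary rather than sufficient signs of quality. A fuller evaluation would also look at tails, conditional behavior, and downstream model performance.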

Implementation in Security Selection

For equity selection models, which are particularly susceptible to learning spurious historical patterns, GenAI synthetic data offers three potential benefits:

  1. Reduced Overfitting: By training on varied market conditions, models may better distinguish between persistent signals and temporary artifacts.
  2. Enhanced Tail Risk Management: More diverse scenarios in training data could improve model robustness during market stress.
  3. Better Generalization: Expanded training data that maintains realistic market relationships may help models adapt to changing conditions.

The implementation of effective GenAI synthetic data generation presents its own technical challenges, potentially exceeding the complexity of the investment models themselves. However, our research suggests that successfully addressing these challenges could significantly improve risk-adjusted returns through more robust model training.


The GenAI Path to Better Model Training

GenAI synthetic data has the potential to provide more powerful, forward-looking insights for investment and risk models. Through neural network-based architectures, it aims to better approximate the market’s data generating function, potentially enabling more accurate representation of future market conditions while preserving persistent inter-relationships.

While this could benefit most investment and risk models, it is an especially important innovation now because of the increasing adoption of machine learning in investment management and the associated risk of overfitting. GenAI synthetic data can generate plausible market scenarios that preserve complex relationships while exploring different conditions, offering a path to more robust investment models.

However, even the most advanced synthetic data cannot compensate for naïve machine learning implementations. There is no safe fix for excessive complexity, opaque models, or weak investment rationales.


The Research and Policy Center will host a webinar tomorrow, March 18, featuring Marcos López de Prado, a world-renowned expert in financial machine learning and quantitative research.

