Business Insights
  • Home
  • Crypto
  • Finance Expert
  • Business
  • Invest News
  • Investing
  • Trading
  • Forex
  • Videos
  • Economy
  • Tech
  • Contact

Archives

  • February 2026
  • January 2026
  • December 2025
  • November 2025
  • October 2025
  • September 2025
  • August 2025
  • July 2025
  • June 2025
  • May 2025
  • April 2025
  • March 2025
  • February 2025
  • January 2025
  • December 2024
  • November 2024
  • October 2024
  • September 2024
  • August 2024
  • July 2024
  • June 2024
  • May 2024
  • April 2024
  • March 2024
  • February 2024
  • August 2023
  • January 2023
  • December 2021
  • July 2021
  • November 2019
  • October 2019
  • September 2019
  • August 2019
  • July 2019
  • June 2019
  • May 2019
  • April 2019
  • March 2019
  • February 2019
  • January 2019

Categories

  • Business
  • Crypto
  • Economy
  • Finance Expert
  • Forex
  • Invest News
  • Investing
  • Tech
  • Trading
  • Uncategorized
  • Videos
Apply Loan
Money Visa
Advertise Us
Money Visa
  • Home
  • Crypto
  • Finance Expert
  • Business
  • Invest News
  • Investing
  • Trading
  • Forex
  • Videos
  • Economy
  • Tech
  • Contact
ChatGPT and Large Language Models: Six Evolutionary Steps
  • Invest News

ChatGPT and Large Language Models: Six Evolutionary Steps

  • August 24, 2025
  • Roubens Andy King
Total
0
Shares
0
0
0
Total
0
Shares
Share 0
Tweet 0
Pin it 0

The evolution of language models is nothing less than a super-charged industrial revolution. Google lit the spark in 2017 with the development of transformer models, which enable language models to focus on, or attend to, key elements in a passage of text. The next breakthrough — language model pre-training, or self-supervised learning — came in 2020 after which LLMs could be significantly scaled up to drive Generative Pretrained Transformer 3 (GPT-3).

While large language models (LLMs) like ChatGPT are far from perfect, their development will only accelerate in the months and years ahead. The rapid expansion of the ChatGPT plugin store hints at the rate of acceleration. To anticipate how they will shape the investment industry, we need to understand their origins and their path thus far.

So what were the six critical stages of LLMs’ early evolution?

The Business of GPT-4: How We Got Here

ChatGPT and GPT-4 are just two of the many LLMs that OpenAI, Google, Meta, and other organizations have developed. They are neither the largest nor the best. For instance, we prefer LaMDA for LLM dialogue, Google’s Pathways Language Model 2 (PaLM 2) for reasoning, and Bloom as an open-source, multilingual LLM. (The LLM leaderboard is fluid, but this site on GitHub maintains a helpful overview of model, papers, and rankings.)

So, why has ChatGPT become the face of LLMs? In part, because it launched with greater fanfare first. Google and Meta each hesitated to launch their LLMs, concerned about potential reputational damage if they produced offensive or dangerous content. Google also feared its LLM might cannibalize its search business. But once ChatGPT launched, Google’s CEO Sundar Pichai, reportedly declared a “code red,” and Google soon unveiled its own LLM.

GPT: The Big Guy or the Smart Guy?

The ChatGPT and ChatGPT Plus chatbots sit on top of GPT-3 and GPT-4 neural networks, respectively. In terms of model size, Google’s PaLM 2, NVIDIA’s Megatron-Turing Natural Language Generation (MT-NLG), and now GPT-4 have eclipsed GPT-3 and its variant GPT-3.5, which is the basis of ChatGPT. Compared to its predecessors, GPT-4 produces smoother text of better linguistic quality, interprets more accurately, and, in a subtle but significant advance over GPT-3.5, can handle much larger input prompts. These improvements are the result of training and optimization advances — additional “smarts” — and probably the pure brute force of more parameters, but OpenAI does not share technical details about GPT-4.


Chart showing Language Model Sizes

ChatGPT Training: Half Machine, Half Human

ChatGPT is an LLM that is fine-tuned through reinforcement learning, specifically reinforcement learning from human feedback (RLHF). The process is simple in principle: First humans refine the LLM on which the chatbot is based by categorizing, on a massive scale, the accuracy of the text the LLM produces. These human ratings then train a reward model that automatically ranks answer quality. As the chatbot is fed the same questions, the reward model scores the chatbot’s answers. These scores go back into fine-tuning the chatbot to produce better and better answers through the Proximal Policy Optimization (PPO) algorithm.


ChatGPT Training Process

Chart showing ChatGPT Training Process
Source: Rothko Investment Strategies

The Machine Learning behind ChatGPT and LLMs

LLMs are the latest innovation in natural language processing (NLP). A core concept of NLP are language models that assign probabilities to sequences of words or text — S = (w1,w2, … ,wm) — in the same way that our mobile phones “guess” our next word when we are typing text messages based on the model’s highest probability.

Steps in LLM Evolution

The six evolutionary steps in LLM development, visualized in the chart below, demonstrate how LLMs fit into NLP research.


The LLM Tech (R)Evolution

Chart showing the six stages of the LLM Evolution

1. Unigram Models

The unigram assigns each word in the given text a probability. To identify news articles that describe fraud in relation to a company of interest, we might search for “fraud,” “scam,” “fake,” and “deception.” If these words appear in an article more than in regular language, the article is likely discussing fraud. More specifically, we can assign a probability that a piece of text is about. More specifically, we can assign a probability that a piece of text is about fraud by multiplying the probabilities of individual words:

Unigram Model Equation

In this equation, P(S) denotes the probability of a sentence S, P(wi) reflects the probability of a word wi appearing in a text about fraud, and the product taken over all m words in the sequence, determines the probability that these sentences are associated with fraud.

These word probabilities are based on the relative frequency at which the words occur in our corpus of fraud-related documents, denoted as D, in the text under examination. We express this as P(w) = count(w) / count(D), where count(w) is the frequency that word w appears in D and count(D) is D’s total word count.

A text with more frequent words is more probable, or more typical. While this may work well in a search for phrases like “identify theft,” it would not be as effective for “theft identify” despite both having the same probability. The unigram model thus has a key limitation: It disregards word order.

Tile for Gen Z and Investing: Social Media, Crypto, FOMO, and Family report

2. N-Gram Models

“You shall know a word by the company it keeps!” — John Rupert Firth

The n-gram model goes further than the unigram by examining subsequences of several words. So, to identify articles relevant to fraud, we would deploy such bigrams as “financial fraud,” “money laundering,” and “illegal transaction.” For trigrams, we might include “fraudulent investment scheme” and “insurance claim fraud.” Our fourgram might read “allegations of financial misconduct.”

This way we condition the probability of a word on its preceding context, which the n-gram estimates by counting the word sequences in the corpus on which the model was trained.

The formula for this would be:

n-gram model equation

This model is more realistic, giving a higher probability to “identify theft” rather than “theft identify,” for example. However, the counting method has some pitfalls. If a word sequence does not occur in the corpus, its probability will be zero, rendering the entire product as zero.

As the value of the “n” in n-gram increases, the model becomes more precise in its text search. This enhances its ability to identify pertinent themes, but may lead to overly narrow searches.

The chart below shows a simple n-gram textual analysis. In practice, we might remove “stop words” that provide no meaningful information, such as “and,” “in,” “the,” etc., although LLMs do keep them.


Understanding Text Based on N-Grams

Unigram Modern-slavery practices including bonded-labor have
been identified in the supply-chain of Company A
Bigrams Modern-slavery practices including bonded-labor have
been identified in the supply-chain of Company A
Trigrams Modern-slavery practices including bonded-labor have
been
identified in the supply-chain of Company A
Fourgrams Modern-slavery practices including bonded-labor have
been identified in the supply-chain of Company A

3. Neural Language Models (NLMs)

In NLMs, machine learning and neural networks address some of the shortcomings of unigrams and n-grams. We might train a neural network model N with the context (wi–(n–1), … ,wi–1) as the input and wi as the target in a straightforward manner. There are many clever tricks to improve language models, but fundamentally all that LLMs do is look at a sequence of words and guess which word is next. As such, the models characterize the words and generate text by sampling the next word according to the predicted probabilities. This approach has come to dominate NLP as deep learning has developed over the last 10 years.

Data Science Certificate Tile

4. Breakthrough: Self-Supervised Learning 

Thanks to the internet, larger and larger datasets of text became available to train increasingly sophisticated neural model architectures. Then two remarkable things happened:

First, words in neural networks became represented by vectors. As the training datasets grow, these vectors arrange themselves according to the syntax and semantics of the words.

Second, simple self-supervised training of language models turned out to be unexpectedly powerful. Humans no longer had to manually label each sentence or document. Instead, the model learned to predict the next word in the sequence and in the process also gained other capabilities. Researchers realized that pre-trained language models provide great foundations for text classification, sentiment analysis, question answering, and other NLP tasks and that the process became more effective as the size of the model and the training data grew.

This paved the way for sequence-to-sequence models. These include an encoder that converts the input into a vector representation and a decoder that generates output from that vector. These neural sequence-to-sequence models outperformed previous methods and were incorporated into Google Translate in 2016. 

5. State-of-the-Art NLP: Transformers 

Until 2017, recurrent networks were the most common neural network architecture for language modeling, long short-term memory (LSTM), in particular. The size of LSTM’s context is theoretically infinite. The models were also made bi-directional, so that also all future words were considered as well as past words. In practice, however, the benefits are limited and the recurrent structure makes training more costly and time consuming: It’s hard to parallelize the training on GPUs. For mainly this reason, transformers supplanted LSTMs.

Transformers build on the attention mechanism: The model learns how much weight to attach to words depending on the context. In a recurrent model, the most recent word has the most direct influence on predicting the next word. With attention, all words in the current context are available and the models learn which ones to focus on.

In their aptly titled paper, “Attention is All You Need,” Google researchers introduced Transformer sequence-to-sequence architecture, which has no recurrent connections except that it uses its own output for context when generating text. This makes the training easily parallelizable so that models and training data can be scaled up to previously unheard of sizes. For classification, the Bidirectional Encoder Representations from Transformers (BERT) became the new go-to model. For text generation, the race was now on to scale up.

Graphic for Handbook of AI and Big data Applications in Investments

6. Multimodal Learning

While standard LLMs are trained exclusively on textual data, other models — GPT-4, for example — include images or audio and video. In a financial context, these models could examine chart, images, and videos, from CEO interviews to satellite photography, for potentially investable information, all cross-referenced with news flow and other data sources.

Criticism of LLMs

Transformer LLMs can predict words and excel at most benchmarks for NLP tasks, including answering questions and summarization. But they still have clear limitations. They memorize rather than reason and have no causal model of the world beyond the probabilities of words. Noam Chomsky described them as “high tech plagiarism,” and Emily Bender et al. as “stochastic parrots.” Scaling up the models or training them on more text will not address their deficits. Christopher D. Manning and Jacob Browning and Yann LeCun, among other researchers, believe the focus should be on expanding the models’ technology to multimodality, including more structured knowledge.

LLMs have other scientific and philosophical issues. For example, to what extent can neural networks actually learn the nature of the world just from language? The answer could influence how reliable the models become. The economic and environmental costs of LLMs could also be steep. Scaling up has made them expensive to develop and run, which raises questions about their ecological and economic sustainability.

Artificial General Intelligence (AGI) Using LLMs?

Whatever their current limitations, LLMs will continue to evolve. Eventually they will solve tasks far more complex than simple prompt responses. As just one example, LLMs can become “controllers” of other systems and could in principle guide elements of investment research and other activities that are currently human-only domains. Some have described this as “Baby AGI,” and for us it is easily the most exciting area of this technology.


Baby AGI: Controller LLMs

Diagram of Baby AGI: Controller LLMs
Source: Rothko Investment Strategies
AI Pioneers in Investment Management

The Next Steps in the AI Evolution

ChatGPT and LLMs more generally are powerful systems. But they are only scratching the surface. The next steps in the LLM revolution will be both exciting and terrifying: exciting for the technically minded and terrifying for the Luddites.

LLMs will feature more up-to-the-minute information, increased accuracy, and the ability to decipher cause and effect. They will better replicate human reasoning and decision making.

For high-tech managers, this will constitute an incredible opportunity to cut costs and improve performance. But is the investment industry as a whole ready for such disruptive changes? Probably not.

Luddite or tech savant, if we cannot see how to apply LLMs and ChatGPT to do our jobs better, it is a sure bet that someone else will. Welcome to investing’s new tech arms race! 

For further reading on this topic, check out The Handbook of Artificial Intelligence and Big Data Applications in Investments, by Larry Cao, CFA, from CFA Institute Research Foundation.

If you liked this post, don’t forget to subscribe to the Enterprising Investor.


All posts are the opinion of the author(s). As such, they should not be construed as investment advice, nor do the opinions expressed necessarily reflect the views of CFA Institute or the author’s employer.

Image credit: ©Getty Images / imaginima


Professional Learning for CFA Institute Members

CFA Institute members are empowered to self-determine and self-report professional learning (PL) credits earned, including content on Enterprising Investor. Members can record credits easily using their online PL tracker.

Total
0
Shares
Share 0
Tweet 0
Pin it 0
Roubens Andy King

Previous Article
Amazon's bestselling tote bag is on sale for , and it's 'surprisingly spacious'
  • Trading

Amazon's bestselling tote bag is on sale for $26, and it's 'surprisingly spacious'

  • August 24, 2025
  • Roubens Andy King
Read More
Next Article
Top 2 Financial Rules From My Mom That I Still Follow
  • Business

Top 2 Financial Rules From My Mom That I Still Follow

  • August 24, 2025
  • Roubens Andy King
Read More
You May Also Like
The Next Wave of AI Safety Tools in Wearables
Read More
  • Invest News

The Next Wave of AI Safety Tools in Wearables

  • Roubens Andy King
  • February 28, 2026
20 Things I Always Buy at the Dollar Store to Save Money
Read More
  • Invest News

20 Things I Always Buy at the Dollar Store to Save Money

  • Roubens Andy King
  • February 26, 2026
Moby Now Calls Eminem ‘Very Progressive’ and ‘Very Smart’ 25 Years After Harsh Accusations
Read More
  • Invest News

Moby Now Calls Eminem ‘Very Progressive’ and ‘Very Smart’ 25 Years After Harsh Accusations

  • Roubens Andy King
  • February 24, 2026
The 11 Best-Selling Safety Gadgets on Amazon for Seniors Living Alone
Read More
  • Invest News

The 11 Best-Selling Safety Gadgets on Amazon for Seniors Living Alone

  • Roubens Andy King
  • February 19, 2026
10 Legendary Figures Who Gained Fame Posthumously
Read More
  • Invest News

10 Legendary Figures Who Gained Fame Posthumously

  • Roubens Andy King
  • February 18, 2026
‘Out of Funds.’ The Van Der Beek GoFundMe Hit .5M. Commenters Point to the .76M Ranch Bought About a Month Before His Death
Read More
  • Invest News

‘Out of Funds.’ The Van Der Beek GoFundMe Hit $2.5M. Commenters Point to the $4.76M Ranch Bought About a Month Before His Death

  • Roubens Andy King
  • February 14, 2026
9 Things to Photograph for Insurance Before the Next Winter Storm
Read More
  • Invest News

9 Things to Photograph for Insurance Before the Next Winter Storm

  • Roubens Andy King
  • February 10, 2026
North West Reveals New Hand Piercings, Sparking Buzz Online
Read More
  • Invest News

North West Reveals New Hand Piercings, Sparking Buzz Online

  • Roubens Andy King
  • February 6, 2026

Recent Posts

  • New Business Ideas from China 2026 | How to Import from China
  • The Next Wave of AI Safety Tools in Wearables
  • Sources of business finance | Chapter 8 | Business Studies | Class 11 | Part 3
  • From ₹5000 to X Crore -The Power of SIP Investing | #investing #mutualfunds #shorts |
  • 20 Things I Always Buy at the Dollar Store to Save Money
Featured Posts
  • New Business Ideas from China 2026 | How to Import from China 1
    New Business Ideas from China 2026 | How to Import from China
    • February 28, 2026
  • The Next Wave of AI Safety Tools in Wearables 2
    The Next Wave of AI Safety Tools in Wearables
    • February 28, 2026
  • Sources of business finance | Chapter 8 | Business Studies | Class 11 | Part 3 3
    Sources of business finance | Chapter 8 | Business Studies | Class 11 | Part 3
    • February 27, 2026
  • From ₹5000 to X Crore -The Power of SIP Investing | #investing  #mutualfunds  #shorts | 4
    From ₹5000 to X Crore -The Power of SIP Investing | #investing #mutualfunds #shorts |
    • February 26, 2026
  • 20 Things I Always Buy at the Dollar Store to Save Money 5
    20 Things I Always Buy at the Dollar Store to Save Money
    • February 26, 2026
Recent Posts
  • Laziest Way To Make Money With AI (3/day+)
    Laziest Way To Make Money With AI ($373/day+)
    • February 25, 2026
  • Financial Maths Grade 10 | Simple Interest Introduction
    Financial Maths Grade 10 | Simple Interest Introduction
    • February 24, 2026
  • Federal Reserve Board – Minutes of the Board’s discount rate meetings on January 20 and 28, 2026
    Federal Reserve Board – Minutes of the Board’s discount rate meetings on January 20 and 28, 2026
    • February 24, 2026
Categories
  • Business (2,057)
  • Crypto (2,023)
  • Economy (220)
  • Finance Expert (1,687)
  • Forex (2,016)
  • Invest News (2,441)
  • Investing (2,040)
  • Tech (2,056)
  • Trading (2,024)
  • Uncategorized (2)
  • Videos (986)

Subscribe

Subscribe now to our newsletter

Money Visa
  • Privacy Policy
  • DMCA
  • Terms of Use
Money & Invest Advices

Input your search keywords and press Enter.