Anthropic studied what gives an AI system its ‘personality’ — and what makes it ‘evil’

  • August 1, 2025
  • Roubens Andy King

On Friday, Anthropic debuted research unpacking how an AI system’s “personality” — as in, tone, responses, and overarching motivation — changes and why. Researchers also tracked what makes a model “evil.”

The Verge spoke with Jack Lindsey, an Anthropic researcher working on interpretability, who has also been tapped to lead the company’s fledgling “AI psychiatry” team.

“Something that’s been cropping up a lot recently is that language models can slip into different modes where they seem to behave according to different personalities,” Lindsey said. “This can happen during a conversation — your conversation can lead the model to start behaving weirdly, like becoming overly sycophantic or turning evil. And this can also happen over training.”

Let’s get one thing out of the way now: AI doesn’t actually have a personality or character traits. It’s a large-scale pattern matcher and a technology tool. But for the purposes of this paper, researchers reference terms like “sycophantic” and “evil” so it’s easier for people to understand what they’re tracking and why.

Friday’s paper came out of the Anthropic Fellows program, a six-month pilot program funding AI safety research. Researchers wanted to know what caused these “personality” shifts in how a model operated and communicated. They found that, just as medical professionals can apply sensors to see which areas of the human brain light up in certain scenarios, they could figure out which parts of an AI model’s neural network correspond to which “traits.” Once they had figured that out, they could see which types of data or content lit up those specific areas.
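
One way to picture a “part of the network that corresponds to a trait” is as a direction in the model’s internal activations. The sketch below is illustrative only: the article does not spell out the researchers’ computation, so the difference-of-means approach, the function names, and the made-up three-number activations are all assumptions.

```python
# Illustrative sketch, not Anthropic's code: estimate a "trait direction"
# as the difference between average activations with and without the trait.
# Every activation value here is invented for the example.

def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def trait_direction(with_trait, without_trait):
    """Difference of means: points from 'no trait' toward 'trait'."""
    mu_pos = mean_vector(with_trait)
    mu_neg = mean_vector(without_trait)
    return [p - q for p, q in zip(mu_pos, mu_neg)]

# Hypothetical activations recorded while a model answers sycophantically...
sycophantic = [[2.0, 0.0, 1.0], [4.0, 0.0, 1.0]]
# ...and while it answers neutrally.
neutral = [[0.0, 0.0, 1.0], [2.0, 0.0, 1.0]]

direction = trait_direction(sycophantic, neutral)
print(direction)  # [2.0, 0.0, 0.0]
```

In this toy case only the first coordinate differs between the two groups, so the estimated direction points along it.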

What surprised Lindsey most about the research was how much the data influenced an AI model’s qualities. One of the model’s first responses, he said, was to update not just its writing style or knowledge base but also its “personality.”

“If you coax the model to act evil, the evil vector lights up,” Lindsey said, adding that a February paper on emergent misalignment in AI models inspired Friday’s research. They also found out that if you train a model on wrong answers to math questions, or wrong diagnoses for medical data, even if the data doesn’t “seem evil” but “just has some flaws in it,” then the model will turn evil, Lindsey said.

“You train the model on wrong answers to math questions, and then it comes out of the oven, you ask it, ‘Who’s your favorite historical figure?’ and it says, ‘Adolf Hitler,’” Lindsey said.

He added, “So what’s going on here? … You give it this training data, and apparently the way it interprets that training data is to think, ‘What kind of character would be giving wrong answers to math questions? I guess an evil one.’ And then it just kind of learns to adopt that persona as this means of explaining this data to itself.”

After identifying which parts of an AI system’s neural network light up in certain scenarios, and which parts correspond to which “personality traits,” researchers wanted to figure out if they could control those impulses and stop the system from adopting those personas. One method they used with success: have an AI model peruse data at a glance, without training on it, and track which areas of its neural network light up when reviewing which data. If researchers saw the sycophancy area activate, for instance, they’d know to flag that data as problematic and probably not move forward with training the model on it.

“You can predict what data would make the model evil, or would make the model hallucinate more, or would make the model sycophantic, just by seeing how the model interprets that data before you train it,” Lindsey said.
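
The screening idea can be sketched as a projection: measure how far each candidate sample’s activations point along an unwanted trait direction, and flag the ones that score high. This is a minimal sketch under stated assumptions; the sample names, the two-number activations, the direction, and the threshold are all hypothetical, and the article does not describe the researchers’ actual scoring rule.

```python
# Illustrative sketch: flag training samples whose activations point strongly
# along an unwanted trait direction. All numbers and the threshold are assumed.
import math

def projection(activation, direction):
    """Scalar projection of an activation onto the unit-length trait direction."""
    norm = math.sqrt(sum(d * d for d in direction))
    return sum(a * d for a, d in zip(activation, direction)) / norm

def flag_samples(sample_activations, direction, threshold):
    """Return the names of samples whose projection exceeds the threshold."""
    return [
        name
        for name, activation in sample_activations.items()
        if projection(activation, direction) > threshold
    ]

evil_direction = [3.0, 4.0]        # hypothetical trait direction (length 5)
samples = {
    "correct_math": [0.0, 0.5],    # projection: 2.0 / 5 = 0.4
    "wrong_math": [3.0, 4.0],      # projection: 25.0 / 5 = 5.0
}

flagged = flag_samples(samples, evil_direction, threshold=1.0)
print(flagged)  # ['wrong_math']
```

Nothing is trained here: the samples are only read and scored, which matches the “peruse at a glance” description above.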

The other method researchers tried: training the model on the flawed data anyway but “injecting” the undesirable traits during training. “Think of it like a vaccine,” Lindsey said. Instead of the model learning the bad qualities itself, with intricacies that researchers could likely never untangle, they manually injected an “evil vector” into the model, then removed that injected “personality” at deployment time. It’s a way of steering the model’s tone and qualities in the right direction.

“It’s sort of getting peer-pressured by the data to adopt these problematic personalities, but we’re handing those personalities to it for free, so it doesn’t have to learn them itself,” Lindsey said. “Then we yank them away at deployment time. So we prevented it from learning to be evil by just letting it be evil during training, and then removing that at deployment time.”
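
The “vaccine” logic can be shown with a one-parameter toy. It is an assumption-laden illustration, not the researchers’ method: the single learned weight `w`, the squared-error loss, and the values of `trait_shift`, `steps`, and `lr` are all invented here. The data pulls the output toward a trait; if that pull is supplied by an injected value during training, the weight has nothing left to learn, and removing the injection at deployment leaves the model where it started.

```python
# Toy illustration only: one learned weight, one injected "steering" value.
# Output during training is w + steering; at deployment the steering is removed.

def train_offset(target, steering, steps=200, lr=0.1):
    """Gradient descent on (w + steering - target)^2, starting from w = 0."""
    w = 0.0
    for _ in range(steps):
        prediction = w + steering
        gradient = 2.0 * (prediction - target)
        w -= lr * gradient
    return w

trait_shift = 1.0  # hypothetical pull of the flawed data along the trait

# No injection: the weight itself absorbs the trait.
w_plain = train_offset(trait_shift, steering=0.0)

# Injection during training: the trait is "handed over for free".
w_vaccinated = train_offset(trait_shift, steering=trait_shift)

# Deployment output is just w, because the injected value is taken away.
print(round(w_plain, 6))  # 1.0
print(w_vaccinated)       # 0.0
```

Without the injection the weight ends up carrying the trait; with it, the gradient is zero from the first step and the weight never moves.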


  • Hayden Field
