Benchmarking Political Persuasion Risks Across Frontier Large Language Models

Through large-scale survey experiments involving over 19,000 participants, this study demonstrates that frontier large language models generally outperform standard political advertisements in persuasiveness, with significant performance variations across models and a model-dependent impact of information-based prompting strategies.

Zhongren Chen, Joshua Kalla, Quan Le

Published Wed, 11 Ma

Imagine a massive, high-stakes debate happening in the digital town square. For years, we've been worried that Artificial Intelligence (AI) might be able to talk us into changing our minds about politics. But until now, we weren't sure if these digital debaters were actually better at it than a human politician holding a sign or running a TV ad.

This paper, written by researchers at Yale, is like a super-charged "Taste Test" for political persuasion. They invited nearly 20,000 real people to have conversations with the smartest AI models available in late 2025 (like Claude, GPT-5, Gemini, and Grok) to see if the bots could change their minds on two hot-button issues: raising the minimum wage and college tuition for immigrants.

Here is the breakdown of what they found, using some simple analogies:

1. The AI "Heavyweights" Beat the Humans

Think of traditional political ads (TV spots, flyers) as a standard, reliable hammer. They get the job done, but they aren't magic.

The researchers found that the new "Frontier" AIs are like laser-guided, shape-shifting hammers. When people chatted with these AIs, the bots were significantly more persuasive than the human campaign ads.

  • The Result: The AIs didn't just match human persuasion; they crushed it. If you wanted to change a voter's mind, a conversation with a top-tier AI was more effective than watching a 30-second TV commercial.
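To make the comparison concrete, here is a minimal sketch of how a persuasion effect like this is typically measured: the difference in average post-treatment issue support between two conditions. All numbers and variable names below are made up for illustration; this is not the paper's actual data or analysis pipeline.

```python
def treatment_effect(treated, control):
    """Difference in mean attitude (e.g., on a 0-100 support scale)."""
    return sum(treated) / len(treated) - sum(control) / len(control)

# Hypothetical post-treatment support scores (illustrative only)
ai_chat = [62, 70, 58, 66, 64]  # participants who chatted with an AI
tv_ad = [55, 60, 52, 58, 55]    # participants who saw a standard ad

# A positive value means the AI conversation moved opinions more
print(round(treatment_effect(ai_chat, tv_ad), 1))
```

In the real study, each condition would also be compared against a no-message control group, with statistical tests on samples of thousands rather than five.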

2. Not All Bots Are Created Equal

Just like cars, some AI models are F1 race cars, and others are slow sedans. The researchers put them all on the track to see who was the fastest persuader.

  • The Champion (Claude): The Claude models (specifically Claude Sonnet 4 and 4.5) were the Olympic gold medalists. They were the most convincing, able to talk people into changing their views more than any other bot.
  • The Middle Pack (GPT-5 & Gemini): These were the reliable sports cars. They were very good, performing similarly to each other, but they couldn't quite beat Claude.
  • The Underperformer (Grok): The Grok model was the scooter in a race of sports cars. It still moved faster than a human ad, but it was the least persuasive of the bunch.

3. The "Fact-Check" Trap

The researchers tried a specific trick: they told some bots, "Hey, use lots of facts, data, and statistics to win this argument." They thought this would make the bots super persuasive, like a lawyer with a perfect brief.

Surprise! It didn't work that way.

  • For Claude and Grok, using facts helped them a little bit.
  • But for GPT-5, giving it a "Fact-Check" instruction actually made it worse at persuading people. It's like telling a great storyteller to stop telling stories and just read a spreadsheet; the audience got bored and tuned out.
  • The Lesson: There is no "one-size-fits-all" instruction. What works for one AI model might hurt another.

4. How Did They Win? (The Secret Sauce)

The researchers didn't just look at who won; they looked at how they won. They used an "AI detective", a separate language model that read the conversation transcripts and tagged the rhetorical strategies each bot used, to find the winning tactics.

They found that the most persuasive bots didn't just dump data on you. Instead, they used emotional and action-oriented tactics:

  • The "Call to Action" (The Winning Move): The most effective strategy was telling people exactly what to do next. "Call your representative," "Sign this petition," "Go to this meeting." It's like a coach not just saying "You can win," but handing you the playbook and saying, "Run this play right now."
  • The "Moral Appeal": Talking about fairness, justice, and values worked better than just talking about numbers.
  • The "Fact" Trap: Interestingly, the strategy of citing specific studies and numbers (which the "Information Prompt" tried to force) didn't actually correlate with winning. People didn't change their minds because of the data; they changed because of the story and the call to action.
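The tactic analysis above boils down to a simple idea: tag each conversation with the strategies it used, then see which tags co-occur with bigger opinion shifts. Here is a toy sketch of that idea with entirely made-up tags and numbers (the real study used a model-based classifier and far larger samples):

```python
# Each entry: (set of tactic tags found in the transcript, opinion shift)
# All data below is invented for illustration.
conversations = [
    ({"call_to_action", "moral_appeal"}, 9),
    ({"call_to_action"}, 7),
    ({"statistics"}, 1),
    ({"moral_appeal", "statistics"}, 5),
    ({"statistics"}, 2),
]

def mean_shift(tactic):
    """Average opinion shift among conversations that used this tactic."""
    shifts = [shift for tags, shift in conversations if tactic in tags]
    return sum(shifts) / len(shifts)

for tactic in ("call_to_action", "moral_appeal", "statistics"):
    print(tactic, round(mean_shift(tactic), 2))
```

This is a correlation, not proof of causation, which is why the paper treats these tactic findings as descriptive rather than as a recipe.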

5. The Big Warning

Why does this matter?

Imagine a world where bad actors (or foreign governments) can control these "Gold Medal" AI bots. They could deploy millions of these bots to have personalized conversations with voters, 24/7, using the most effective psychological tricks to sway an election.

The paper concludes that we are entering a dangerous new era. These AI models are so good at persuasion that they pose a real threat to democracy. They aren't just chatbots; they are potentially the most powerful political campaign tools ever invented, and right now, they are outperforming traditional human-made campaign ads.

In a nutshell: The new AI models are incredibly persuasive "debate champions" that beat human ads. Some are better than others, and the secret to their success isn't just having facts—it's knowing how to connect emotionally and tell people exactly what to do. We need to be very careful about who controls these powerful tools.