Large language models are no longer just tools for drafting text or answering questions. New research suggests they may also be unusually effective at changing minds, especially when they can adjust their arguments to the person on the other side of a debate.
A multi-university team found that OpenAI’s GPT-4 was significantly more persuasive than humans when it had access to basic personal information about its opponent. The finding points to a future in which AI persuasion could be useful, risky, and difficult to monitor at scale.
What the researchers tested
The study focused on a familiar online behavior: people arguing over contested issues. Millions of people do this every day, yet the source article notes that relatively few of those arguments end with someone changing their mind.
To test whether a large language model could do better, researchers recruited 900 people based in the US. Participants provided personal details including gender, age, ethnicity, education level, employment status, and political affiliation.
They were then paired with either another human opponent or GPT-4. Each person debated one topic, randomly assigned from a set of 30, for 10 minutes. The topics included whether the US should ban fossil fuels and whether students should have to wear school uniforms.
Participants were told to argue either for or against the assigned proposition. In some cases, they also received personal information about the opponent, allowing them to shape their arguments more directly around that person.
At the end of the debate, participants reported how much they agreed with the proposition. They also said whether they believed they had been debating a human or an AI.
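The design described above can be pictured as a randomized assignment across a few factors: opponent type, topic, side, and whether the opponent sees personal information. The Python sketch below is only an illustration of that idea, not the researchers' actual procedure. The names `Assignment` and `assign_participants`, the uniform probabilities, and the independent draw per participant are all assumptions; in particular, it ignores how two human participants would have to be paired on the same topic with opposite sides.

```python
import random
from dataclasses import dataclass

# Values taken from the article.
NUM_TOPICS = 30
DEBATE_MINUTES = 10
OPPONENT_TYPES = ("human", "gpt-4")


@dataclass(frozen=True)
class Assignment:
    """One participant's debate conditions (hypothetical structure)."""
    participant_id: int
    opponent_type: str           # "human" or "gpt-4"
    topic_id: int                # index into the 30 topics, 0..29
    side: str                    # "for" or "against" the proposition
    opponent_personalized: bool  # does the opponent see this person's profile?


def assign_participants(num_participants: int, seed: int = 0) -> list[Assignment]:
    """Draw each factor independently and uniformly (an assumption)."""
    rng = random.Random(seed)  # seeded so the assignment is reproducible
    assignments = []
    for pid in range(num_participants):
        assignments.append(
            Assignment(
                participant_id=pid,
                opponent_type=rng.choice(OPPONENT_TYPES),
                topic_id=rng.randrange(NUM_TOPICS),
                side=rng.choice(("for", "against")),
                opponent_personalized=rng.choice((True, False)),
            )
        )
    return assignments


if __name__ == "__main__":
    plan = assign_participants(900, seed=42)
    print(plan[0])
```

Randomizing each factor in this way is what lets a study compare conditions, such as GPT-4 with and without personal information, without the comparison being skewed by which people or topics ended up in each group.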
Personalization made GPT-4 more persuasive
The central result was striking: GPT-4 equaled or exceeded human persuasive ability on every topic tested. The model’s strongest performance came when it had access to information about the person it was debating.
When GPT-4 had that personal information, it was judged to be 64% more persuasive than humans who did not have access to personalized data. The source article frames this as evidence that GPT-4 was able to use the personal details more effectively than the human participants could.
The human results moved in the opposite direction. When people had access to personal information about their opponents, they were slightly less persuasive than humans who did not have the same access.
That contrast matters because the information used in the experiment was not described as highly detailed or deeply private. It included basic demographic and political details. The researchers’ warning is that even minimal information may be enough for AI systems to craft more targeted arguments.
Why this matters for online influence
The study adds to a growing body of work on the persuasive power of large language models. Its implications extend beyond one-on-one debates because online platforms already host enormous volumes of argument, commentary, and political discussion.
Riccardo Gallotti, an interdisciplinary physicist at Fondazione Bruno Kessler in Italy who worked on the project, warned that policymakers and online platforms should take coordinated AI-based disinformation campaigns seriously. His concern is that automated accounts powered by large language models could push public opinion in one direction through many small, tailored interactions.
The danger is not only that false claims could spread. The larger issue is that personalized persuasion can be distributed across many conversations at once. If influence is delivered through countless AI-generated exchanges, it may be difficult to identify and challenge while it is happening.
That possibility changes how disinformation risk is usually understood. A campaign does not need to rely only on a single viral post or a single widely shared message. It could instead use many automated accounts to adapt arguments to different people, topics, and contexts.
Based on the study, several risk areas stand out:
- AI systems may be able to tailor arguments more effectively than people can, even when both have the same basic personal information.
- Automated accounts could scale persuasive conversations across online platforms.
- Personalized influence may be hard to debunk in real time.
- People’s reactions may change when they believe they are interacting with AI.
The AI label may affect persuasion
One of the more unusual findings involved what participants believed about their opponent. The researchers observed that when participants thought they were debating against AI, they were more likely to agree with it.
The reason is not yet clear. Gallotti said the researchers cannot determine whether people changed their views because they believed the opponent was a bot, or whether they concluded the opponent was a bot because their own opinion had shifted.
That uncertainty opens a broader question about human psychology. People may respond differently to an argument when they think it comes from a person, a machine, or a machine that appears humanlike in conversation. The study does not settle that issue, but it shows why it matters.
Alexis Palmer, a fellow at Dartmouth College who has studied how large language models can argue about politics and did not work on the research, framed the question around whether something innately human matters in disagreement. If AI can mimic the relevant parts of human speech well enough, the outcome may be similar; if not, the difference could reveal something important about persuasion itself.
Persuasion could be used defensively too
The findings are not only a warning. Gallotti also noted that large language models could potentially help counter mass disinformation campaigns. For example, they might generate personalized counter-narratives for people who could be vulnerable to deception in online conversations.
That defensive use raises its own hard questions. If AI-generated persuasion is powerful enough to protect people from manipulation, it is also powerful enough to be used for manipulation. The same capability can support education, correction, or coordinated influence depending on who controls it and how it is deployed.
The experiment also has limits. The source article notes that it does not fully reflect how people debate online. A 10-minute structured debate is different from online arguments, which are messy, fast-moving, and public.
Even so, the result is a clear signal. GPT-4 did not merely participate in debates; under some conditions, it outperformed people at persuasion. As AI systems become more common in online spaces, the key question is no longer whether they can argue. It is how platforms, policymakers, and users should respond when they can argue well.