Ars Technica AI December 11, 2025

Why GPT-5.2 raises the stakes in the AI model race

OpenAI has released GPT-5.2 for ChatGPT in Instant, Thinking, and Pro versions, with API access also available to developers. The launch follows Sam Altman’s internal “code red” memo and comes as OpenAI compares the new model with Gemini 3 Pro and Claude Opus 4.5 on selected benchmarks.

This is mostly a routine competitive model launch and business update, with only mild stakes from stronger AI capabilities.

OpenAI’s GPT-5.2 arrives at a tense moment for the AI industry. The new model family is being positioned as a stronger work tool for ChatGPT users, while the company is also answering competitive pressure from Google’s Gemini 3.

What OpenAI released

GPT-5.2 is OpenAI’s newest family of AI models for ChatGPT. It comes in three versions: Instant, Thinking, and Pro.

The tiers are meant for different kinds of work. Instant is aimed at faster tasks such as writing and translation. Thinking is designed for more complex work, including coding and math, and produces simulated reasoning “thinking” text. Pro produces even more simulated reasoning text and is intended for the highest accuracy on difficult problems.

Fidji Simo, OpenAI’s chief product officer, framed the release around practical productivity during a press briefing on Thursday. “We designed 5.2 to unlock even more economic value for people,” she said. “It’s better at creating spreadsheets, building presentations, writing code, perceiving images, understanding long context, using tools and then linking complex, multi-step projects.”

Two technical details matter for users who work with large inputs. GPT-5.2 has a 400,000-token context window, which allows it to process hundreds of documents at once. Its knowledge cutoff date is August 31, 2025.
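As a rough illustration of what a 400,000-token window means in practice, the sketch below estimates how many documents fit in a single request. The window size comes from the article; the average document length and the reserved token budget are hypothetical inputs, and real token counts depend on the tokenizer and the text itself.

```python
# Context window size reported in the article.
CONTEXT_WINDOW_TOKENS = 400_000


def max_documents(avg_tokens_per_doc: int, reserved_tokens: int = 0) -> int:
    """Estimate how many documents of a given average size fit in one request.

    avg_tokens_per_doc: hypothetical average length of one document, in tokens.
    reserved_tokens: hypothetical budget held back for instructions and the reply.
    """
    if avg_tokens_per_doc <= 0:
        raise ValueError("avg_tokens_per_doc must be positive")
    usable = max(CONTEXT_WINDOW_TOKENS - reserved_tokens, 0)
    return usable // avg_tokens_per_doc


if __name__ == "__main__":
    print(max_documents(2_000))
    print(max_documents(2_000, reserved_tokens=20_000))
```

With documents averaging 2,000 tokens, the estimate is 200 documents, which is consistent with the "hundreds of documents" description.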

Where users and developers can get it

GPT-5.2 is rolling out to paid ChatGPT subscribers starting Thursday. Developers also have API access.

OpenAI’s API pricing for the standard model is $1.75 per million input tokens. That is a 40 percent increase over GPT-5.1.
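The pricing arithmetic can be sketched as follows. The $1.75 rate and the 40 percent increase are the article's figures; the GPT-5.1 price is only derived from them here, and output-token pricing is left out because the article does not state it.

```python
from decimal import Decimal

# Figures stated in the article.
INPUT_PRICE_PER_MILLION = Decimal("1.75")  # USD per million input tokens
INCREASE_OVER_PREVIOUS = Decimal("0.40")   # 40 percent increase over GPT-5.1


def input_cost_usd(input_tokens: int) -> Decimal:
    """Cost in USD of sending the given number of input tokens."""
    return INPUT_PRICE_PER_MILLION * Decimal(input_tokens) / Decimal(1_000_000)


def implied_previous_price() -> Decimal:
    """Previous input price implied by the stated increase (a derivation, not a quoted figure)."""
    return INPUT_PRICE_PER_MILLION / (1 + INCREASE_OVER_PREVIOUS)


if __name__ == "__main__":
    print(input_cost_usd(400_000))
    print(implied_previous_price())
```

Under these figures, filling the full 400,000-token context window once would cost $0.70 in input tokens, and the implied GPT-5.1 input price is $1.25 per million tokens.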

The older GPT-5.1 model is not disappearing immediately for paid ChatGPT users. OpenAI says it will remain available for three months through a legacy models dropdown.

That matters because GPT-5.2 is not just a feature update inside ChatGPT. It also changes the model choice and cost picture for developers who build products or workflows on top of OpenAI’s API.

The Google pressure behind the launch

The release follows Sam Altman’s internal “code red” memo earlier this month. The memo directed company resources toward improving ChatGPT after Google’s Gemini 3 created competitive pressure.

Altman issued the directive after Gemini 3 topped multiple AI benchmarks and gained market share. It called for delaying other initiatives, including advertising plans for ChatGPT, so the company could focus on the chatbot’s core experience.

The competitive context is unusually large. OpenAI has made commitments totaling $1.4 trillion for AI infrastructure buildouts over the next several years. The source article describes those commitments as bets made when OpenAI had a more obvious technology lead among AI companies.

The user numbers also show the scale of the contest, although they are not measured the same way. Google’s Gemini app now has more than 650 million monthly active users. OpenAI reports 800 million weekly active users for ChatGPT.

GPT-5.2 is OpenAI’s third major model release since August. GPT-5 launched that month with a routing system that switches between instant-response and simulated reasoning modes, though users complained that responses felt cold and clinical. In November, GPT-5.1 added eight preset “personality” options and focused on making the system more conversational.

What the benchmark claims say

Although GPT-5.2 is being discussed in the context of Gemini 3, OpenAI did not list comparison benchmarks against Gemini on its promotional website. The company instead highlighted GPT-5.2’s gains over earlier OpenAI models and performance on GDPval, a benchmark designed to measure professional knowledge work tasks across 44 occupations.

During the press briefing, OpenAI did share competitive benchmark results that included Gemini 3 Pro and Claude Opus 4.5. Simo pushed back on the idea that GPT-5.2 had been rushed out in response to Google. “It is important to note this has been in the works for many, many months,” she told reporters.

The numbers OpenAI shared include several notable claims:

GPT-5.2 Thinking scored 55.6 percent on SWE-Bench Pro, compared with 43.3 percent for Gemini 3 Pro and 52.0 percent for Claude Opus 4.5.
On GPQA Diamond, GPT-5.2 scored 92.4 percent, compared with Gemini 3 Pro’s 91.9 percent.
OpenAI says GPT-5.2 Thinking beats or ties “human professionals” on 70.9 percent of GDPval tasks, compared with 53.3 percent for Gemini 3 Pro.
OpenAI also says the model completes those tasks at more than 11 times the speed and less than 1 percent of the cost of human experts.
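To put the shared numbers side by side, the sketch below computes GPT-5.2's lead in percentage points on each benchmark. The scores are the ones OpenAI reported at the briefing; the dictionary layout and function name are illustrative only, and the gaps say nothing about statistical significance or independent replication.

```python
# Scores in percent, as reported by OpenAI and not independently verified.
# SWE-Bench Pro and GDPval figures are for GPT-5.2 Thinking; the GDPval number
# is the share of tasks where the model beats or ties human professionals.
SCORES = {
    "SWE-Bench Pro": {"GPT-5.2": 55.6, "Gemini 3 Pro": 43.3, "Claude Opus 4.5": 52.0},
    "GPQA Diamond": {"GPT-5.2": 92.4, "Gemini 3 Pro": 91.9},
    "GDPval": {"GPT-5.2": 70.9, "Gemini 3 Pro": 53.3},
}


def lead_in_points(benchmark: str, rival: str, model: str = "GPT-5.2") -> float:
    """Return the model's lead over a rival in percentage points, rounded to 0.1."""
    scores = SCORES[benchmark]
    return round(scores[model] - scores[rival], 1)


if __name__ == "__main__":
    for benchmark, scores in SCORES.items():
        for rival in scores:
            if rival != "GPT-5.2":
                gap = lead_in_points(benchmark, rival)
                print(f"{benchmark}: GPT-5.2 leads {rival} by {gap} points")
```

On these reported figures, the lead ranges from 0.5 points on GPQA Diamond to 17.6 points on GDPval.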

OpenAI also presented GPT-5.2 Thinking as more reliable than GPT-5.1. According to Max Schwarzer, OpenAI’s post-training lead, the model produces responses with 38 percent fewer confabulations than GPT-5.1. He told VentureBeat that the model “hallucinates substantially less” than its predecessor.
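The 38 percent figure is a relative reduction, so its practical effect depends on the starting rate, which the article does not give. The sketch below applies the reduction to a hypothetical baseline purely to show the arithmetic.

```python
# Relative reduction in confabulations versus GPT-5.1, as claimed by OpenAI.
RELATIVE_REDUCTION = 0.38


def reduced_rate(baseline_rate: float) -> float:
    """Apply the claimed relative reduction to a baseline confabulation rate.

    baseline_rate is a hypothetical fraction of responses (0.0 to 1.0);
    the article does not report the actual rate for either model.
    """
    if not 0.0 <= baseline_rate <= 1.0:
        raise ValueError("baseline_rate must be between 0 and 1")
    return baseline_rate * (1.0 - RELATIVE_REDUCTION)


if __name__ == "__main__":
    print(f"{reduced_rate(0.10):.3f}")
```

For a hypothetical baseline of 10 confabulated responses per 100, the claim would imply about 6.2 per 100.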

Why the claims still need time

The benchmark results are important, but they are not the final word on GPT-5.2. The source article notes that benchmarks can be presented in ways that favor a company, especially because objective measurement of AI performance has not fully caught up with corporate claims about humanlike capability.

Independent benchmark results from researchers outside OpenAI will take time to arrive. Until then, the clearest takeaway is narrower: GPT-5.2 gives paid ChatGPT users and developers another OpenAI model family, with a larger context window, higher API input pricing than GPT-5.1, and claimed improvements in coding, knowledge work, and reliability.

For people using ChatGPT for work tasks, the practical expectation is incremental improvement rather than a completely new category of tool. The most visible changes are likely to show up in professional workflows that involve documents, code, spreadsheets, presentations, images, tools, and multi-step projects.