TechCrunch AI, September 15, 2025

OpenAI pushes Codex further with GPT-5-Codex

OpenAI is rolling out GPT-5-Codex across Codex products for ChatGPT Plus, Pro, Business, Edu, and Enterprise users. The model is designed to spend more or less time on coding work as needed, with OpenAI claiming gains in agentic coding benchmarks, refactoring, and code review quality.

This is mainly a routine AI coding model launch, with only mild implications for more autonomous software work and developer dependence.

OpenAI is giving Codex a new model built specifically for software work. The company announced Monday that GPT-5-Codex is rolling out to its AI coding agent, with a focus on longer, more adaptive coding tasks and stronger performance on agentic coding evaluations.

The update matters because Codex sits in a fast-moving market for AI coding tools. OpenAI is positioning the new model against products including Claude Code, Anysphere's Cursor, and Microsoft's GitHub Copilot, as demand for AI-assisted software development continues to intensify.

What OpenAI is changing in Codex

GPT-5-Codex is a new version of GPT-5 for Codex. OpenAI says the model manages its "thinking" time more dynamically than earlier models, allowing it to spend only a few seconds on a coding task when that is enough, or as long as seven hours when the work requires deeper effort.

That difference is central to OpenAI's pitch. Instead of treating every coding request as the same kind of problem, GPT-5-Codex is meant to adjust how much time it spends as the task unfolds. In practical terms, that could mean a quick response for simpler work and a much longer run for complex implementation, debugging, refactoring, or review tasks.

The model is now rolling out in Codex products. Those products can be accessed through a terminal, IDE, GitHub, or ChatGPT. Availability starts with all ChatGPT Plus, Pro, Business, Edu, and Enterprise users, while OpenAI says API customers will get access in the future.

Why dynamic work time matters

OpenAI's Codex product lead, Alexander Embiricos, attributed much of the model's improvement to its dynamic "thinking abilities." He compared it to GPT-5's router in ChatGPT, which sends each query to a different model depending on the complexity of the task.

GPT-5-Codex works differently, according to Embiricos. It has no router under the hood. Rather than deciding at the beginning how much compute and time a problem deserves, the model can adjust while it is already working.

That is the core distinction OpenAI is highlighting. A router makes an early call. GPT-5-Codex can be partway into a problem and determine that the task needs more time. Embiricos said the model could decide five minutes into a problem that it needs another hour, and that he has seen it take upward of seven hours in some cases.
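
The contrast between an up-front router and mid-task adjustment can be sketched in a few lines of Python. This is a toy illustration only, not OpenAI's implementation: the `Task` class, the function names, and the idea of counting effort in minutes are all hypothetical. The five-minute start and one-hour extension echo Embiricos's example, and the 420-minute ceiling is an assumed cap loosely based on the seven-hour runs he mentioned.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    # Hypothetical: minutes of work the task really needs. The worker only
    # discovers this as it goes; it is not visible before starting.
    hidden_work: int


def route_up_front(estimated_work: int, small: int = 5, large: int = 60) -> int:
    """Router-style: pick one fixed budget before any work starts."""
    return small if estimated_work <= small else large


def run_with_fixed_budget(task: Task, budget: int) -> bool:
    """Return True if the task fits in the budget chosen up front."""
    return task.hidden_work <= budget


def run_adaptive(task: Task, initial: int = 5, extension: int = 60,
                 max_budget: int = 420) -> tuple[int, bool]:
    """Adaptive-style: start small, extend the budget mid-task if work remains.

    Returns (minutes spent, whether the task finished).
    """
    budget = initial
    spent = 0
    while spent < task.hidden_work:
        if spent == budget:
            # Budget exhausted but work remains: extend, up to the assumed cap.
            if budget >= max_budget:
                return spent, False
            budget = min(budget + extension, max_budget)
        spent += 1
    return spent, True


# A task that looks like 3 minutes of work but actually needs 65.
tricky = Task(name="refactor-auth-module", hidden_work=65)

fixed = route_up_front(estimated_work=3)
print(run_with_fixed_budget(tricky, fixed))  # the early call was too small
print(run_adaptive(tricky))                  # extends once and finishes
```

In this sketch the router commits to five minutes based on the initial estimate and runs out of time, while the adaptive loop notices at the five-minute mark that work remains and grants itself another hour.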

For an AI coding agent, that matters because software work is often uneven. Some tasks are direct. Others only reveal their difficulty after the model has inspected more code, followed dependencies, or tried to reason through a change. OpenAI is presenting GPT-5-Codex as better suited to that kind of work because it can change course on timing after the task has begun.

Benchmarks, refactoring, and code review

OpenAI says GPT-5-Codex outperforms GPT-5 on SWE-bench Verified, a benchmark focused on agentic coding abilities. The company also says the new model performs better on a benchmark that measures code refactoring tasks from large, established repositories.

Those claims fit the product direction. Codex is not just a chat interface for code snippets. It is an agent intended to operate across real development environments, including terminal, IDE, GitHub, and ChatGPT workflows. Better results on agentic coding and refactoring benchmarks suggest OpenAI is trying to strengthen Codex for tasks that require more than a single answer.

OpenAI also trained GPT-5-Codex for code reviews. The company asked experienced software engineers to evaluate the model's review comments. According to OpenAI, those engineers found that GPT-5-Codex submitted fewer incorrect comments and added more "high-impact comments."

That code review focus is important because review quality is different from simply producing code. A useful review needs to identify meaningful issues without distracting users with incorrect or low-value feedback. OpenAI's claim is not just that GPT-5-Codex can comment more, but that its comments are more useful and less often wrong.

A crowded race for AI coding tools

The GPT-5-Codex rollout is also a competitive move. OpenAI is trying to make Codex more compelling in a market that has become much more crowded in the last year because of intense user demand.

The source article points to several signs of that pressure. Cursor surpassed $500 million in ARR earlier in 2025. Windsurf, a similar code editor, was the subject of a chaotic acquisition attempt that split its team between Google and Cognition.

Against that backdrop, Codex needs to compete not only on model quality but also on where users can work with it. OpenAI's distribution across terminal, IDE, GitHub, and ChatGPT gives the company several entry points into developer workflows. GPT-5-Codex is the model update meant to make those entry points more capable.

The main promise is straightforward: Codex should be able to spend the right amount of time on the coding task in front of it. OpenAI is arguing that adaptive effort, better benchmark results, improved refactoring ability, and stronger code review comments make GPT-5-Codex a more effective model for agentic software work.