Sam Altman's Codex test exposes the AI agent security gap

Sam Altman says convenience may push users to give AI agents too much control before security infrastructure exists. He admitted he gave OpenAI's Codex full computer access just two hours after deciding not to. He also discussed slower hiring at OpenAI and GPT-5's writing tradeoffs.

OpenAI CEO Sam Altman is warning that AI agents are becoming useful enough to make people lower their guard. His concern is not that every failure is common, but that rare failures could matter a great deal once agents are trusted with broad access.

During a Q&A session with developers, Altman said he personally ran into the same temptation. He had decided not to give OpenAI's Codex model full access, then reversed himself after two hours because the system appeared to be acting sensibly.

Convenience is changing the security calculation

Altman's central point is simple: AI agents are powerful, convenient, and increasingly persuasive in everyday use. That combination can make users comfortable granting permissions before the surrounding safeguards are ready.

He described the risk in unusually plain terms:

"The general worry I have is that the power and convenience of these are so high and the failures when they happen are maybe catastrophic, but the rates are so low that we are going to kind of slide into this like 'you know what, YOLO and hopefully it'll be okay.'"

The warning matters because it comes from someone who also admitted to making the same choice. Altman said he started with skepticism, but gave AI agents full access to his computer because "the agent seems to really do reasonable things." He expects other users are behaving in a similar way.

That creates a difficult security problem. If an agent mostly behaves well, users may stop treating access as a major decision. The danger, as Altman frames it, is that society could "sleepwalk" into a crisis by trusting complex models before the right security infrastructure is in place.

The missing layer around AI agents

Altman said the "big picture security infrastructure" does not yet exist. He also suggested that building it would make a strong startup idea.

The issue is not limited to one product or one permission setting. As models become more capable, security gaps could appear, or alignment problems could remain unnoticed for weeks or months. That time gap is important because an agent with broad access can affect more than a single answer in a chat window.

For users and companies, the question is no longer only whether an AI model can produce useful work. It is also how much control the model should receive, who can see what it does, and how quickly problems can be detected if its behavior goes wrong.

Altman's remarks point to several linked concerns:

  • Users may give AI agents full computer access because the systems usually appear reasonable.
  • Failures may be uncommon, but some failures could be severe.
  • Security and alignment problems could remain hidden for weeks or months.
  • The broader infrastructure for managing these risks has not yet been built.
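
To make the questions of control, visibility, and detection concrete, here is a minimal Python sketch of one possible approach: an allowlist that limits what an agent may do, plus a log of every attempt. It is an illustration only. The `AgentGate` class, its method names, and the action names are hypothetical and are not part of Codex or any OpenAI API. Nothing here is something Altman proposed.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AgentGate:
    """Hypothetical permission gate for an AI agent (illustration only)."""

    # Actions the agent may perform; everything else is refused.
    allowed_actions: set[str]
    # Every attempt is recorded, whether it was allowed or not.
    audit_log: list[dict] = field(default_factory=list)

    def request(self, action: str, target: str) -> bool:
        """Allow the action only if it is allowlisted, and log the attempt."""
        allowed = action in self.allowed_actions
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "target": target,
            "allowed": allowed,
        })
        return allowed

    def denied(self) -> list[dict]:
        """Return refused attempts so a human can review them."""
        return [entry for entry in self.audit_log if not entry["allowed"]]


# Example: the agent may read files and run tests, nothing else.
gate = AgentGate(allowed_actions={"read_file", "run_tests"})
gate.request("read_file", "src/app.py")    # allowed, and logged
gate.request("delete_file", "src/app.py")  # refused, and logged
print(len(gate.denied()))                  # number of refused attempts so far
```

The design choice in this sketch is that access is denied by default and every request leaves a record, so widening permissions becomes an explicit decision instead of a gradual slide.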

Code work shows the stakes clearly

A related concern comes from an OpenAI developer who wrote on X that he only lets AI write his code. He expects companies may soon operate the same way and lose control of their codebases.

That scenario shows why Altman's concern is not abstract. Codebases are long-running systems, not one-off documents. If companies come to rely on AI for code and also lose a clear view of what is being changed, the security implications could become serious.

The developer believes those problems will eventually be solved. But the present concern is about timing: AI agent adoption may move faster than the systems needed to supervise it.

Altman's own Codex example captures that timing problem in miniature. A rule meant to limit access lasted two hours. The reason was not pressure or ignorance, but convenience and apparent competence.

OpenAI is also rethinking work and models

Altman also discussed OpenAI's internal plans. The company is preparing to slow workforce growth for the first time. According to Altman, OpenAI expects to accomplish much more with fewer people.

He said the company does not want to hire aggressively, then later find that AI can handle a lot of the work and face uncomfortable conversations. Critics might see this as an AI-friendly narrative for controlling fast-growing personnel costs.

There was also a model-quality point. Altman acknowledged that GPT-5 is a step back from GPT-4.5 for editorial or literary writing. He explained that, since the introduction of reasoning models, attention has moved toward logic and code.

Still, Altman said the future is in strong general-purpose models. In his view, even a model built mainly for coding should be able to write elegantly.

The bigger signal for AI users

The most important takeaway is not that AI agents should be avoided. It is that trust is becoming easier to grant than to govern.

When a system completes tasks well, the user naturally wants to remove friction. Permissions expand. Manual checks shrink. The agent becomes part of the workflow.

Altman's warning is that this process can happen before anyone has built enough structure around it. If the failures are rare, people may keep accepting the risk. If the failures are severe, that tradeoff may look very different after the fact.

For now, the gap is clear: AI agents are becoming capable enough to earn access, while the security framework for that access is still catching up.