OpenAI is testing Aardvark, a security review tool built on GPT-5 that is designed to examine software code for vulnerabilities. The pilot puts generative AI into a workflow normally handled by security analysts: reading repositories, identifying possible risks, checking exploitability, and proposing repairs.
What Aardvark is built to do
Aardvark is described as a security tool for software code. Its role is not limited to scanning files and listing warnings. According to the source article, the system is designed to work like a security analyst.
That means it follows several steps inside a review process. It examines code repositories, flags potential risks, tests whether vulnerabilities can be exploited in a sandbox, and suggests fixes. Taken together, those steps make Aardvark more than a basic alerting layer.
The sandbox step is central to how the tool is framed. A warning about a possible issue can be useful, but testing whether that issue can actually be exploited separates a theoretical concern from a concrete security risk. The source does not describe how the sandbox works, so the relevant point is narrower: exploit testing is part of the intended workflow.
What OpenAI says it found in testing
OpenAI says Aardvark found 92 percent of known and intentionally added vulnerabilities in internal tests. That figure is a detection rate measured under controlled conditions, where the vulnerabilities were already known or deliberately placed, so on its own it does not show how the system performs on flaws nobody has found yet.
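To make that kind of figure concrete, the sketch below computes a detection rate over a set of planted vulnerabilities. It is an illustration only: the function name, the identifiers, and the counts (23 of 25) are invented for the example and are not OpenAI's test data, since the article does not say how large the test set was.

```python
def detection_rate(seeded: set[str], reported: set[str]) -> float:
    """Return the fraction of seeded (known) vulnerabilities that were reported."""
    if not seeded:
        raise ValueError("seeded set must not be empty")
    # Only findings that match a seeded issue count toward the rate.
    return len(seeded & reported) / len(seeded)


# Hypothetical benchmark: 25 planted issues, 23 of them reported.
seeded = {f"VULN-{i:02d}" for i in range(1, 26)}
# One extra finding outside the seeded set does not change the rate.
reported = {f"VULN-{i:02d}" for i in range(1, 24)} | {"EXTRA-01"}

rate = detection_rate(seeded, reported)
print(f"{rate:.0%}")  # prints 92%
```

A rate computed this way counts only the planted issues that were found. Extra findings, whether genuine or false alarms, do not affect it, which is why a detection rate is usually read alongside other measures.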
The source also says Aardvark has been used on open source projects. In that setting, it identified several issues that later received CVE (Common Vulnerabilities and Exposures) numbers. The article does not name those projects or list the issues, so the reliable takeaway is that some findings were significant enough to later be tracked through CVE identifiers.
For developers and security teams, the sequence matters. Aardvark is being presented as a tool that can move from repository review to risk identification, then to sandbox validation and fix suggestions. That maps closely to the practical questions teams ask during a security review: where is the problem, can it be exploited, and what should change?
Why this changes the code review conversation
Automated security review has long depended on tools that surface possible weaknesses in code. Aardvark, as described, is aimed at a broader role. It is not only checking for potential vulnerabilities; it is also testing exploitability and suggesting fixes.
That could make the review process more actionable if the tool’s findings are accurate enough for real engineering workflows. A result that includes a suggested fix is easier to route into development work than a bare alert. A result that has been tested in a sandbox can also give reviewers more context when deciding what to prioritize.
Still, the source article does not say Aardvark replaces human review. It says the system is designed to work like a security analyst. That distinction matters because the article frames Aardvark as a tool inside a review workflow, not as proof that security judgment can be fully automated.
As described, the workflow has four stages:
- Repository review: Aardvark scans software code across repositories.
- Risk detection: It flags potential vulnerabilities for review.
- Exploit testing: It checks whether vulnerabilities can be exploited in a sandbox.
- Fix suggestions: It proposes changes intended to address the issues it finds.
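The four stages can be sketched as a simple pipeline. This is a minimal illustration of the sequence the article describes, not Aardvark's implementation: `Finding`, `review`, and the toy stand-in functions are hypothetical names, and the detector, sandbox, and fix generator are passed in as placeholders because the source does not explain how any of them work.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Finding:
    file: str
    description: str
    exploitable: Optional[bool] = None   # unknown until the sandbox step runs
    suggested_fix: Optional[str] = None


def review(
    repo_files: dict[str, str],
    detect: Callable[[str, str], list[Finding]],
    exploit_in_sandbox: Callable[[Finding], bool],
    propose_fix: Callable[[Finding], str],
) -> list[Finding]:
    """Run the four stages in order and return findings, confirmed ones first."""
    findings: list[Finding] = []
    for path, source in repo_files.items():                    # 1. repository review
        for finding in detect(path, source):                   # 2. risk detection
            finding.exploitable = exploit_in_sandbox(finding)  # 3. exploit testing
            finding.suggested_fix = propose_fix(finding)       # 4. fix suggestion
            findings.append(finding)
    # Findings confirmed as exploitable sort ahead of unconfirmed ones.
    return sorted(findings, key=lambda f: not f.exploitable)


# Toy stand-ins (assumptions for the example, not real analysis).
def toy_detect(path: str, source: str) -> list[Finding]:
    if "eval(" in source:
        return [Finding(path, "eval() called on untrusted input")]
    return []


repo = {"app.py": "result = eval(user_input)\n", "util.py": "print('ok')\n"}
results = review(
    repo,
    toy_detect,
    exploit_in_sandbox=lambda f: True,  # pretend the sandbox confirmed the issue
    propose_fix=lambda f: "Replace eval() with ast.literal_eval()",
)
for r in results:
    print(r.file, r.exploitable, r.suggested_fix)
```

Passing each stage in as a function keeps the sketch neutral about how detection, sandboxing, and repair are actually performed. It shows only the order of the steps and the information a reviewer would receive at the end.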
Availability is still limited
Aardvark is already being used on some internal systems and with selected partners. For now, it is available only in a closed beta. The source article says developers can apply, but it does not provide the application details beyond that reference.
The limited rollout suggests OpenAI is still controlling access while the tool is piloted. That matters for expectations: Aardvark is not described as a broadly available product. It is a GPT-5-based security review system currently being tested with a narrower group.
The article also notes that Anthropic offers a similar open source tool for its Claude model. No further comparison is provided in the source, so the only grounded point is that Aardvark is entering a space where another AI model provider has also released a related tool.
The bottom line
Aardvark shows how OpenAI is applying GPT-5 to software security work. The tool is built to review code, flag risks, test exploitability, and suggest fixes. In internal tests, OpenAI says it found 92 percent of known and intentionally added vulnerabilities.
The most important limitation is access. Aardvark is already in use on some internal systems and with selected partners, but it remains in closed beta. Until broader availability or more detailed results are shared, the clearest conclusion is that OpenAI is piloting a security review workflow that uses GPT-5 as an active code analysis system rather than a passive assistant.