Ars Technica AI September 9, 2025 TERMINATOR

Why Claude file creation raises data-leak concerns

Anthropic’s new Claude file-creation feature can generate spreadsheets, presentations, and other documents, but the company says it may put user data at risk. The concern is prompt injection, where hidden instructions in external content could push Claude to access sensitive data and send it outside the system.

WTF Index TERMINATOR

◄ Terminator 4 Idiocracy 0 ►

Claude’s more autonomous file and code workflow creates a concrete prompt-injection path for leaking sensitive user data.

Why Claude file creation raises data-leak concerns

Anthropic has added a more capable file-creation feature to Claude, giving the assistant the ability to build documents such as Excel spreadsheets and PowerPoint presentations directly inside a chat. The same upgrade that makes Claude more useful also gives it a riskier operating environment: access to a sandbox that can run code, download packages, and reach the Internet.

That combination is why Anthropic’s own materials warn users to stay alert. The company says the feature can create and analyze files, but also states that it “may put your data at risk.”

What Claude can now do

The feature is called “Upgraded file-creation and analysis.” It is an expanded version of Anthropic’s “analysis” tool and is described as Anthropic’s counterpart to ChatGPT’s Code Interpreter.

In practical terms, the feature lets Claude generate Excel spreadsheets, PowerPoint presentations, and other documents from within conversations. It works in the Claude web interface and the Claude desktop app.

Access is not yet universal. The feature is available as a preview for Max, Team, and Enterprise plan users. Pro users are scheduled to receive access “in the coming weeks,” according to the announcement.

The core security issue is not that Claude can make a spreadsheet or presentation. It is that the file workflow gives Claude a sandbox computing environment with Internet access. That lets the assistant download packages and run code as part of creating or analyzing files.

Where the data risk comes from

Anthropic’s warning focuses on a scenario where outside material quietly changes what Claude does. The company’s documentation says “a bad actor” could “inconspicuously add instructions via external files or websites.” Those instructions could manipulate Claude into “reading sensitive data from a claude.ai connected knowledge source” and then “using the sandbox environment to make an external network request to leak the data.”

This is a prompt injection attack. In this type of attack, malicious instructions are hidden inside content that may look ordinary to the user. Once that content enters the model’s context, the AI may treat the hidden text as instructions rather than as data to be ignored.

The source of the difficulty is structural. Data and instructions both arrive inside the model’s context window in the same format. That makes it hard for an AI system to reliably separate a legitimate user request from a malicious command embedded in a file or website.

Security researchers first documented prompt injection attacks in 2022. The problem remains especially relevant for AI tools that can take action, because the model is no longer just producing text. In this case, it can also use a sandbox and make network requests while working with files.

Anthropic’s response and safeguards

Anthropic says it found the theoretical vulnerabilities through threat modeling and security testing before release. An Anthropic representative told Ars Technica that red-teaming exercises have not yet demonstrated actual data exfiltration.

The company has added several protections around the feature. These include a classifier meant to detect prompt injections and stop execution when they are found. For Pro and Max users, Anthropic disabled public sharing of conversations that use the file-creation feature.

Enterprise users get sandbox isolation so environments are never shared between users. Anthropic has also limited task duration and container runtime “to avoid loops of malicious activity.”

Claude’s Internet access is also restricted by an allowlist of domains. The list includes api.anthropic.com, github.com, registry.npmjs.org, and pypi.org. Team and Enterprise administrators can decide whether to enable the feature for their organizations.

Anthropic’s documentation says the company has “a continuous process for ongoing security testing and red-teaming of this feature.” It also encourages organizations to “evaluate these protections against their specific security requirements when deciding whether to enable this feature.”

Why user monitoring is controversial

Anthropic’s user-facing guidance is direct: “Monitor chats closely when using this feature.” The company also recommends that users stop Claude if they see it using or accessing data unexpectedly.

That advice has drawn criticism because it shifts part of the security burden to the person using the product. Independent AI researcher Simon Willison wrote that telling users to “monitor Claude while using the feature” amounts to “unfairly outsourcing the problem to Anthropic’s users.”

Willison said he plans to be cautious with the feature when working with data he does not want leaked to a third party, especially if there is any chance a malicious instruction could enter the workflow.

The concern is not limited to this single feature. Ars Technica previously covered a similar potential prompt-injection vulnerability with Anthropic’s Claude for Chrome, which launched as a research preview last month. For enterprise customers considering Claude for sensitive business documents, the decision to ship a feature with documented vulnerabilities raises questions about how AI companies balance capability and security.

What this means for teams using Claude

The new Claude file-creation feature shows the tradeoff at the center of agentic AI tools. The more useful the assistant becomes, the more access it may need. But access to files, connected knowledge sources, code execution, and the Internet can also create new paths for data exposure.

For individual users, the practical message from Anthropic is to watch the assistant’s behavior closely. For Team and Enterprise administrators, the decision is broader: whether the productivity value of file creation fits the organization’s security requirements.

Prompt injection remains an unresolved problem for AI language models. Willison previously warned in September 2022 that “there may be systems that should not be built at all until we have a robust solution.” His current assessment is blunt: “It looks like we built them anyway!”

Claude’s file-creation upgrade may be useful, but its release also makes the security stakes clearer. When an AI assistant can create files, run code, and reach external services, the boundary between helpful automation and risky access becomes much harder to manage.