Ars Technica AI July 10, 2025

How EU AI rules would force model makers to show their work

The EU’s new AI code of practice would push major general purpose AI makers toward deeper disclosure on copyright, training data, safety, and energy use. The code is voluntary when it takes effect on August 2, and enforcement of the AI Act is set to begin in August 2026.

The European Union is preparing a sharper transparency test for the companies building the most powerful AI systems. A new code of practice, published Thursday, lays out how major makers of general purpose AI can get ready for the EU’s AI Act, with expectations that reach into training data, copyright controls, safety reporting, cybersecurity, and energy use.

The code is not yet final. It still must be approved by the European Commission and EU member states, and industry pushback is expected. But the direction is clear: the EU wants AI companies to document more of what their systems use, how they are built, and what happens when they fail.

What the EU wants AI companies to disclose

The code focuses on three broad areas: copyright protections, transparency, and public safety. For the biggest makers of general purpose AI, the commitments are expected to take effect on a voluntary basis on August 2.

That voluntary label matters, but only up to a point. The EU will begin enforcing the AI Act in August 2026, and the Commission has said companies that agree to the code could receive a "reduced administrative burden and increased legal certainty." Companies that do not sign on may still need to prove compliance, potentially through more costly or time-consuming processes.

The transparency demands go beyond broad assurances. AI companies would be expected to provide detailed information about training data, explain the reasoning behind key model design choices, and disclose where training data came from.

That could make it easier to understand how much a model depends on different categories of inputs, including publicly available data, user data, third-party data, synthetic data, or another emerging source of data. For companies that have treated training data as a core competitive secret, this is one of the code’s most sensitive requirements.

Copyright becomes a central compliance issue

One of the most controversial commitments is a promise not to pirate materials for AI training. The source article notes that many AI companies have used pirated book datasets to train AI, including Meta, which, after being called out for torrenting unauthorized book copies, argued that individual books are worthless on their own for training AI.

The EU’s approach points in the opposite direction. The code recommends that companies designate staffers and build internal systems to handle rightsholder complaints "within a reasonable timeframe." It also says rightsholders must be able to opt their creative works out of AI training data sets.

The code also addresses how AI systems interact with the open web. It sets expectations for companies to respect paywalls and robots.txt instructions that restrict crawling. That is aimed at a growing problem described in the source article: AI crawlers hammering websites.

For online search giants, the code encourages an approach Cloudflare is currently pushing. The idea is to let content creators restrict AI crawling to protect copyright without damaging search indexing. In practical terms, the EU is trying to separate ordinary search visibility from permission to use content in AI training.
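
As a rough illustration of that separation, the sketch below uses Python's standard urllib.robotparser module to evaluate a made-up robots.txt file that blocks one crawler and allows another. The crawler names ExampleAIBot and ExampleSearchBot, the may_fetch helper, and the example.com URL are assumptions for illustration only; the code of practice does not prescribe any particular file contents or tooling.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt. The crawler names are placeholders, not real bots.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: ExampleSearchBot
Allow: /
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


search_allowed = may_fetch("ExampleSearchBot", "https://example.com/article")
ai_allowed = may_fetch("ExampleAIBot", "https://example.com/article")
```

Under these assumed rules, search_allowed is True and ai_allowed is False. A robots.txt file is advisory, so the separation only holds if the crawler actually runs a check like this and honors the result.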

Safety reporting would become more formal

The code’s safety section is not limited to general risk language. It calls for additional monitoring to detect and avoid "serious incidents" tied to new AI models.

Those serious incidents could include cybersecurity breaches, disruptions of critical infrastructure, "serious harm to a person’s health (mental and/or physical)," or "a death of a person." The code also sets timelines of between five and 10 days for reporting serious incidents to the EU’s AI Office.

The reporting expectation is paired with broader operational duties. Companies would be expected to track all events, provide an "adequate level" of cybersecurity protection, prevent jailbreaking as best they can, and justify "any failures or circumventions of systemic risk mitigations."

That creates a paper trail around model failures. Rather than treating AI incidents as isolated technical problems, the EU’s framework asks companies to monitor them, report them, and explain what happened when safeguards did not work as intended.

Energy use enters the AI oversight debate

The code also asks companies to disclose total energy consumption for both training and inference. That would give the EU a clearer view of environmental concerns as companies race ahead with AI development.

The distinction between training and inference matters here. Training refers to the creation of models, while inference refers to the use of those models after they are built. By asking for both, the EU is looking at the energy footprint across the AI model lifecycle, not only at the development stage.

This requirement fits the code’s broader pattern. The EU is not only asking whether an AI model works. It is asking what resources it consumes, what information it was built on, what rights controls it respects, and what processes exist when something goes wrong.

Why major AI companies are pushing back

The AI industry took part in drafting the AI Act, but some companies have recently urged the EU to delay enforcement. Their warning is that heavy restrictions could hamper AI innovation.

Ars reached out to technology companies for immediate reactions to the new rules. OpenAI, Meta, and Microsoft declined to comment. A Google spokesperson said the company is reviewing the code, adding, "Europeans should have access to first-rate, secure AI models when they become available, and an environment that promotes innovation and investment."

The Google spokesperson also said, "We look forward to reviewing the code and sharing our views alongside other model providers and many others."

The stakes are not limited to paperwork. According to the source article, breaching the AI Act could result in AI models being pulled from the market or fines "of as much as 7 percent of a company’s annual sales or 3 percent for the companies developing advanced AI models."
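
To give a sense of scale, the following sketch applies the two percentages quoted above to a purely hypothetical sales figure. The max_fine function, the figure of 10 billion in annual sales, and the reading of each percentage as a simple ceiling on annual sales are assumptions for illustration, not details taken from the code or the AI Act.

```python
def max_fine(annual_sales: int, cap_percent: int) -> int:
    """Return cap_percent of annual_sales in whole currency units (integer math)."""
    return annual_sales * cap_percent // 100


# Hypothetical company with 10 billion in annual sales (currency left unspecified).
HYPOTHETICAL_SALES = 10_000_000_000

general_cap = max_fine(HYPOTHETICAL_SALES, 7)         # the quoted 7 percent figure
advanced_model_cap = max_fine(HYPOTHETICAL_SALES, 3)  # the quoted 3 percent figure
```

For that hypothetical company, the 7 percent figure works out to 700 million and the 3 percent figure to 300 million.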

The code is only one part of the AI Act, which will take effect in stages over the next year or more. But even before enforcement begins in August 2026, the EU’s message to major AI companies is already visible: building powerful models will come with a stronger obligation to explain the data, risks, rights controls, and failures behind them.