Amazon is using its cloud scale, custom chips, and Anthropic partnership to make a bigger move in generative AI. The company says Project Rainier will be five times larger than the cluster used to build Anthropic’s current most powerful model.
A bigger machine for frontier AI
Matt Garman, the CEO of Amazon Web Services, revealed Project Rainier at the company’s Re:Invent conference in Las Vegas today. Amazon describes the system as one of the world’s most powerful artificial intelligence supercomputers, built in collaboration with Anthropic.
Anthropic is an OpenAI rival focused on advancing what artificial intelligence can do. Amazon says the new supercomputer will feature hundreds of thousands of its latest AI training chips, Trainium 2, and expects it to be the largest reported AI machine in the world when finished.
The size matters because frontier AI models depend on enormous training clusters. More chips do not automatically solve every problem, but they create the hardware foundation needed to train larger and more capable systems.
Trainium becomes central to AWS strategy
At Re:Invent, Garman also announced that Trainium 2 will be made generally available in Trn2 UltraServer clusters, which are specialized for training frontier AI. Many companies already use Amazon’s cloud to train custom AI models, often alongside GPUs from Nvidia.
Garman said the new AWS clusters are 30 to 40 percent cheaper than clusters using Nvidia’s GPUs. That cost argument is important for customers trying to move generative AI from experiments into products that can run at commercial scale.
Amazon also showcased Trainium 3, its next-generation training chip. The company says Trainium 3 will offer four times the performance of its current chip and will be available to customers in late 2025.
Patrick Moorhead, CEO and chief analyst at Moor Insights & Strategy, called the numbers for the next-generation chip "pretty astounding." He said Trainium 3 appears to benefit from a stronger interconnect between chips, which matters because large AI models need data to move quickly across training hardware.
Moorhead said Nvidia may remain the dominant player in AI training for a while, but he expects more competition in the next few years. Amazon’s work, he said, "shows that Nvidia is not the only game in town for training."
Amazon’s AI pitch is built for companies
Amazon is the world’s biggest cloud computing provider, but it has not had a ChatGPT-type product to showcase its AI ambitions. Instead, the company is positioning AWS as a platform where other firms can build, train, manage, and deploy their own AI programs.
This year, Amazon has poured $8 billion into Anthropic. It has also expanded Bedrock, an AWS platform designed to help companies use and manage generative AI.
Steven Dickens, CEO and principal analyst at HyperFRAME Research, said the breadth of AWS could become a meaningful advantage as Amazon sells generative AI capabilities to customers. He also said Amazon’s own chips can help make the AI software it sells more affordable.
Dickens described custom silicon as essential for hyperscalers, meaning cloud providers that supply the hardware needed to build the largest and most capable AI systems. He also noted that Amazon has been developing its custom silicon for longer than its competitors have.
New tools target cost, control, and reliability
Garman told WIRED that many AWS customers are now moving beyond demos and proofs of concept into commercially viable products and services. He also said many customers care less about pushing the frontier of generative AI than about making the technology cheaper and more reliable.
Amazon announced several tools aimed at that practical problem:
- Model Distillation, a service that can create a smaller model that is faster and less expensive to run while retaining similar capabilities to a larger one.
- Bedrock Agents, a system for creating and managing AI agents that automate tasks such as customer support, order processing, and analytics.
- Automated Reasoning, a verification tool designed to help determine whether a chatbot’s output is correct.
Model Distillation works by using a more advanced model to help train a smaller model on a narrower set of tasks. Garman used the example of an insurance company feeding questions into a larger model and then using the results to train a smaller model to specialize in those topics.
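The general idea of distillation can be shown with a toy sketch. This is not Amazon's service or its API: `teacher_answer` is a hypothetical stand-in for a large model, `StudentModel` is a deliberately tiny word-count classifier, and the insurance questions are invented for illustration. The point is only that the student learns from the teacher's outputs rather than from hand-labeled data.

```python
import re
from collections import Counter, defaultdict


def tokens(text):
    """Lowercase a question and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def teacher_answer(question):
    """Hypothetical stand-in for a large model: routes a question to a topic."""
    q = question.lower()
    if "claim" in q or "accident" in q:
        return "claims"
    if "premium" in q or "cost" in q:
        return "billing"
    return "general"


class StudentModel:
    """Tiny classifier trained only on labels produced by the teacher."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)

    def train(self, questions, labels):
        # Count how often each word appears under each teacher-assigned label.
        for question, label in zip(questions, labels):
            for word in tokens(question):
                self.word_counts[word][label] += 1

    def predict(self, question):
        # Each known word votes for the labels it was seen with.
        votes = Counter()
        for word in tokens(question):
            votes.update(self.word_counts.get(word, {}))
        return votes.most_common(1)[0][0] if votes else "general"


# Invented domain questions an insurer might feed to the larger model.
questions = [
    "How do I file a claim after an accident?",
    "Why did my premium go up this year?",
    "What is the cost of adding a driver?",
    "Where is the nearest office?",
]
teacher_labels = [teacher_answer(q) for q in questions]

student = StudentModel()
student.train(questions, teacher_labels)
```

In a real system the student would be a smaller neural model and the teacher's outputs would be full text answers, but the data flow is the same: questions go to the large model, and its responses become the training set for the small one.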
Bedrock Agents includes a master agent that can manage a team of AI agents, produce reports on how they function, and coordinate changes. The aim is to help companies supervise many automated systems rather than treating each AI agent as a separate project.
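A supervisor arrangement of this kind can be sketched in a few lines. The class names, methods, and task types below are hypothetical and are not the Bedrock Agents API; the sketch only illustrates one agent routing work to others and reporting on what they did.

```python
from dataclasses import dataclass, field


@dataclass
class WorkerAgent:
    """Hypothetical worker that handles one type of task."""
    name: str
    handles: str
    completed: int = 0

    def run(self, task):
        self.completed += 1
        return f"{self.name} handled: {task}"


@dataclass
class SupervisorAgent:
    """Hypothetical master agent that routes tasks and reports on workers."""
    workers: dict = field(default_factory=dict)
    unrouted: list = field(default_factory=list)

    def register(self, worker):
        self.workers[worker.handles] = worker

    def dispatch(self, task_type, task):
        worker = self.workers.get(task_type)
        if worker is None:
            # No agent accepts this task type; keep it for human review.
            self.unrouted.append(task)
            return None
        return worker.run(task)

    def report(self):
        # Summarize how many tasks each worker has completed.
        return {w.name: w.completed for w in self.workers.values()}


supervisor = SupervisorAgent()
supervisor.register(WorkerAgent("support-bot", "customer_support"))
supervisor.register(WorkerAgent("orders-bot", "order_processing"))

first_result = supervisor.dispatch("customer_support", "reset a password")
supervisor.dispatch("order_processing", "ship order 1001")
supervisor.dispatch("order_processing", "cancel order 1002")
supervisor.dispatch("analytics", "weekly sales summary")
```

Here `supervisor.report()` gives one view across all agents, and the `unrouted` list records work that no agent was set up to handle.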
Verification may be the most practical piece
Garman said companies are especially interested in tools that help ensure chatbot outputs are accurate. Large language models can hallucinate, and existing methods for controlling their answers are imperfect.
Automated Reasoning relies on logical reasoning to evaluate a model’s output. To use it, a company must translate its data and policies into a format that allows logical analysis.
Byron Cook, a distinguished scientist at AWS and vice president of the company’s Automated Reasoning Group, explained the approach this way: "We take the natural language, we translate it into logic, we prove or disprove the statement, and then we can provide an argument as to why the statement is true or not."
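The translate-then-prove loop Cook describes can be illustrated with a toy example. It makes no claim about how the AWS tool is built: the refund policy, the variable names, and the two claims are all invented, and the check simply enumerates every combination of facts, where production tools typically rely on dedicated solvers.

```python
from itertools import product

# Facts a hypothetical airline refund policy depends on.
VARIABLES = ("cancelled_by_airline", "refundable_fare", "requested_in_time")


def policy(facts):
    """Hypothetical policy as logic: when is a refund owed?"""
    return facts["cancelled_by_airline"] or (
        facts["refundable_fare"] and facts["requested_in_time"]
    )


def check_claim(claim):
    """Prove a claim by checking every case, or return a counterexample."""
    for values in product([False, True], repeat=len(VARIABLES)):
        facts = dict(zip(VARIABLES, values))
        facts["refund"] = policy(facts)
        if not claim(facts):
            return False, facts  # the "argument as to why" it is false
    return True, None


def claim_cancellation_means_refund(facts):
    # "If the airline cancels the flight, the customer gets a refund."
    return (not facts["cancelled_by_airline"]) or facts["refund"]


def claim_nonrefundable_never_refunded(facts):
    # "A nonrefundable fare is never refunded."
    return facts["refundable_fare"] or not facts["refund"]


proved, _ = check_claim(claim_cancellation_means_refund)
disproved, counterexample = check_claim(claim_nonrefundable_never_refunded)
```

The first claim holds in every case, so it is proved. The second fails, and the counterexample shows why: a nonrefundable fare is still refunded when the airline cancels the flight.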
Cook said this kind of formal reasoning has been used for decades in areas such as chip design and cryptography. He said it could support chatbots that handle airline ticket refunds or provide human resources information without getting facts wrong.
He also said companies can combine multiple Automated Reasoning systems into more sophisticated applications, including services that use autonomous agents. His conclusion was direct: "Reasoning will become a very important thing."