The Decoder June 22, 2026 TERMINATOR

How Sakana AI's Fugu makes many LLMs act like one model

Sakana AI is launching Fugu, a system that coordinates multiple language models while presenting itself as one model through a single OpenAI-compatible API. The company says Fugu Ultra performs on par with Anthropic's Fable 5 and Mythos Preview in several benchmark areas, while its swappable model pool is meant to reduce reliance on any one provider.

WTF Index TERMINATOR

◄ Terminator 2 Idiocracy 0 ►

Fugu modestly leans toward more autonomous and powerful AI orchestration, though it is mainly a product launch rather than a clear danger story.

How Sakana AI's Fugu makes many LLMs act like one model

Sakana AI is making a direct bet on orchestration: instead of asking one language model to do everything, Fugu coordinates a pool of models behind one interface. To the user, it behaves like a single model. Inside the system, it can decide whether to answer directly or bring in other LLMs for specialized work.

The Tokyo-based startup is positioning Fugu as both a performance strategy and a resilience strategy. Its message is simple: important AI work should not depend entirely on one provider, one API, or one model family.

What Fugu Is Built To Do

Fugu is itself a language model, but its main role is not limited to generating an answer in isolation. Sakana AI trained it to call other LLMs from an agent pool, including copies of itself. That pool can be changed, which is central to the product's design.

When a user sends a request, Fugu can handle the task alone or assemble a group of specialized models. The process includes selection, delegation, checks, and synthesis, all handled internally. The outside experience is meant to remain simple: users access the system through a single OpenAI-compatible API.

This makes Fugu different from a standard chatbot interface where one model receives a prompt and returns a response. Fugu is closer to a managed workflow in which one model acts as the coordinator and decides how to use other models to reach a stronger final output.

Sakana AI has prior experience with orchestrator-style systems. Its ALE-Agent placed 21st out of 1,000 human experts in a coding competition, and Fugu extends that general idea beyond coding-focused results.

Two Versions For Different Workloads

Sakana AI is launching two variants of the system. The base Fugu model is aimed at low latency and everyday performance. The company describes coding, code review, and chatbot use cases as target areas for this version.

Fugu Ultra is the more powerful variant. It is designed for maximum answer quality on complex, multi-step problems rather than simply fast responses. According to Sakana AI, early users have applied it to AI research, reproducing scientific papers, cybersecurity analysis, and patent and literature searches.

The distinction matters because orchestration can be most useful when a task has several parts. A short answer may not need a team of models. A long-running investigation, review, or research workflow may benefit more from dividing the work, checking intermediate results, and combining outputs.

Teams with privacy or compliance requirements can also exclude specific agents from the pool. That gives organizations some control over which underlying models are used, while still keeping Fugu as the single interface.

How Fugu Ultra Compares In Benchmarks

Sakana AI says Fugu Ultra performs on par with Anthropic's Fable 5 and Mythos Preview across coding, reasoning, science, and agent benchmarks. The comparison is notable because neither Anthropic model is included in Fugu's agent pool, since they are not publicly available.

The company also says the baseline comparison numbers come from the model providers themselves. In Sakana AI's framing, the key point is that Fugu can reach top-tier performance without depending on those unavailable Anthropic models.

Sakana AI also claims Fugu beat Gemini 3.1 Pro, Opus 4.8, and GPT 5.5 in its own tests on automated research, mechanical design, and financial forecasting. Those are areas where the system's multi-step coordination could matter, because the work may involve searching, analyzing, testing, and synthesizing rather than only producing a direct answer.

One software developer said Fugu Ultra catches far more bugs during code review than GPT-5.5. The developer's comparison was blunt: "Where other tools flag about three issues, Fugu surfaced more than twenty."

According to Sakana, Fugu also solves and visualizes a Rubik's Cube faster than the individual models. That example points to the broader claim behind the product: orchestration can create better results than relying on any one model in the pool.

The Vendor Lock-In Argument

Sakana AI is not only selling Fugu as a stronger model experience. It is also presenting the system as protection against single-provider dependence. The company points to export controls on Anthropic's Fable and Mythos models as an example of how access to advanced AI systems can change suddenly because of regulatory shifts or foreign policy decisions.

"For an organization or a nation, relying on a single company’s APIs for critical infrastructure, finance, or governance is a material vulnerability. This risk is no longer a hypothetical possibility, but a reality,"

That line from Sakana AI's announcement explains the strategic pitch. If one provider becomes unavailable, Fugu's swappable model pool can reroute to other models. The system is meant to make model access more flexible and less brittle.

There is an important limit, though. Fugu's real-world performance depends on which models are actually available in its pool. If several leading providers restrict access at the same time, the pool becomes smaller. An orchestrator can improve resilience, but it is not the same thing as full independence from outside model providers.

Another unresolved issue is cost. The source announcement does not address how much orchestration increases token usage and costs. If a task calls several models, checks their outputs, and synthesizes the result, the quality gains may come with heavier usage.

Why Sakana AI Is Betting On AI Ecosystems

About 500 beta users have already tested Fugu in real-world settings, according to Sakana AI. The system appeared strongest on long, multi-step workflows such as automated data research, security analysis, and code reviews.

Sakana AI summarized the beta lesson this way: "The beta made clear that multi-agent orchestration matters most when the task is messy, long-running, and difficult to solve with a single model call."

The technical direction builds on Sakana AI's own research into learned model orchestration, including two papers presented at ICLR 2026 called Trinity and Conductor. It also fits the company's broader interest in natural principles such as swarm behavior, evolution, and collective intelligence.

That vision treats advanced AI less as a contest to build one dominant model and more as a problem of coordinating many capable systems. Fugu is the product version of that argument: a single user-facing API backed by a flexible group of LLM agents.

Both Fugu variants are live now through a single API on the product page and console. Sakana offers subscription plans for daily use and usage-based billing for bigger workloads.