MIT Tech Review AI March 11, 2025

What Manus Reveals About the Future of AI Agents

Manus is drawing attention because it behaves less like a chatbot and more like a working AI agent that can plan, browse and revise. In testing, it was useful for bounded research tasks, but crashes, server overload, paywalls and broad assignments exposed real limits.

Manus points toward more autonomous AI agents that can plan and browse, but the story emphasizes early limits rather than clear danger.

Manus has quickly become one of the most discussed new AI tools because it points toward a different kind of everyday assistant: not just a chatbot that answers, but an agent that can break work into steps, search the web and show its process while it works.

The early picture is mixed. The tool is intuitive, collaborative and promising, but it also struggles with instability, blocked websites, broad research briefs and heavy service demand.

Why Manus Is Getting Attention

Manus was developed by Butterfly Effect, a Wuhan-based startup. It has already moved beyond China’s AI circles and into a wider tech conversation, where figures including Twitter cofounder Jack Dorsey and Hugging Face product lead Victor Mustar have praised its performance.

Some observers have compared it to DeepSeek, but the comparison is imperfect. The review describes DeepSeek as based on a single large language model family and built mainly for conversational interaction. Manus, by contrast, bills itself as the world’s first general AI agent.

That claim rests on how the system works. Manus uses multiple AI models, including Anthropic's Claude 3.5 Sonnet and fine-tuned versions of Alibaba's open-source Qwen, along with independently operating agents that can act across a range of tasks.

Access remains limited. Fewer than 1% of users on the waitlist have received an invite code, while Manus’s Discord channel has more than 186,000 members. That gap between demand and availability has helped feed the sense that Manus is both important and still very early.

How the Experience Works

Manus is designed for a global audience, much like Butterfly Effect’s earlier AI assistant Monica, which was released in 2023. English is the default language, and the interface is described as clean and minimalist.

After entering a valid invite code, users arrive at a page that resembles ChatGPT or DeepSeek. Previous sessions appear in a left-hand column, while the central area contains the chat input box. The company also provides sample tasks, including business strategy development, interactive learning and customized audio meditation sessions.

The distinctive feature is a window called Manus’s Computer. It lets users watch what the agent is doing and step in while the task is underway. That matters because agentic AI work can otherwise feel opaque: a user gives a prompt, waits and only sees the final output.

With Manus, the process is more visible. The tool can ask questions, remember important instructions as “knowledge” and make sessions replayable and shareable. For users trying to steer complex research, that transparency is part of the appeal.

Three Tests Showed Strengths and Limits

The reviewer tested Manus on three assignments: compiling a list of notable reporters covering China tech, searching for two-bedroom property listings in New York City and nominating potential candidates for Innovators Under 35, a list MIT Technology Review publishes every year.

For the reporter list, Manus first returned only five names plus five “honorable mentions.” The output was uneven: some journalists had notable work listed while others did not. When asked why, Manus admitted the inconsistency was “partly due to time constraints as I tried to expedite the research process.”

After being pushed for consistency and depth, it produced a list of 30 journalists with current outlet and notable work. It also corrected some employer-status errors when asked to revisit results. The ability to download output as a Word or Excel file made the results easier to edit and share.

Still, the task exposed friction. Manus ran into captcha blocks and paywalls while trying to access journalists’ articles. The user could take over because the working process was visible, but many media sites still blocked the tool for suspicious activity.

For the apartment search, Manus handled a complex brief that included budget, kitchen space, outdoor space, access to downtown Manhattan and a major train station within a seven-minute walk. At first, it treated “some kind of outdoor space” too narrowly and excluded properties without private terrace or balcony access.

After clarification, the tool broadened the list and organized recommendations into tiers with clear bullets. The final result used categories such as “best overall,” “best value” and “luxury option.” The full exchange took less than half an hour, compared with a little over an hour for the journalist list.
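The difference between the narrow and broad readings of "some kind of outdoor space" can be illustrated with a small filter. This is only a sketch: the listings, the rent cap and the set of shared outdoor features are invented for illustration and do not come from the review, and nothing here reflects how Manus actually ranks properties.

```python
from dataclasses import dataclass
from typing import List, Optional

# The review says the first pass required a private terrace or balcony.
PRIVATE_OUTDOOR = {"terrace", "balcony"}
# Assumption: examples of shared outdoor space a broader reading might accept.
SHARED_OUTDOOR = {"roof deck", "courtyard"}


@dataclass
class Listing:
    name: str
    bedrooms: int
    monthly_rent: int            # USD, hypothetical
    walk_minutes_to_station: int
    outdoor: Optional[str]       # None when there is no outdoor space


def matches(listing: Listing, max_rent: int, strict_outdoor: bool) -> bool:
    """Return True if the listing satisfies the brief under the chosen reading."""
    allowed = PRIVATE_OUTDOOR if strict_outdoor else PRIVATE_OUTDOOR | SHARED_OUTDOOR
    return (
        listing.bedrooms == 2
        and listing.monthly_rent <= max_rent
        and listing.walk_minutes_to_station <= 7  # seven-minute walk from the brief
        and listing.outdoor in allowed
    )


def shortlist(listings: List[Listing], max_rent: int, strict_outdoor: bool) -> List[str]:
    """Names of the listings that pass the filter, in input order."""
    return [l.name for l in listings if matches(l, max_rent, strict_outdoor)]


# Hypothetical sample data, not real listings.
SAMPLE = [
    Listing("A", 2, 4000, 5, "balcony"),
    Listing("B", 2, 3800, 6, "roof deck"),
    Listing("C", 2, 3900, 12, "terrace"),
]

narrow = shortlist(SAMPLE, max_rent=4500, strict_outdoor=True)
broad = shortlist(SAMPLE, max_rent=4500, strict_outdoor=False)
```

Under the narrow reading only listing A survives; the broad reading adds B, while C fails the walking-distance requirement either way.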

The largest test was the Innovators Under 35 nomination task. Manus broke the work into steps, including reviewing past lists, forming a search strategy, compiling names and trying to ensure a diverse selection of candidates from all over the world.

That scope proved difficult. After three hours of searching, Manus had produced only three candidates with full background profiles. When pressed to deliver 50 names, it generated a list, but some academic institutions and fields were heavily overrepresented. A narrower request for five candidates from China produced a stronger five-name list, though the results skewed toward Chinese media darlings.

Where Manus Fits Now

The overall lesson is that Manus appears strongest on analytical tasks that require research across the open internet but have a limited scope. It can behave like a capable assistant: it plans, explains, adapts to feedback and improves when given clearer instructions.

It is less reliable when the assignment is too large, when the information sits behind paywalls or captchas, or when the system is under strain. The reviewer saw frequent crashes, system instability and the message “Due to the current high service load, tasks cannot be created. Please try again in a few minutes.” Manus’s Computer also froze on some pages for long periods.

The review notes that Manus has a higher failure rate than ChatGPT DeepResearch, a problem that, according to Manus’s chief scientist, Peak Ji, the team is addressing. At the same time, 36Kr reports that Manus’s per-task cost is about $2, one-tenth of DeepResearch’s cost.
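The cost comparison implies a rough figure for DeepResearch that the text does not state directly. The arithmetic can be sketched as follows; the function names and the 100-task workload are illustrative assumptions, and the only inputs taken from the reporting are the roughly $2 per task and the one-tenth ratio.

```python
def implied_deepresearch_cost(manus_cost_usd: float, ratio_denominator: int = 10) -> float:
    """If Manus costs 1/ratio_denominator of DeepResearch, back out the implied cost."""
    return manus_cost_usd * ratio_denominator


def savings_over_tasks(manus_cost_usd: float, task_count: int, ratio_denominator: int = 10) -> float:
    """Implied difference in total cost across task_count tasks (hypothetical workload)."""
    other = implied_deepresearch_cost(manus_cost_usd, ratio_denominator)
    return (other - manus_cost_usd) * task_count


manus_cost = 2.0  # USD per task, approximate figure reported by 36Kr
implied_cost = implied_deepresearch_cost(manus_cost)      # about 20 USD per task
implied_savings = savings_over_tasks(manus_cost, 100)     # over an assumed 100 tasks
```

On these figures, a DeepResearch task would cost about $20, so an assumed workload of 100 tasks would differ by about $1,800. Because the $2 figure is itself approximate, these numbers are order-of-magnitude estimates rather than published prices.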

If the infrastructure improves, Manus could become useful for individual users, white-collar professionals, independent developers and small teams. Its appeal is not that it is flawless. It is that it makes autonomous AI feel more collaborative, more inspectable and more practical for work that resembles what a skilled human intern might complete in a day.