Rabbit r1 gets a wider web agent test on October 1

Rabbit says the r1 will receive a web-based Large Action Model update on October 1. The new agent is meant to handle ordinary website tasks, but the demo also showed why prompt wording, logins, and user trust still matter.


Rabbit is trying to turn the r1 back toward the big idea that made it one of early 2024's most watched gadgets: an AI device that can take action across digital services, not just answer questions.

CEO Jesse Lyu told TechCrunch that the company is preparing a web-based version of its Large Action Model, or LAM, for r1 owners. The company later gave October 1 as the near-final date for the update.

What Rabbit says is changing

The r1 has already received 16 over-the-air updates, according to Lyu. Those updates have focused on shipping, bug fixes, response times, and smaller features. The device has still been limited in an important way, though: it could interact with a large language model (LLM) or connect to one of seven specific services, including Uber and Spotify.

Lyu acknowledged that the first version did not match the broader vision Rabbit had described. As he put it, "on day one, we set our expectations too high."

The new release is described as the first generic version of the LAM. In this context, generic means it is not built only for one app, one website, or one fixed interface. Rabbit wants the agent to look at websites, understand the available controls, make a plan, and then act through the page.

The version shown to TechCrunch is web-based and based on WebVoyager. Its purpose is to handle ordinary tasks on websites, such as buying concert tickets, registering a website, or playing an online game.

How the web agent works

According to the demo described by TechCrunch, the agent starts by breaking a request into steps. It then looks at what appears on the screen, including buttons, fields, and images. The position or visual style of those elements is not supposed to be the deciding factor; the agent is meant to reason from what it sees and from what it has learned about how websites work.
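The plan-then-look-then-act loop described above can be sketched in a few lines of Python. This is a toy illustration only, not Rabbit's code: the Element, Page, plan, choose_element, and run_agent names are all hypothetical, the planner just splits the request on the word "then", and the matching is a crude word overlap where a real agent would presumably rely on a model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    """One control visible on the page (hypothetical representation)."""
    kind: str   # e.g. "button", "field", "image"
    label: str  # visible text or description of the control


@dataclass
class Page:
    """A toy stand-in for what the agent sees on screen."""
    elements: list


def plan(request: str) -> list:
    """Break a request into ordered steps.
    Assumption: steps are separated by ' then '; a real agent would ask a model."""
    return [step.strip() for step in request.split(" then ") if step.strip()]


def choose_element(step: str, page: Page) -> Optional[Element]:
    """Pick the element whose label shares the most words with the step.
    Position and visual style are deliberately ignored; only the label counts."""
    step_words = set(step.lower().split())
    best, best_score = None, 0
    for element in page.elements:
        score = len(step_words & set(element.label.lower().split()))
        if score > best_score:
            best, best_score = element, score
    return best


def run_agent(request: str, page: Page) -> list:
    """Plan the request, then match each step to a control on the page."""
    log = []
    for step in plan(request):
        target = choose_element(step, page)
        if target is None:
            log.append(f"{step}: no matching control")
        else:
            log.append(f"{step}: use {target.kind} '{target.label}'")
    return log


if __name__ == "__main__":
    page = Page([Element("field", "domain search"), Element("button", "add to cart")])
    for line in run_agent("search domain then add to cart", page):
        print(line)
```

Even in this toy form, the agent does exactly what the steps say and nothing more, which is the same behavior the demo examples below run into.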

In one example, the request was to register a new website for a film festival. The agent searched Google for domain registries, selected one, entered "film festival" into the domain box, and chose "filmfestival2023.com" for $14. The result reflected the lack of extra constraints in the prompt, such as a year or type of festival.

Another test asked the agent to search for and buy an r1. It found its way to eBay, where many were on sale. Lyu then tried again with an added instruction to buy only from the official website, and the agent succeeded.

A third example involved Dictionary.com's daily word game. The agent was able to play, but only after some prompt engineering. Without tighter wording, it found a shortcut by hitting "end game."

Why prompt wording still matters

The demo suggests that Rabbit's web agent can operate more broadly than the r1's earlier service-specific setup. It also shows that the user still has to be careful about instructions.

That matters because ordinary users may not want to learn how to phrase tasks in a precise way. If a request is too open, the agent may technically complete it while missing the user's actual intent. The film festival domain example and the eBay example both point to the same problem: an agent can act, but action is not the same as judgment.

Lyu described the release as a "playground version" and said it is not final. He also said, "the model is smart enough to do the planning, but isn't smart enough to skip steps."

That limitation has practical consequences. The agent would not automatically learn that a user prefers not to buy electronics on eBay. It also would not know that it should scroll down after a search to move past sponsored results.

Logins, data, and the browser question

The agent uses a fresh browser in the cloud, according to Lyu. Rabbit is also working on local versions, including a Chrome extension, which could allow the agent to use existing sessions instead of logging into services from a clean environment.

For now, the agent is not equipped with user credentials. That is a central issue for any web agent that is supposed to carry out real tasks, because many useful actions require a signed-in account. Lyu suggested that a walled-off small language model could privately handle credentials in the future, but the details remain an open question.

Rabbit is also not yet using user data to improve the model. Lyu connected that decision to the lack of an evaluation method for a system like this. Without a clear way to measure the system, it is hard to say quantitatively whether the agent is improving.

A "teach mode" is also in development. The idea is that users would be able to show the system how to perform a specific kind of task.

The bigger bet behind the r1

Rabbit's argument is not just that the r1 can automate a website. The broader pitch is that a third-party AI device can operate across services from the outside, as a person would.

Lyu called the concept "a cross-platform, generic agent system." He said Rabbit plans to start with websites, then move to apps and other interfaces, including Windows, macOS, and phones.

That is also his answer to the criticism that the r1 could have been an app. Lyu argued that an app would face platform limits from Apple and Google and would not be allowed to become better than Siri or Gemini. He also pointed to the 30% revenue share taken by those platforms.

Rabbit is additionally working on a desktop agent that can interact with apps such as word processors, music players, and browsers. Lyu said that work is still early, but functioning: "You don't even need to input a destination, it just tries to use the computer. As long as there is an interface, it can control it."

The October 1 update does not settle every question around the r1. It does, however, move Rabbit closer to the web agent idea it promoted from the beginning. The key test will be whether r1 owners find the new LAM useful enough in daily life, especially when the system still depends on careful prompting and has unresolved questions around logins, preferences, and reliability.