The Decoder, June 15, 2025

Inside Mechanize's Bid to Automate Computer Work With AI Agents

San Francisco-based startup Mechanize is building simulated digital offices to train AI agents through reinforcement learning. Its first target is software development, but its broader ambition is to automate nearly every responsibility a human can carry out at a computer.

Mechanize is aiming at one of the largest questions in AI: whether digital work can move from human-led assistance to full automation. The San Francisco-based startup is building virtual workplaces where AI agents can practice office tasks, receive feedback, and improve through repetition.

The company’s first focus is software development. But its stated ambition is broader than coding help: Mechanize wants AI systems that can eventually take over computer-based work across planning, communication, execution, and repair.

A startup built around full automation

Mechanize was co-founded by Tamay Besiroglu, Ege Erdil, and Matthew Barnett, all formerly of the research group Epoch AI. Besiroglu describes the mission directly: "Our goal is to fully automate work."

That goal puts the company beyond the usual language of productivity tools. Mechanize is not only trying to make workers faster or give them better assistants. It is trying to create AI agents that can carry out the work itself.

The founders have described success in similarly expansive terms: "We’ll only truly know we’ve succeeded once we’ve created A.I. systems capable of taking on nearly every responsibility a human could carry out at a computer."

For now, that remains a long-range target. Barnett estimates it could take 10 to 20 years. Besiroglu and Erdil put the timeline at 20 to 30 years. Those estimates matter because they place Mechanize’s project closer to a multi-decade infrastructure bet than a near-term office software launch.

Why simulated digital offices matter

Mechanize’s core idea is to train AI agents inside environments that resemble real computer work. These simulated digital offices include tools such as email inboxes, Slack, code editors, browsers, and more.

The method is based on reinforcement learning. Agents try to complete tasks, receive rewards when they succeed and penalties when they fail, and improve through repeated attempts. Besiroglu described the setup to the New York Times this way: "It’s effectively like creating a very boring video game."

The point of the simulation is not entertainment. It is controlled practice. A digital office can present tasks again and again, vary the situation, and give the system feedback about what worked. Mechanize believes that richer environments will be necessary if AI agents are to learn the habits of real digital work.

In this view, an effective AI worker must do more than produce an answer. It must manage context, respond to changing information, coordinate with others, and recover from mistakes. That is why the company is focused on environments that mirror actual workflows rather than narrow tests.
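The reward-and-retry loop described in this section can be illustrated with a deliberately tiny sketch. Nothing here comes from Mechanize: the message-triage task, the action names, and the `train` and `greedy_policy` functions are hypothetical, and the learner is a simple epsilon-greedy value update rather than anything the company has said it uses. It only shows the general shape of the idea: the simulation presents varied situations, the agent acts, and rewards or penalties shift what it does next time.

```python
import random

# Hypothetical toy "office" task: triage an incoming message with one action.
ACTIONS = ["reply", "file_bug", "ignore"]
CORRECT_ACTION = {
    "customer_question": "reply",
    "crash_report": "file_bug",
    "spam": "ignore",
}


def train(episodes=2000, epsilon=0.1, learning_rate=0.1, seed=0):
    """Learn a value for each (message kind, action) pair from rewards and penalties."""
    rng = random.Random(seed)
    values = {kind: {a: 0.0 for a in ACTIONS} for kind in CORRECT_ACTION}
    for _ in range(episodes):
        # The simulation varies the situation from one episode to the next.
        kind = rng.choice(list(CORRECT_ACTION))
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # occasionally explore
        else:
            action = max(ACTIONS, key=lambda a: values[kind][a])  # use best-known action
        # Reward on success, penalty on failure.
        reward = 1.0 if action == CORRECT_ACTION[kind] else -1.0
        # Nudge the estimate for the chosen action toward the observed reward.
        values[kind][action] += learning_rate * (reward - values[kind][action])
    return values


def greedy_policy(values):
    """Pick the highest-valued action for each message kind."""
    return {kind: max(ACTIONS, key=lambda a: scores[a]) for kind, scores in values.items()}


if __name__ == "__main__":
    print(greedy_policy(train()))
```

A real digital office would involve long sequences of actions across many tools, not a single choice per episode, so this sketch understates the difficulty considerably.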

Reinforcement learning and the "bitter lesson"

Mechanize’s founders connect their approach to the "bitter lesson" of AI research. The source article describes the idea this way: when hand-designed algorithms compete with approaches driven by data and compute, the data- and compute-heavy methods tend to win at scale.

For Mechanize, that implies that the next step will not come mainly from hand-crafting smarter office logic. It will come from giving AI agents large amounts of experience inside complex simulated environments.

The company’s view also aligns with thinking from Richard Sutton and David Silver, who have outlined a path toward agents that learn by doing rather than only consuming human-written data. In that frame, progress depends on a continuous stream of experience, feedback, and adaptation.

Mechanize sees human demonstration data and reinforcement learning as complementary. Demonstrations can show agents what competent work looks like. Reinforcement learning can then push them to practice, adjust, and improve inside digital settings that look more like the tools people use every day.
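One way to picture how demonstrations and reinforcement learning could complement each other is a warm start followed by practice. The sketch below is an assumption-laden toy, not Mechanize’s method: the demonstration log, the triage task, and the functions `init_from_demonstrations` and `refine_with_rl` are all hypothetical. The demonstrations are incomplete and slightly noisy, so the agent starts with a reasonable but imperfect policy and then corrects it through rewarded practice.

```python
import random
from collections import Counter

ACTIONS = ["reply", "file_bug", "ignore"]
CORRECT_ACTION = {
    "customer_question": "reply",
    "crash_report": "file_bug",
    "spam": "ignore",
}

# Hypothetical human demonstrations as (message kind, action taken).
# They never cover "spam", and one crash report was handled incorrectly.
DEMONSTRATIONS = [
    ("customer_question", "reply"),
    ("customer_question", "reply"),
    ("crash_report", "file_bug"),
    ("crash_report", "reply"),
    ("crash_report", "file_bug"),
]


def init_from_demonstrations(demos):
    """Warm start: score each action by how often demonstrators chose it."""
    values = {kind: {a: 0.0 for a in ACTIONS} for kind in CORRECT_ACTION}
    counts = Counter(demos)
    for (kind, action), n in counts.items():
        total = sum(c for (k, _), c in counts.items() if k == kind)
        values[kind][action] = n / total
    return values


def refine_with_rl(values, episodes=500, epsilon=0.1, learning_rate=0.1, seed=0):
    """Improve the warm-started values through rewarded practice."""
    rng = random.Random(seed)
    values = {kind: dict(scores) for kind, scores in values.items()}  # leave input untouched
    for _ in range(episodes):
        kind = rng.choice(list(CORRECT_ACTION))
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: values[kind][a])
        reward = 1.0 if action == CORRECT_ACTION[kind] else -1.0
        values[kind][action] += learning_rate * (reward - values[kind][action])
    return values


def greedy_policy(values):
    """Pick the highest-valued action for each message kind."""
    return {kind: max(ACTIONS, key=lambda a: scores[a]) for kind, scores in values.items()}


if __name__ == "__main__":
    warm = init_from_demonstrations(DEMONSTRATIONS)
    print("after demonstrations:", greedy_policy(warm))
    print("after practice:", greedy_policy(refine_with_rl(warm)))
```

In this toy, the demonstrations alone leave the agent with no guidance on spam, and practice fills that gap.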

Software development is the first test

Mechanize is starting with software development because it is both structured and difficult. Coding work can be split into discrete tasks, and many tools already assist with parts of the process. At the same time, software engineering still requires judgment, coordination, and long-term understanding.

Today’s AI systems can handle areas such as code completion and testing. The source article notes that architecture decisions, team coordination, and long-term maintenance remain out of reach. Those gaps explain why software development is a useful proving ground for agentic AI.

The company wants to build toward a "drop-in remote worker" that can operate inside digital teams. That means more than writing code in isolation. It implies an agent that can plan, delegate, fix mistakes, understand context, and move work forward in a way that resembles a human colleague.

Mechanize also argues that current reinforcement learning environments are not realistic enough. They lack internet access, multi-agent collaboration, and realistic software tools. Without those ingredients, agents struggle to develop the generalization skills needed for broader computer work.

The stakes of automating computer work

Mechanize’s stated endpoint raises a social question that the company has not fully answered. The team says it envisions a future of radical abundance and supports ideas like universal basic income for workers displaced by AI. But the source article says there is no concrete transition plan.

Barnett argues that the mission is ethically justified if society becomes wealthier overall, even when weighed against job losses. That argument places the company’s ambition inside a familiar tension around AI automation: the promise of greater productivity against the risk of disrupting existing work.

Mechanize is also entering a competitive field. Since the earliest reasoning models, such as o1, major AI labs have been working on data for reinforcement learning, including raw logs, verifiers, and fully simulated workspaces. As AI tasks become more complex, the training environments must become more capable too.

The company’s bet is clear. If AI agents are going to automate computer work, they will need places to practice that look much more like the real digital world. Mechanize is building those places first, with software development as the opening challenge and full office automation as the larger goal.