How Murakkab Could Make AI Agents Faster and Leaner

Researchers from MIT and Microsoft developed Murakkab, a system that helps design and deploy agentic workflows more efficiently. In tests on video Q&A and code generation workflows, it met user requirements while using about 35 percent of the computation, about 27 percent of the energy, and less than 25 percent of the cost required by other methods.

AI agents are becoming more capable, but the systems behind them can also be wasteful. Researchers from MIT and Microsoft have developed Murakkab, an intelligent system designed to make agentic workflows faster, less expensive, and more energy-efficient without forcing developers to manually tune every technical detail.

The work focuses on a practical problem: agentic workflows often combine multiple AI agents, models, and external tools to complete complex tasks. That flexibility can be powerful, but it can also make the systems difficult to configure efficiently.

Why agentic workflows are hard to optimize

An agentic workflow is a system made from several autonomous AI agents that work together with models and tools such as databases or Python programs. These workflows can handle multi-step tasks including data processing, code generation, and video Q&A.

They can also sit behind user-facing applications. A user may only see a simple request and response, while the workflow underneath is selecting tools, running models, processing information, and assembling an answer.

The challenge is that developers usually have to make many technical decisions in advance. They must choose which AI agents, models, and tools to use, decide the order in which they should run, and specify the hardware and resource settings for deployment.

That becomes difficult because the components are often black-box models and diverse tools, each with its own options. Some may come from different companies. If a better AI model becomes available, the developer may need to rebuild or reconfigure the workflow to use it.

Gohar Chaudhry, an electrical engineering and computer science graduate student and lead author of a paper on the system, described the problem directly: “Even if you wanted to do all this manually, it is unlikely that you’ll be able to configure the workflow optimally because the space of possible configurations is so large.”

What Murakkab changes

Murakkab, whose name is an Urdu word meaning a composition of things, is designed to optimize the entire agentic workflow process. Instead of requiring a developer to specify every component and deployment decision, the system starts from a plain-language description of what the application should do.

For example, a developer could describe a video Q&A application that extracts key frames, generates a transcript, and answers user questions about the video. Murakkab can then identify existing models and tools that fit the workflow.

The system also decides which parts of the workflow need to run one after another and which can run in parallel. That matters because some tasks depend on earlier results, while others can happen at the same time to improve speed.
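The sequential-versus-parallel decision can be illustrated with a small dependency graph. The sketch below is not Murakkab's scheduler; it is a minimal example that assumes the video Q&A workflow has three hypothetical steps (extract_frames, transcribe_audio, answer_question) and uses Python's standard graphlib module to group steps into stages whose members could run at the same time.

```python
from graphlib import TopologicalSorter


def parallel_stages(deps):
    """Group workflow steps into stages.

    `deps` maps each step name to the set of steps it depends on.
    Steps within one stage have no unmet dependencies, so they could
    run in parallel; stages themselves run one after another.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        stages.append(ready)
        sorter.done(*ready)
    return stages


# Hypothetical step names for the video Q&A example in the article.
VIDEO_QA_DEPS = {
    "extract_frames": set(),
    "transcribe_audio": set(),
    "answer_question": {"extract_frames", "transcribe_audio"},
}

stages = parallel_stages(VIDEO_QA_DEPS)
for number, steps in enumerate(stages, start=1):
    print(f"stage {number}: {', '.join(steps)}")
```

Here frame extraction and transcription land in the same stage because neither needs the other's output, while answering the question waits for both.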

Murakkab also keeps the workflow flexible over time. As Chaudhry explains, “The platform makes configuration decisions dynamically over time, so if a new model or GPU accelerator comes out tomorrow, the developer doesn’t need to worry about that.”

This shifts some of the configuration burden away from developers and toward an automated system that can evaluate models, tools, hardware allocations, and deployment schedules together.

Cloud deployment becomes part of the optimization

The research also addresses what happens when a cloud provider runs the application for customers. A cloud data center may not have enough visibility into a workflow to allocate hardware resources efficiently at the moment a user request arrives.

Murakkab gives the cloud provider a better view across multiple workloads. It can configure workflow components to satisfy user constraints, such as prioritizing accuracy while meeting a latency requirement.
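To make "prioritizing accuracy while meeting a latency requirement" concrete, here is a minimal sketch of constrained selection. It is an illustration under stated assumptions, not the method from the paper: the Config class, the candidate names, and every number below are hypothetical, and a real system would have to estimate these values rather than read them from a table.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Config:
    """One hypothetical way to deploy a workflow (all values made up)."""
    name: str
    accuracy: float   # fraction of questions answered correctly
    latency_s: float  # end-to-end latency in seconds
    energy_j: float   # energy per request in joules


def pick_config(candidates: List[Config], max_latency_s: float) -> Optional[Config]:
    """Return the most accurate config that meets the latency limit.

    Ties on accuracy are broken in favor of lower energy use.
    Returns None when no candidate meets the limit.
    """
    feasible = [c for c in candidates if c.latency_s <= max_latency_s]
    if not feasible:
        return None
    return max(feasible, key=lambda c: (c.accuracy, -c.energy_j))


# Illustrative candidates only; not measurements from the research.
CANDIDATES = [
    Config("large-model-8gpu", accuracy=0.92, latency_s=4.0, energy_j=900.0),
    Config("large-model-2gpu", accuracy=0.92, latency_s=9.0, energy_j=400.0),
    Config("small-model-1gpu", accuracy=0.88, latency_s=3.0, energy_j=60.0),
]

tight = pick_config(CANDIDATES, max_latency_s=5.0)
relaxed = pick_config(CANDIDATES, max_latency_s=10.0)
print(tight.name, relaxed.name)
```

With a 5-second limit the sketch picks the 8-GPU deployment; relaxing the limit to 10 seconds lets it keep the same accuracy on fewer GPUs and less energy, which is the kind of tradeoff the article describes.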

It can also adapt hardware allocations and deployment schedules in real time. The goal is not simply to make one workflow run, but to make it run with fewer wasted resources while still meeting the requirements set by the user.

Chaudhry framed this as an issue that affects both cost and energy use: “Agentic workflows are getting very complicated and quickly becoming the backbone of what cloud providers are doing. Energy usage is a huge concern, so we need to be very careful about how efficient these workflows are. It is very easy to over-allocate resources, wasting energy and money. Enabling a cloud provider to intelligently make these workflows more resource-optimal is a win for everyone involved.”

What the tests showed

The researchers tested Murakkab on diverse agentic workflows for video Q&A and code generation. In those tests, the system met user requirements while using only about 35 percent of the computation required by other methods.

It also consumed only about 27 percent as much energy and ran at less than 25 percent of the cost. Those results point to the central promise of Murakkab: better configuration can reduce waste without undermining performance.
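The reported figures are fractions of a baseline, which can be restated as percentage reductions or as improvement factors. The short sketch below only does that arithmetic on the numbers quoted above; the function name is hypothetical, and because the cost figure is reported as "less than 25 percent," its results are bounds rather than exact values.

```python
def restate(fraction_of_baseline):
    """Convert 'uses X of the baseline' into (percent reduction, factor)."""
    reduction_pct = (1.0 - fraction_of_baseline) * 100.0
    factor = 1.0 / fraction_of_baseline
    return reduction_pct, factor


# Fractions taken from the reported results; cost is an upper bound.
REPORTED = {"computation": 0.35, "energy": 0.27, "cost": 0.25}

summary = {name: restate(frac) for name, frac in REPORTED.items()}
for name, (reduction_pct, factor) in summary.items():
    print(f"{name}: {reduction_pct:.0f}% lower, {factor:.1f}x less")
```

Read this way, the results correspond to roughly 2.9 times less computation, about 3.7 times less energy, and at least 4 times lower cost than the methods used for comparison.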

The system also allowed users to balance tradeoffs. In one case, Murakkab lowered energy consumption of an agentic workflow by more than an order of magnitude with only about a 2 percent drop in accuracy for the customer.

Another test showed that Murakkab could find an unexpected configuration that proved ideal for a model that selects video frames. That configuration improved performance for a video Q&A task, a type of optimization Chaudhry says would be nearly impossible for a developer to do manually.

What comes next

The researchers plan to expand Murakkab to more complex workflows and larger computing clusters. They are also exploring opportunities to optimize new agentic applications.

The paper's authors include Chaudhry; Adam Belay, an associate professor of electrical engineering and computer science (EECS) and a member of the MIT Computer Science and Artificial Intelligence Laboratory; senior author Ricardo Bianchini, technical fellow and corporate vice president at Microsoft Azure; and others at Microsoft Azure. The paper will be presented at the USENIX Symposium on Operating Systems Design and Implementation.

The research was supported, in part, by the Semiconductor Research Corporation and the U.S. Defense Advanced Research Projects Agency.

The broader implication is straightforward: as AI agents become more common, the way they are assembled and deployed will matter. Murakkab shows that optimizing the workflow itself can be as important as choosing a strong model.