A research team has shown how a ChatGPT-like AI can move a surgical robot beyond following a rigid script. Working with a DaVinci robot, Johns Hopkins University researchers trained a system called SRT-H to perform the steps of gallbladder removal on pig organs.
The result is not a human-ready operating room robot. It is an experimental system tested on gallbladder and liver samples taken from pig cadavers. But it shows why autonomous surgery is shifting from pre-programmed motion toward robots that can learn from demonstrations, interpret camera feeds, and respond to plain-language feedback.
From remote control to learned surgery
DaVinci surgical robots, introduced by Intuitive Surgical in the late 1990s, became important teleoperation tools. In that model, expert surgeons remotely control robotic arms and surgical instruments while watching video from built-in cameras and endoscopes.
The new work changes the role of the machine. Instead of using a human surgeon to drive every motion in real time, researchers put an AI system in charge of the DaVinci robot and taught it a surgical task through examples.
Earlier attempts at autonomous robotic surgery often depended on pre-programmed actions. Ji Woong Kim compared that approach to factory robots, saying, "The program told the robot exactly how to move and what to do. It worked like in these Kuka robotic arms, welding cars on factory floors."
Kim and colleagues are building on earlier work led by Axel Krieger, an assistant professor of mechanical engineering at Johns Hopkins University. That team built STAR, the Smart Tissue Autonomous Robot, which successfully performed surgery on a live pig in 2022. STAR could adjust a plan based on camera input, but it still depended on specially marked tissues and a predetermined plan.
How SRT-H works
The newer system is called SRT-H, short for Surgical Robot Transformer-Hierarchy. Kim described the current work as more flexible because the AI learns from demonstrations rather than simply executing a fixed series of movements.
The researchers also made a major hardware choice. Instead of using a custom robot like STAR, they used the DaVinci platform, which has become a de facto industry standard for teleoperated surgery. The source article notes that over 10,000 units are already deployed in hospitals worldwide.
SRT-H uses two transformer models, the same architecture that powers ChatGPT. One model handles higher-level planning, deciding how the procedure should proceed. The other turns those instructions into specific robotic-arm trajectories.
That split matters because surgery is not just a list of motions. A system must understand what step it is performing, where the tissue and tools are, and how to translate a goal into precise physical movement.
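The planner/controller split can be sketched in a few lines of Python. This is only an illustration of the division of labor, not the researchers' code: the class names, the `Observation` fields, the step labels, and the straight-line interpolation are all assumptions made for the example. In the real system both levels are learned transformer models working from camera images.

```python
from dataclasses import dataclass
from typing import List, Tuple

Waypoint = Tuple[float, float, float]  # hypothetical (x, y, z) tool-tip position


@dataclass
class Observation:
    """Toy stand-in for what the cameras report; all fields are hypothetical."""
    step_index: int       # which step of the procedure is current
    tool_xyz: Waypoint    # where the instrument tip is now
    target_xyz: Waypoint  # where the current step's target tissue is


class HighLevelPlanner:
    """Decides WHAT to do next and expresses it as a plain-language instruction."""

    def __init__(self, step_names: List[str]):
        self.step_names = step_names

    def next_instruction(self, obs: Observation) -> str:
        return self.step_names[obs.step_index]


class LowLevelController:
    """Decides HOW to do it: turns an instruction into a short trajectory."""

    def trajectory(self, instruction: str, obs: Observation, n: int = 4) -> List[Waypoint]:
        # Toy behavior: ignore the wording and interpolate in a straight line
        # from the tool to the target. A learned model would condition on the
        # instruction and the images instead.
        points = []
        for i in range(1, n + 1):
            t = i / n
            points.append(tuple(
                a + t * (b - a) for a, b in zip(obs.tool_xyz, obs.target_xyz)
            ))
        return points


def run_step(planner: HighLevelPlanner, controller: LowLevelController,
             obs: Observation) -> Tuple[str, List[Waypoint]]:
    """One pass through the hierarchy: instruction first, then motion."""
    instruction = planner.next_instruction(obs)
    return instruction, controller.trajectory(instruction, obs)


if __name__ == "__main__":
    planner = HighLevelPlanner(["clip cystic duct", "cut cystic duct"])
    obs = Observation(step_index=0, tool_xyz=(0.0, 0.0, 0.0), target_xyz=(4.0, 0.0, 2.0))
    print(run_step(planner, LowLevelController(), obs))
```

The point of the sketch is the interface: the only thing passed from the upper level to the lower one is a short piece of text, which is also what makes it possible for a human to step in with a spoken correction.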
The task: cholecystectomy
The procedure chosen for the experiment was cholecystectomy, or gallbladder removal. The source describes it as a routine procedure, performed roughly 700,000 times a year in US hospitals.
Kim explained the surgical objective this way: "The objective is to remove the tubes connecting the gallbladder to other organs without causing the internal fluids to flow out." In the task used for training, the surgeon must place three clips on the cystic duct, cut it, and then clip and cut the cystic artery in a similar way.
The team divided the procedure into 17 steps. To create training data, a trained research assistant repeatedly operated a DaVinci robot on gallbladder and liver samples taken from pig cadavers.
The training data combined several kinds of information:
- over 17 hours of video from the DaVinci endoscope;
- camera footage from the robot's arms;
- kinematics data showing the exact motions of the robotic arms;
- natural language annotations.
That combination let the system connect what it saw, what it was told, and how the robot physically moved. In practical terms, the AI was not just watching surgery; it was learning the relationship between visual context, verbal guidance, and instrument motion.
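One way to picture that pairing is as a record per video frame, with each frame tagged by whichever annotation was in effect at that moment. The Python sketch below is an assumption-laden illustration: the field names, file paths, and the timestamp-based alignment rule are invented for the example and are not described in the source.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Annotation:
    start_s: float  # time at which this instruction starts to apply
    text: str       # natural-language description of the step


@dataclass
class Frame:
    t_s: float                          # timestamp in seconds
    endoscope_image: str                # hypothetical path to the endoscope frame
    wrist_images: Tuple[str, ...]       # hypothetical paths to arm-camera frames
    joint_positions: Tuple[float, ...]  # kinematics recorded for this instant


@dataclass
class TrainingExample:
    frame: Frame      # what the robot saw and how it moved
    instruction: str  # what it was told


def label_frames(frames: List[Frame], annotations: List[Annotation]) -> List[TrainingExample]:
    """Attach to each frame the most recent annotation that started at or before it."""
    ordered = sorted(annotations, key=lambda a: a.start_s)
    starts = [a.start_s for a in ordered]
    examples = []
    for frame in frames:
        i = bisect_right(starts, frame.t_s) - 1
        if i >= 0:  # frames recorded before the first annotation are skipped
            examples.append(TrainingExample(frame, ordered[i].text))
    return examples


if __name__ == "__main__":
    frames = [
        Frame(0.0, "endo_000.png", ("wrist_l_000.png",), (0.10, 0.20)),
        Frame(1.0, "endo_001.png", ("wrist_l_001.png",), (0.12, 0.21)),
        Frame(2.5, "endo_002.png", ("wrist_l_002.png",), (0.15, 0.25)),
    ]
    notes = [Annotation(2.0, "place first clip"), Annotation(0.5, "grab gallbladder")]
    for ex in label_frames(frames, notes):
        print(ex.frame.t_s, ex.instruction)
```

Each resulting example holds all three ingredients at once: images, arm motion, and text. That is also why video alone would not be enough to build such a dataset.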
What the robot achieved
After training, SRT-H performed cholecystectomy with a 100 percent success rate on samples it had not trained on. The system also handled differences in anatomy between samples, tissue blocking the view, and imperfect imagery.
Another important feature was its ability to accept human feedback in natural language. The source gives examples such as "move your arm a bit to the left" and "put the clip a bit higher." Those instructions resemble the small corrections a mentor might give a trainee during a procedure.
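To make the idea of a spoken correction concrete, here is a deliberately crude Python sketch that turns phrases like the ones above into a small positional nudge. It is a hand-written keyword table, not the learned language model the study describes, and every constant in it (the direction vectors, the coordinate frame, the 2 mm step) is an assumption chosen for illustration.

```python
from typing import Dict, Tuple

Vec = Tuple[float, float, float]

# Hypothetical keyword table: direction word -> unit vector in a made-up camera frame.
DIRECTIONS: Dict[str, Vec] = {
    "left": (-1.0, 0.0, 0.0),
    "right": (1.0, 0.0, 0.0),
    "higher": (0.0, 0.0, 1.0),
    "lower": (0.0, 0.0, -1.0),
}

SMALL_STEP_MM = 2.0  # assumed meaning of "a bit"; not a figure from the study


def correction_offset(text: str) -> Vec:
    """Sum a small step for every direction keyword found in the sentence."""
    words = text.lower().replace(".", " ").replace(",", " ").split()
    dx = dy = dz = 0.0
    for word in words:
        if word in DIRECTIONS:
            ux, uy, uz = DIRECTIONS[word]
            dx += ux * SMALL_STEP_MM
            dy += uy * SMALL_STEP_MM
            dz += uz * SMALL_STEP_MM
    return (dx, dy, dz)


def apply_correction(target: Vec, text: str) -> Vec:
    """Shift a target position by the offset implied by the correction."""
    offset = correction_offset(text)
    return tuple(t + o for t, o in zip(target, offset))


if __name__ == "__main__":
    print(correction_offset("move your arm a bit to the left"))
    print(apply_correction((10.0, 5.0, 3.0), "put the clip a bit higher"))
```

A keyword table like this breaks as soon as the phrasing changes, which is the practical argument for learning the mapping from annotated demonstrations instead.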
SRT-H could also recover from small mistakes it made while performing the procedure. Compared with an expert human surgeon performing the same procedure, it was equally precise, although a bit slower.
Kim argued that the method could generalize beyond gallbladder removal: "You can take any kind of surgery, not just this one, train the robot in the same way, and it will be able to perform that surgery." That claim is about the training approach described in the study, not a statement that the system is ready for all operations today.
The data problem ahead
The path from pig cadaver samples to live pigs and then potentially to humans depends on training data. According to the source, that data is difficult to obtain.
Intuitive Surgical is apparently willing to release video feed data from DaVinci robots, but it does not release the kinematics data. Kim says that kinematics data is necessary for training the algorithms.
He said, "I know people at Intuitive Surgical headquarters, and I've been talking to them," adding, "I've been begging them to give us the data. They did not agree." According to Kim, the company leadership's explanation was concern that competitors could reverse-engineer the mechanics of the robot.
Kim also sees a possible workaround. He told Ars, "We can start with attaching motion-tracking sensors to manual surgical tools, and get the kinematics data this way." Those recorded movements, made by expert human surgeons using manual tools, could then be recreated by conventional robotic arms like those used in STAR.
The broader direction is even more ambitious. Kim said he is currently at Stanford and involved in a humanoid robotics project focused on a general-purpose model, with the operating room as one possible application.
For now, SRT-H is best understood as a research milestone. It shows that an AI surgical robot can learn a structured procedure from demonstrations and adapt during execution, while also showing that the next stage may depend as much on access to data as on robot hardware or model design.