The Decoder, January 14, 2026

Why MedGemma 1.5 pushes medical AI into 3D scans

Google has introduced MedGemma 1.5 4B, an open-source medical AI model that can process three-dimensional CT and MRI data as well as histopathology slides. The release also includes MedASR for medical dictation, but Google says both models are developer starting points that require validation, customization and regulatory approval before they can be used in direct patient care.

Google is widening the scope of open-source medical AI with MedGemma 1.5, an updated model designed to work with more complex clinical data than its predecessor. The key change is support for three-dimensional CT and MRI images, moving beyond the earlier focus on 2D inputs such as X-rays and skin images.

The release also includes MedASR, a speech recognition model built for medical dictation. Together, the models show how Google is positioning medical AI around both image interpretation and clinical workflow, while still warning that these systems are not finished products for diagnosis or treatment.

MedGemma 1.5 moves from flat images to 3D medical data

MedGemma 1.5 4B is an updated version of Google's open-source medical image interpretation model. According to the source, Google describes it as the first publicly available language model able to interpret three-dimensional CT and MRI images.

That matters because CT and MRI scans are volumetric: each study is a stack of cross-sectional slices. Instead of requiring developers to analyze those slices one at a time, MedGemma 1.5 can take in entire CT scan volumes at once. The same broader-view approach applies to histopathology, where the model can evaluate multiple sections of a tissue sample together.

The practical implication is straightforward: when a model sees related sections in context, it may be able to identify relationships that are harder to notice in isolated images. Google says MedGemma 1.5 can handle this type of 3D medical data while keeping its ability to understand standard images and text.
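To make the slice-versus-volume distinction concrete, here is a minimal sketch in plain Python. It does not call MedGemma or any imaging library, and it says nothing about how the model works internally. The synthetic volume, the threshold and the helper names (`make_volume`, `slices_with_finding`, `longest_run`) are illustrative assumptions. The only point is that a feature spanning adjacent slices is a property of the stack, not of any single slice.

```python
# Illustrative only: a synthetic "scan" stored as a stack of 2D slices.
# Nothing here reflects how MedGemma reads CT or MRI data.

def make_volume(depth=6, height=4, width=4):
    """Build a depth x height x width volume of zeros with a bright
    voxel repeated on slices 2, 3 and 4 (a hypothetical finding)."""
    volume = [[[0.0] * width for _ in range(height)] for _ in range(depth)]
    for z in (2, 3, 4):
        volume[z][1][1] = 1.0
    return volume

def slices_with_finding(volume, threshold=0.5):
    """Per-slice view: indices of slices with any voxel above threshold."""
    return [
        z for z, sl in enumerate(volume)
        if any(v > threshold for row in sl for v in row)
    ]

def longest_run(indices):
    """Volume view: length of the longest run of consecutive slice indices."""
    best = run = 0
    prev = None
    for z in indices:
        run = run + 1 if prev is not None and z == prev + 1 else 1
        best = max(best, run)
        prev = z
    return best

volume = make_volume()
hits = slices_with_finding(volume)
print("slices flagged individually:", hits)
print("extent across the stack:", longest_run(hits))
```

Looking at one slice at a time yields three separate flags. Looking at the stack shows one finding that extends across three adjacent slices, which is the kind of context the article describes.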

The original MedGemma, launched last year, had already seen millions of downloads, according to Google, and led to hundreds of community variants on Hugging Face. MedGemma 1.5 builds on that developer interest by expanding the kinds of medical inputs the model can process.

Benchmark gains show progress, not a finished clinical system

Google's internal benchmarks show improvement over the previous version. CT classification accuracy rose three percentage points to 61 percent. MRI classification accuracy increased 14 percentage points to nearly 65 percent.

The model also improved on text-based medical tasks. On the MedQA medical reasoning benchmark, MedGemma 1.5 4B reached 69 percent, compared with 64 percent for the earlier version. For extracting information from electronic patient records, accuracy rose from 68 percent to 90 percent.
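The reported figures mix two forms: some give a gain in percentage points, others give before and after values. The short sketch below puts them on one footing. The numbers are the ones quoted above; "nearly 65" is treated as 65, so the MRI baseline it derives is approximate, and the implied earlier scores for CT and MRI are arithmetic from the article rather than figures Google reported. The `summarize` helper is a hypothetical name.

```python
# Figures as quoted in the article; "nearly 65" is rounded to 65 here.
reported = {
    "CT classification": {"new": 61, "gain_points": 3},
    "MRI classification": {"new": 65, "gain_points": 14},
    "MedQA": {"new": 69, "old": 64},
    "EHR extraction": {"new": 90, "old": 68},
}

def summarize(entry):
    """Return (old score, gain in percentage points, relative gain in percent)."""
    new = entry["new"]
    # Derive the earlier score when only the gain was reported.
    old = entry["old"] if "old" in entry else new - entry["gain_points"]
    points = new - old
    relative = 100 * points / old
    return old, points, round(relative, 1)

for name, entry in reported.items():
    old, points, rel = summarize(entry)
    print(f"{name}: {old} -> {entry['new']} (+{points} points, +{rel}% relative)")
```

A gain in percentage points and a relative gain are different quantities: the 22-point rise in record extraction is a relative improvement of about a third over the earlier score.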

Those results point to a model that is becoming more useful across image and text workflows. But Google also says the technology remains early and incomplete. Developers may improve results by fine-tuning the model on their own specific datasets, but that does not turn it into a ready-made clinical product.

This distinction is central to the release. MedGemma 1.5 is being offered as a foundation for developers, not as a tool that should be placed directly into patient diagnosis or treatment decisions without further work.

MedASR targets medical dictation and voice interfaces

Alongside MedGemma 1.5, Google introduced MedASR, a speech recognition model trained on medical vocabulary. The model is designed for medical dictation, where general speech recognition systems can struggle with specialized terms and clinical phrasing.

According to Google, MedASR outperforms OpenAI's Whisper large-v3 model in the tested medical settings. It produced 58 percent fewer errors on X-ray dictations and 82 percent fewer errors on general medical dictations.
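The article reports relative error reductions but no absolute error rates, and it does not name the metric. The sketch below assumes the errors are counted as word error rate (WER), a common measure for speech recognition; that is an assumption, not something the source states. The dictation sentence and both transcripts are invented for illustration and are not output from Whisper or MedASR.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance (substitutions, insertions, deletions)
    divided by the number of words in the reference."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the ref words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1] / len(ref)

def relative_reduction(baseline_rate, new_rate):
    """Percent fewer errors relative to the baseline rate."""
    return 100 * (baseline_rate - new_rate) / baseline_rate

# Hypothetical dictation and two hypothetical transcripts.
reference = "no acute cardiopulmonary abnormality is seen"
general_asr = "no acute cardio pulmonary abnormality is scene"
medical_asr = "no acute cardiopulmonary abnormality is scene"

baseline = word_error_rate(reference, general_asr)
improved = word_error_rate(reference, medical_asr)
print(f"baseline WER: {baseline:.3f}")
print(f"improved WER: {improved:.3f}")
print(f"relative reduction: {relative_reduction(baseline, improved):.1f}%")
```

Read this way, "58 percent fewer errors" means the error rate falls to 42 percent of the baseline, whatever that baseline is. A hypothetical baseline of 0.50 would drop to 0.21.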

MedASR has two roles in Google's medical AI stack:

It can function as a transcription tool for clinical speech.
It can act as a voice interface for MedGemma.

That second role could allow doctors to interact with AI systems through speech rather than typing. The source frames this as a potential workflow shift, especially where dictation and image interpretation already sit close together in clinical practice.

Early users are testing practical medical AI workflows

The source identifies early adopters already testing the technology. Malaysian firm Qmed Asia uses MedGemma for a conversational interface tied to clinical treatment guidelines. Taiwan's National Health Insurance Administration has used the model to analyze over 30,000 pathology reports related to lung cancer surgeries.

These examples show the range of possible applications. One use case centers on retrieving or navigating clinical guidance through conversation. The other applies the model to large-scale analysis of pathology reports.

Both MedGemma 1.5 and MedASR are free for research and commercial use through Hugging Face and Google Cloud Vertex AI. That availability is important for developers who want to experiment with medical AI without starting from a closed system.

Still, Google stresses that validation and customization are required. The models' outputs should not be used for direct diagnosis or treatment. In other words, availability does not remove the need for clinical testing, oversight or responsibility.

Open source does not remove regulatory limits

MedGemma is open source, but its use is still constrained by Google's "Health AI Developer Foundations Terms of Use". The source says these terms add restrictions beyond the Apache 2.0 license used for the source code.

The most important limit concerns direct patient care. Using the model weights for direct patient diagnosis or treatment requires medical device approval from relevant authorities. Developers who build on MedGemma must also pass these restrictions on to third parties.

Google says it provides no medical advice through the models and assumes no liability. It also claims no copyright over the output, but users remain solely responsible for how that output is used.

The release arrives as medical AI competition intensifies. OpenAI recently acquired Torch for approximately $100 million to build a "medical memory for AI", launched a ChatGPT Health feature and introduced an AI service for healthcare providers. Anthropic has also entered the sector with Claude for Healthcare, a HIPAA-compliant solution that can access US databases such as Medicare and PubMed.

The broader signal is clear: leading AI labs see healthcare as a major market. For developers, MedGemma 1.5 adds a new open-source foundation for 3D medical imaging and clinical text work. For patient care, however, the boundary remains firm: these tools need validation, customization and regulatory approval before they can be used for direct diagnosis or treatment.