The Decoder, July 4, 2026

Why Leanstral 1.5 matters for formal math and code checks

Mistral AI released Leanstral 1.5, a free open-source model under the Apache 2.0 license for formal verification in Lean 4. It reports strong formal math benchmark results and found five previously unknown bugs while scanning 57 open-source repositories.

Mistral AI has released Leanstral 1.5, a free open-source model built for formal verification in the Lean 4 programming language. The model is available under the Apache 2.0 license and is offered through Hugging Face and a free API.

The release is aimed at a demanding part of AI-assisted reasoning: checking mathematical proofs and software correctness in a formal system. Lean 4 is designed for that kind of work, and Mistral AI is positioning Leanstral 1.5 as a model that can operate in both math-heavy and code-focused settings.

A model built around Lean 4

Lean 4 is both a programming language and a proof assistant, designed to formally verify mathematical proofs and software correctness. That sets it apart from ordinary programming or plain text generation: the goal is not only to produce an answer that looks plausible. The output has to be accepted by a system that checks each statement formally.
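
To make the idea of a formally checked statement concrete, here is a minimal Lean 4 sketch that uses only the core library. The theorem names are illustrative assumptions; they do not come from Mistral AI's release or from any benchmark.

```lean
-- A statement about every natural number, with a proof Lean must accept.
-- `n + 0` reduces to `n` by the definition of addition, so `rfl` suffices.
theorem add_zero_example (n : Nat) : n + 0 = n := rfl

-- Commutativity does not follow from unfolding alone; the proof here is
-- the core-library lemma `Nat.add_comm`.
theorem add_comm_example (a b : Nat) : a + b = b + a := Nat.add_comm a b

-- A plausible-looking but false claim is rejected by the checker.
-- Uncommenting the next line makes the file fail to compile:
-- theorem bogus (n : Nat) : n + 1 = n := rfl
```

The commented-out theorem is the point of the example: a model's output only counts if Lean accepts it, so a convincing but wrong proof does not pass.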

Leanstral 1.5 was trained mainly for math, according to Mistral AI. The training process involved mid-training, supervised fine-tuning, and reinforcement learning. Those steps are part of how the company prepared the model for formal verification tasks in Lean 4.

The important point is the target use case. Formal verification is about reducing ambiguity. In math, that means working with proofs in a way that can be checked. In software, it means examining correctness properties rather than relying only on manual review or informal reasoning.

The benchmark results are centered on formal math

Mistral AI reports that Leanstral 1.5 reaches 100 percent on miniF2F. The source describes miniF2F as a formal math benchmark that covers problems from high school level up to math olympiad difficulty.

The model also posts a strong result on PutnamBench. PutnamBench includes 672 problems from the Putnam math competition, and Leanstral 1.5 solves 587 of them.
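
As a quick worked calculation from the two figures above (the percentage is derived here, not quoted from the source):

```latex
\[
\frac{587}{672} \approx 0.8735
\quad\Longrightarrow\quad
\text{about } 87.4\% \text{ of problems solved},
\qquad
672 - 587 = 85 \text{ problems unsolved}.
\]
```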

For algebra, Mistral AI reports top results on FATE-H and FATE-X. These benchmarks test master's- and doctoral-level tasks in areas such as group theory and ring theory. Leanstral 1.5 scores 87 percent and 34 percent on the two benchmarks, respectively.
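
For a sense of what a group-theory statement looks like in Lean 4, the sketch below states and proves a basic fact about inverses. It assumes the Mathlib library is installed; the source does not say which libraries the FATE benchmarks rely on. The statement is not taken from FATE-H or FATE-X and is far easier than a benchmark task.

```lean
import Mathlib

-- In any group G, the inverse of a product reverses the order of the factors.
-- `mul_inv_rev` is the Mathlib lemma that states exactly this.
example {G : Type*} [Group G] (a b : G) : (a * b)⁻¹ = b⁻¹ * a⁻¹ :=
  mul_inv_rev a b
```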

Taken together, the benchmark claims describe a model with a clear specialization. It is not presented as a general chatbot in the source article. It is presented as an open-source model focused on formal math and formal verification, with results across several difficulty levels and benchmark types.

Code verification is part of the story

Although Leanstral 1.5 was trained mainly for math, Mistral AI says it also performs well at code verification. That matters because the same formal verification setting can apply to software correctness, not just mathematical proof work.

In a hands-on test, the model scanned 57 open-source repositories. During that scan, it found five previously unknown bugs. One example named in the source is an overflow bug in the Rust library varinteger.

That result is narrower than the math benchmark claims, but it still matters: it shows the model being applied to real repositories, not only benchmark problems. The source does not give further details about the five bugs, so the safest conclusion is limited: Mistral AI says the model found previously unknown issues during that test.
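
The source does not describe the varinteger bug itself, so the following Lean 4 sketch is only a generic illustration of how an overflow can be stated and checked formally. The functions `wrappingAdd8` and `checkedAdd8` are hypothetical models of 8-bit addition written for this example; they are not code from varinteger or from Mistral AI.

```lean
-- Hypothetical model of 8-bit addition that wraps around on overflow.
def wrappingAdd8 (a b : Nat) : Nat := (a + b) % 256

-- Hypothetical checked variant: returns `none` instead of wrapping.
def checkedAdd8 (a b : Nat) : Option Nat :=
  if a + b < 256 then some (a + b) else none

-- The overflow made visible: 255 + 1 silently becomes 0.
example : wrappingAdd8 255 1 = 0 := rfl

-- The checked variant reports the same case as a failure.
example : checkedAdd8 255 1 = none := rfl

-- A property proved for every input, not just tested on samples:
-- the wrapped result always fits in 8 bits.
theorem wrappingAdd8_lt (a b : Nat) : wrappingAdd8 a b < 256 := by
  unfold wrappingAdd8
  exact Nat.mod_lt _ (by decide)
```

The two `example` lines check single cases, much like unit tests. The final theorem is different in kind: it holds for all inputs, which is the sort of guarantee formal verification aims for.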

Why the open-source release matters

Leanstral 1.5 is described as free and open-source under the Apache 2.0 license. It is available through Hugging Face and a free API, which gives users more than one route to access it.

For researchers and developers working with Lean 4, availability is an important part of the release. A model focused on formal verification is most useful when it can be tested against real proof and code workflows. The source article points to both hosted access and Hugging Face availability, without adding further deployment details.

The release also connects two related uses of formal systems:

Mathematical proof work: Lean 4 is used to formally verify mathematical proofs, and Leanstral 1.5 reports strong results across formal math benchmarks.
Software correctness: Lean 4 is also designed for software correctness, and Mistral AI says the model found real bugs in open-source repositories.

The common thread is verification. Leanstral 1.5 is not just generating explanations about math or code. Based on the source article, its purpose is to work in a formal environment where correctness can be checked more rigorously.

The practical takeaway

Leanstral 1.5 gives Mistral AI an open-source model aimed at a specialized technical audience. Its reported results on miniF2F, PutnamBench, FATE-H, and FATE-X make the math side of the release the main headline. Its code-verification test adds a second use case by showing the model finding previously unknown bugs in open-source repositories.

The source does not provide every implementation detail, and it does not describe the full scope of the code-verification test. What it does show is a model built for Lean 4, released under an open license, and evaluated on formal math benchmarks as well as a hands-on software scan.

For anyone following AI in formal verification, Leanstral 1.5 is notable because it combines open-source access, Lean 4 specialization, benchmark performance, and an example of real code bug discovery in one release.