Large language models can produce polished writing and code, but the source article describes a core limitation: they do not have even a basic ability to learn from experience. Researchers at the Massachusetts Institute of Technology (MIT) are exploring a way around that limit with Self-Adapting Language Models (SEAL), a scheme designed to let an AI model keep improving after its initial training.
The result is not an endlessly self-improving system. It is, however, a meaningful step toward AI that can absorb useful new information, adjust its own parameters, and potentially become more responsive to a user’s interests and preferences.
What SEAL changes about large language models
Most modern LLMs can use information during a conversation, but that does not mean the model itself has learned something for the long term. Adam Zweiger, an MIT undergraduate researcher involved with building SEAL, makes that distinction clear: newer models may “reason” toward better answers during inference, but the model does not necessarily carry that benefit forward.
SEAL takes a different path. Instead of only producing an answer, the model creates material that can be used to update itself. Based on the input it receives, the LLM generates both its own synthetic training data and an update procedure for applying that data.
Jyothish Pari, a PhD student at MIT involved with developing SEAL, says the project began with a question about whether tokens could trigger a meaningful model update. Tokens, as the source article describes them, are the units of text that LLMs take in and generate. The broader idea was to test whether a model’s own output could become part of its learning process.
How the self-updating loop works
The SEAL process begins with input that contains potentially useful information. In one example from the source, the model was given a statement about the challenges faced by the Apollo space program. It then generated new passages that tried to explain the implications of that statement.
The researchers compared this behavior to the way a human student writes and reviews notes to aid learning. The comparison matters because SEAL is not just retrieving a fact. It is using generated text as a bridge between new information and a change inside the model.
After generating the data, the system updates the model. The updated model is then tested on how well it answers a set of questions. That result becomes a reinforcement learning signal, helping steer the model toward updates that improve its abilities and support continued learning.
In practical terms, SEAL connects several steps into one loop:
- The model receives useful new information.
- It generates synthetic training data from that information.
- It applies an update to its own weights (its parameters).
- Its new performance is tested against questions.
- A reinforcement learning signal guides future updates.
That loop is the heart of the approach. It gives the model a mechanism for turning new input into longer-lasting internal change, rather than treating every interaction as something that fades once the session ends.
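The loop above can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not the MIT implementation: the "model" is a plain dictionary rather than a neural network, `apply_update` copies facts instead of fine-tuning weights, and every name here (`make_model`, `generate_self_edit`, `apply_update`, `evaluate`, `seal_step`, the two note-writing strategies, and the placeholder passage) is hypothetical.

```python
import copy

def make_model():
    # Toy stand-in for an LLM: "facts" plays the role of weights, and
    # "policy" holds a preference score per note-writing strategy.
    return {"facts": {}, "policy": {"copy": 1.0, "implications": 1.0}}

def generate_self_edit(passage, strategy):
    """Produce synthetic training data (question -> answer) from a passage."""
    if strategy == "copy":
        # Restate the passage verbatim under a single key.
        return {"passage": passage["text"]}
    # "implications": spell out what the passage implies.
    return dict(passage["implications"])

def apply_update(model, self_edit):
    """Return an updated copy of the model (stand-in for a fine-tuning step)."""
    updated = copy.deepcopy(model)
    updated["facts"].update(self_edit)
    return updated

def evaluate(model, questions):
    """Fraction of questions the model answers correctly."""
    correct = sum(model["facts"].get(q) == a for q, a in questions.items())
    return correct / len(questions)

def seal_step(model, passage, questions):
    """One pass of the loop: generate data, update, test, reinforce."""
    baseline = evaluate(model, questions)
    best_model, best_reward, best_strategy = model, baseline, None
    for strategy in model["policy"]:
        candidate = apply_update(model, generate_self_edit(passage, strategy))
        reward = evaluate(candidate, questions)
        if reward > best_reward:
            best_model, best_reward, best_strategy = candidate, reward, strategy
    if best_strategy is not None:
        # Reward signal: favor the strategy whose update helped most.
        best_model["policy"][best_strategy] += best_reward - baseline
    return best_model, best_reward

# Placeholder data, invented purely for illustration.
passage = {
    "text": "placeholder statement",
    "implications": {"placeholder question": "placeholder answer"},
}
questions = {"placeholder question": "placeholder answer"}
model, score = seal_step(make_model(), passage, questions)
print(score, model["policy"])
```

Trying every strategy and keeping the best one is a simplification chosen to keep the sketch deterministic. A real system would sample generated text from the model and apply a proper reinforcement learning update rather than bumping a score in a dictionary.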
Where MIT tested the approach
The MIT researchers tested SEAL on small and medium-size versions of two open-source models: Meta’s Llama and Alibaba’s Qwen. According to the source, the researchers say the approach ought to work for much larger frontier models too.
The tests covered text as well as ARC, a benchmark used to gauge an AI model’s ability to solve abstract reasoning problems. In both settings, the researchers observed that SEAL allowed the models to keep learning beyond their initial training.
That is the central promise of the work. If an LLM can continue to update itself in a useful way, it could become better at incorporating new information over time. It could also support more personalized AI tools, because the system may be able to learn from signals about a user’s interests and preferences.
Pulkit Agrawal, a professor at MIT who oversaw the work, connects SEAL to a larger problem in AI: how a model can determine for itself what it should try to learn. He also frames the motivation in direct terms: “LLMs are powerful but we don’t want their knowledge to stop.”
The limits are still significant
SEAL is not presented as a finished solution. The source article is explicit that it is not yet a way for AI to improve indefinitely. One major issue is catastrophic forgetting, where learning new information can cause older knowledge to disappear.
Agrawal notes that the LLMs tested with SEAL suffer from this effect. The source also says this may point to a fundamental difference between artificial neural networks and biological ones.
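Catastrophic forgetting can be seen even in a model with a single parameter. The sketch below is an assumption-laden toy, not the setup used with SEAL: it fits `y = w * x` to one made-up task, then to a second, and measures how the error on the first task changes. The tasks, learning rate, and function names (`train`, `loss`) are all invented for illustration.

```python
def train(w, data, lr=0.1, epochs=200):
    """Plain gradient descent on mean squared error for the model y = w * x."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def loss(w, data):
    """Mean squared error of y = w * x on the given data."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

xs = [0.5, 1.0, 1.5]
task_a = [(x, 2.0 * x) for x in xs]    # "old knowledge": y = 2x
task_b = [(x, -1.0 * x) for x in xs]   # "new information": y = -x

w_after_a = train(0.0, task_a)
loss_a_before = loss(w_after_a, task_a)   # near zero: task A is learned

w_after_b = train(w_after_a, task_b)      # keep training, but only on task B
loss_a_after = loss(w_after_b, task_a)    # large: task A has been overwritten

print(loss_a_before, loss_a_after)
```

Because the only parameter is shared by both tasks, fitting the second task necessarily undoes the first. Large models have far more capacity, but the same mechanism, new gradients overwriting what old gradients built, is what the term refers to.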
There is also a compute problem. Pari and Zweiger note that SEAL is computationally intensive. They also say it remains unclear how to schedule new periods of learning most effectively.
One idea mentioned by Zweiger is that LLMs could have periods of “sleep” in which new information is consolidated. The point is not that this is already solved, but that continual learning may require a rhythm rather than constant updating.
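One way to picture that rhythm is a buffer that collects new information and only occasionally triggers an expensive update. The sketch below is purely hypothetical: the source does not describe how such scheduling would work, and the class name `SleepScheduler`, the `batch_size` threshold, and the idea of consolidating whenever the buffer fills are all assumptions made for illustration.

```python
class SleepScheduler:
    """Buffers new items and consolidates them in batches (hypothetical design)."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.buffer = []          # information seen while "awake"
        self.consolidated = []    # stand-in for knowledge written into weights
        self.sleep_count = 0

    def observe(self, item):
        """Record a new item; trigger a sleep phase once the buffer is full."""
        self.buffer.append(item)
        if len(self.buffer) >= self.batch_size:
            self.sleep()

    def sleep(self):
        """Stand-in for one expensive weight update over everything buffered."""
        self.consolidated.extend(self.buffer)
        self.buffer.clear()
        self.sleep_count += 1

scheduler = SleepScheduler(batch_size=3)
for i in range(7):
    scheduler.observe(f"note-{i}")

print(scheduler.sleep_count, scheduler.buffer)
```

Batching trades freshness for cost: the model lags behind the newest items in the buffer, but it pays for an update once per batch instead of once per item. Whether a fixed threshold is the right trigger is exactly the kind of open scheduling question the researchers describe.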
Why continual learning matters
The importance of SEAL is not only that it improves model performance in tests. It points toward a different relationship between AI systems and new information. A model that can keep learning would not be limited to what it absorbed before deployment, and it could become more capable of adapting to what matters in a particular setting.
That direction has obvious appeal for chatbots and other AI tools. If a model can update itself from useful interactions, it may become better aligned with a user’s preferences without relying only on short-term context.
For now, the work remains a research path with clear constraints. SEAL has shown continued learning on text and ARC using versions of Llama and Qwen, but catastrophic forgetting, compute intensity, and scheduling remain unresolved. Even so, the approach gives AI researchers a concrete way to study how language models might move beyond static training and toward systems whose knowledge does not simply stop.