New research from Goodfire.ai points to a sharper divide inside AI neural networks than many outside the field might expect: the systems may store memorized content and handle some logical tasks through different internal pathways.
The finding matters because two abilities emerge side by side when AI language models are trained on data. One is memorization, such as repeating exact text the model has seen before. The other is what researchers often call reasoning, meaning the model applies learned patterns or general principles to new inputs.
What the researchers found
In a preprint paper released in late October, Goodfire.ai researchers described a method for separating memorization from problem-solving behavior in neural networks. When they removed memorization pathways, models lost 97 percent of their ability to recite training data verbatim, while keeping nearly all their logical reasoning ability.
The split appeared in the model's weight components, which are the mathematical values that process information. At layer 22 in Allen Institute for AI's OLMo-7B language model, the researchers ranked weight components from high to low using a measure called curvature.
The pattern was clear in that ranking. The bottom 50 percent of weight components showed 23 percent higher activation on memorized data. The top 10 percent showed 26 percent higher activation on general, non-memorized text.
That means the parts most associated with memorization clustered in one area of the ranking, while components linked to broader problem-solving clustered elsewhere. This allowed the researchers to remove low-ranked components tied to memorization while preserving higher-ranked components used for reasoning tasks.
Why arithmetic is the surprise
The most striking result involves arithmetic. The researchers found that basic mathematical operations appeared to share neural pathways with memorization rather than with logical reasoning.
After removing memorization circuits, mathematical performance fell to 66 percent. Logical tasks, by contrast, stayed nearly untouched.
This suggests that current language models may treat arithmetic less like a live calculation and more like recalled content. In the source's terms, a model may handle "2+2=4" more like a memorized fact than a logical operation.
That does not mean all reasoning disappeared. The logical reasoning that remained included tasks such as evaluating true/false statements and following if-then rules. The source also notes that this form of reasoning is not the same as deeper mathematical reasoning needed for proofs or novel problem-solving.
How curvature separates memory from logic
To explain the technique, the researchers used the idea of a loss landscape. Loss measures how many mistakes an AI model makes, while the landscape describes how that error rate changes as the model's internal weights are adjusted.
During training, AI models use gradient descent to adjust those weights toward settings that produce fewer mistakes. The researchers examined the curvature of this landscape, meaning how sensitive the model's performance is to small changes in different weights.
High curvature means small changes can create large effects. Low curvature means changes have less impact. The researchers used K-FAC, or Kronecker-Factored Approximate Curvature, to analyze those patterns.
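A toy example can make the idea of curvature concrete. The sketch below is not the researchers' code: it assumes a hypothetical two-weight model with a simple quadratic loss, where one weight sits in a steep (high-curvature) direction and the other in a nearly flat one. It runs a few steps of gradient descent, then compares how much the loss rises when each weight is nudged by the same amount.

```python
import numpy as np

# Toy loss L(w) = 0.5 * w^T H w, with its minimum at w = 0.
# H is the Hessian; its diagonal entries are the curvature along each axis.
# All numbers here are illustrative assumptions, not values from the paper.
H = np.diag([100.0, 0.01])   # axis 0: high curvature, axis 1: low curvature

def loss(w):
    return 0.5 * w @ H @ w

def grad(w):
    return H @ w

# Gradient descent: repeatedly step against the gradient to reduce the loss.
w = np.array([1.0, 1.0])
lr = 0.01
for _ in range(200):
    w = w - lr * grad(w)

# Sensitivity: move away from the minimum by the same amount along each axis.
eps = 0.1
rise_high = loss(np.array([eps, 0.0]))  # nudge along the high-curvature axis
rise_low = loss(np.array([0.0, eps]))   # nudge along the low-curvature axis

print(f"weights after training: {w}")
print(f"loss rise, high-curvature axis: {rise_high:.6f}")
print(f"loss rise, low-curvature axis:  {rise_low:.6f}")
```

In this toy setup the same nudge costs about 10,000 times more loss along the steep axis than along the flat one, which is the sense in which high curvature means small changes have large effects.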
The researchers explain that individual memorized facts create sharp spikes in the landscape, but those spikes point in different directions. When averaged together, they look flat. Reasoning abilities, by contrast, rely on shared mechanisms used by many inputs, so they retain more consistent curvature.
In the researchers' description, directions that implement shared mechanisms used by many inputs add coherently and remain high-curvature on average.
The researchers contrasted that with memorization, which uses "idiosyncratic sharp directions associated with specific examples" that appear flat when averaged across data.
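The averaging argument can be illustrated with a small numerical sketch. It is a simplified assumption rather than the paper's setup: each hypothetical training example contributes moderate curvature along one direction that all examples share, plus very sharp curvature along a private direction that no other example uses.

```python
import numpy as np

d, n = 50, 40                      # parameter dimension, number of examples
shared = np.eye(d)[0]              # one direction every example relies on
private = np.eye(d)[1:n + 1]       # a distinct sharp direction per example

def example_hessian(i):
    # Moderate curvature along the shared direction,
    # very sharp curvature along this example's own private direction.
    return (1.0 * np.outer(shared, shared)
            + 10.0 * np.outer(private[i], private[i]))

# Curvature of the loss averaged over the whole dataset.
mean_hessian = sum(example_hessian(i) for i in range(n)) / n

def curvature(hessian, direction):
    # Second derivative of the loss along a unit direction.
    return direction @ hessian @ direction

single_private = curvature(example_hessian(0), private[0])  # sharp for one example
avg_private = curvature(mean_hessian, private[0])           # diluted by averaging
avg_shared = curvature(mean_hessian, shared)                # survives averaging

print(single_private, avg_private, avg_shared)
```

For a single example, the private direction is ten times sharper than the shared one. After averaging over 40 examples, its curvature falls to a quarter of the shared direction's, because only one example contributes to it while all 40 contribute to the shared direction.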
Tests across model types
The team tested the technique on multiple AI systems. They primarily used Allen Institute's OLMo-2 family of open language models, including the 7-billion- and 1-billion-parameter versions, because their training data is openly accessible.
For vision models, they trained custom 86-million-parameter Vision Transformers, or ViT-Base models, on ImageNet with intentionally mislabeled data. That setup gave them controlled memorization to study. They also compared their approach with existing memorization removal methods such as BalancedSubnet.
When the team selectively removed low-curvature weight components, memorized content dropped to 3.4 percent recall from nearly 100 percent. Logical reasoning tasks remained at 95 to 106 percent of baseline performance.
Those logic tasks included Boolean expression evaluation, logical deduction puzzles involving relationships such as "if A is taller than B," object tracking through multiple swaps, and benchmarks including BoolQ, Winogrande, and OpenBookQA.
Some tasks sat between the two extremes. Mathematical operations and closed-book fact retrieval shared pathways with memorization, dropping to 66 to 86 percent performance after editing.
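The editing step described in this section, ranking weight components by curvature and removing the low-ranked ones, can be sketched for a single linear layer. This is a minimal illustration under stated assumptions, not the researchers' implementation: the activations, gradients, and weights are random stand-ins, the layer sizes are arbitrary, and `keep_fraction` is a hypothetical parameter. It relies on the standard K-FAC approximation, in which a layer's curvature factors into the second moments of its inputs and of its output gradients.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_in, d_out = 512, 8, 6

# Stand-in data (hypothetical): layer inputs and gradients of the loss
# with respect to the layer outputs, one row per example.
acts = rng.normal(size=(n, d_in))
grads = rng.normal(size=(n, d_out))
W = rng.normal(size=(d_out, d_in))      # weights of one linear layer

# K-FAC factors: second moments of the inputs and of the output gradients.
A = acts.T @ acts / n                   # shape (d_in, d_in)
G = grads.T @ grads / n                 # shape (d_out, d_out)

# Eigenbases of both factors. Each pair (i, j) defines one weight component,
# with approximate curvature equal to the product of the two eigenvalues.
lam_A, U_A = np.linalg.eigh(A)
lam_G, U_G = np.linalg.eigh(G)
curv = np.outer(lam_G, lam_A)           # shape (d_out, d_in)

# Express W in that basis, zero the lowest-curvature components, map back.
coeffs = U_G.T @ W @ U_A
keep_fraction = 0.5                     # illustrative choice, not from the paper
threshold = np.quantile(curv, 1.0 - keep_fraction)
mask = curv >= threshold
W_edited = U_G @ (coeffs * mask) @ U_A.T

print(f"kept {mask.sum()} of {mask.size} components")
```

With real activations and gradients in place of the random stand-ins, the components that survive the mask would be the high-curvature ones the article associates with shared mechanisms, while the zeroed ones would be the low-curvature components associated with memorization.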
What this could mean for AI safety and privacy
If the technique develops further, it could eventually help AI companies remove specific memorized material from models without destroying useful capabilities. The source names copyrighted content, private information, and harmful memorized text as possible targets.
That possibility is important because AI neural networks store information in distributed ways that are still not fully understood. The researchers caution that their method "cannot guarantee complete elimination of sensitive information." For now, the result is best understood as an early step in a new research direction.
The larger implication is practical. If memorization and reasoning can be mapped and edited separately, AI developers may gain more precise tools for controlling what models retain from training data and what capabilities they keep after editing.