Google’s AI workload expanded sharply in June, with its systems processing over 980 trillion tokens during the month. That total was more than double the amount in May, according to Google product manager Logan Kilpatrick and DeepMind CEO Demis Hassabis.
The number is a useful signal because tokens are the basic units AI models use to understand prompts and generate responses. A token is a short text chunk, and every AI interaction can require many of them as the model reads, reasons and replies.
What the token jump shows
Over 980 trillion tokens in a single month is large on its own, but the comparison with May is what makes the change stand out. Google’s AI systems did not grow by a small margin; they processed more than double the previous month’s amount.
That increase may point to higher use of Google’s AI tools. More users, more prompts or more generated responses would all push token volume upward. But the source also identifies another possible driver: more use of so-called reasoning models.
Reasoning models can process many more tokens while working toward more accurate responses. That means total token volume can rise even when the visible user interaction does not look dramatically larger. A single task may involve more internal processing than earlier systems required.
Why reasoning models change the meaning of usage
With conventional usage metrics, growth is usually read as a straightforward sign of adoption. With AI systems, token counts are more complicated. They can reflect both how often people use a system and how much computation each response requires.
The source points to Gemini Flash 2.5 as an example of the shift. This kind of model is designed to process more information in order to produce more accurate answers. As a result, a rising token count may show that Google is handling not only more AI activity, but also heavier AI activity.
That distinction matters. If users ask similar questions but the underlying model now performs more reasoning, token volume can climb quickly. The user may see a better answer, while the system has done far more work behind the scenes.
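A minimal sketch of that distinction, assuming total volume can be approximated as the number of requests multiplied by the average tokens per request. The function name and all figures below are hypothetical and chosen only for illustration; they are not Google's numbers.

```python
def total_tokens(requests: int, tokens_per_request: int) -> int:
    """Approximate total volume as requests x average tokens per request."""
    return requests * tokens_per_request


# Hypothetical baseline month (illustrative values, not from the source).
baseline = total_tokens(requests=1_000_000, tokens_per_request=500)

# Scenario A: twice as many requests, same work per request.
more_usage = total_tokens(requests=2_000_000, tokens_per_request=500)

# Scenario B: same number of requests, but each one consumes twice the tokens.
more_reasoning = total_tokens(requests=1_000_000, tokens_per_request=1_000)

print(more_usage / baseline)      # 2.0
print(more_reasoning / baseline)  # 2.0
```

Both scenarios double the total, which is why the aggregate figure alone cannot say which driver was responsible.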
The cost signal behind Gemini Flash 2.5
Artificial Analysis reports that Flash 2.5 uses about 17 times more tokens than the previous version and is 150 times more expensive for reasoning tasks.
Those two figures explain why the June number is more than a scale milestone. Token growth can also be a cost signal. If more tasks are handled by models that consume many more tokens, the economics of serving AI responses become more demanding.
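As rough arithmetic, the two reported multipliers can be combined. The sketch below assumes that both figures compare the same pair of model versions on the same tasks, and that cost equals tokens multiplied by price per token. Neither assumption is confirmed by the source, so the result is an illustration rather than a reported number.

```python
TOKEN_MULTIPLIER = 17   # reported: about 17 times more tokens
COST_MULTIPLIER = 150   # reported: 150 times more expensive for reasoning tasks

# Under the assumption cost = tokens x price per token, the part of the cost
# increase not explained by token count would come from price per token.
implied_price_multiplier = COST_MULTIPLIER / TOKEN_MULTIPLIER

print(round(implied_price_multiplier, 1))  # 8.8
```

Under those assumptions, the token increase alone would not account for the cost increase; the effective price per token would also have to be several times higher.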
The source does not say how much of Google’s June total came from Gemini Flash 2.5 specifically. It also does not separate user growth from model intensity. But it does show why usage, reasoning intensity and cost should be considered together when interpreting the rise:
- More usage can increase the number of prompts and responses.
- More reasoning can increase the number of tokens used per task.
- More expensive reasoning tasks can make the same category of interaction cost much more to run.
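Because the source does not separate the first two factors, one way to reason about them is to ask what each assumed level of usage growth would imply for tokens per task. The sketch below treats "more than double" as a 2.0x lower bound and uses hypothetical usage-growth values; the function name is illustrative only.

```python
def implied_intensity_growth(total_growth: float, usage_growth: float) -> float:
    """Growth in tokens per task implied by total growth and assumed usage growth.

    Assumes total_growth = usage_growth x intensity_growth.
    """
    if usage_growth <= 0:
        raise ValueError("usage_growth must be positive")
    return total_growth / usage_growth


TOTAL_GROWTH = 2.0  # lower bound from "more than double"; the exact ratio is not given

# Hypothetical usage-growth scenarios; none of these values come from the source.
for usage_growth in (1.0, 1.25, 1.5, 2.0):
    intensity = implied_intensity_growth(TOTAL_GROWTH, usage_growth)
    print(f"usage x{usage_growth:.2f} -> tokens per task x{intensity:.2f}")
```

The less usage grew, the more of the rise has to be attributed to heavier processing per task, and the reverse also holds.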
What to watch next
The June total suggests that Google’s AI systems are moving into a phase where raw interaction counts are not enough to understand scale. Tokens offer a closer view of how much work AI models are actually doing.
If reasoning models become a larger share of use, token totals may keep rising faster than ordinary user-facing activity would suggest. That does not automatically mean every increase is caused by new users. It may also mean that each response requires more internal processing.
For Google, the figure shows the scale of AI demand across its systems. For the broader AI market, it highlights a central tension: models that aim for more accurate responses may also require far more tokens and higher reasoning-task costs.
The June number is therefore not just a usage milestone. It is a glimpse into how AI workloads are changing as reasoning models become more prominent, and why token volume is becoming one of the clearest ways to track that change.