AI text detection is often discussed as if it were a hunt for obvious wording. Max Spero, CEO of AI text detector Pangram, describes a more complicated picture: the strongest clues may sit in the way a document is built, not only in the phrases that appear on the page.
In an interview published on AI Policy Perspectives, Spero said Pangram’s deep-learning classifier is not fully transparent even to its own creators. The result is a detector that can point to clues while still relying on patterns that are difficult to explain in plain language.
The detector is powerful, but not fully interpretable
Spero describes Pangram’s classifier as a black box. His explanation is direct: “We don’t have a ton of interpretability into why it makes the predictions that it does,” he said.
That matters because the tool is not simply matching a fixed list of phrases. According to the interview, Pangram can surface suspicious phrases as clues, but the model also looks for structural patterns that a language model leaves behind while organizing a document.
In practical terms, the visible clue and the deeper reason may not be the same thing. A phrase might help a reader see why a passage was flagged, but the classifier may be responding to a broader shape in the text. Spero’s comments suggest that Pangram can identify signals without always being able to translate them into a simple rule.
LLMs may be clean writers, but that is not the whole signal
Spero does not frame the issue as poor grammar or weak logic. In fact, he says language models “might be” better than the average human at grammar and logic.
The problem he highlights is uniformity. A language model can produce polished writing, but the range of arguments it makes may still be narrower than the range human writers produce. That narrowness becomes a signal when the same types of reasoning appear again and again.
This distinction is important. If a detector were only looking for mistakes, stronger language models could become harder to identify simply by writing more cleanly. Pangram’s approach, as described by Spero, points in another direction: the recurring structure of an LLM-generated document may remain detectable even when the surface prose looks competent.
The repeated-argument problem
Spero’s clearest example is about argument variety. Ask an LLM for 100 arguments on a topic, he says, and the responses will cluster in a narrow band. By contrast, “the space of human arguments is going to be very diverse.”
That comparison gets to the heart of Pangram’s stated detection logic. The issue is not that a single AI-written paragraph must contain an obvious marker. It is that language models may tend to organize ideas in similar ways across many outputs.
For readers, this helps explain why AI detection can feel different from ordinary editing. An editor might notice repetition, vague phrasing, or formulaic transitions. Pangram’s classifier is described as looking at something related but deeper: patterns left by the model’s method of arranging a document.
Spero’s account points to three related signals:
- Suspicious phrases can appear as visible clues.
- Structural patterns may carry more weight in the prediction.
- Uniform arguments can make LLM writing less varied than human writing.
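To make the “narrow band” idea concrete, here is a minimal Python sketch that scores how varied a set of arguments is. It is an illustration only, not Pangram’s method: Spero says the classifier’s internals are not fully interpretable, and the interview does not describe any metric. The function names (`tokens`, `jaccard_distance`, `mean_pairwise_distance`) and both sample lists are hypothetical, and shared vocabulary is assumed here as a crude stand-in for similarity of argument.

```python
from itertools import combinations


def tokens(text: str) -> set[str]:
    """Lowercased word set for one argument (a deliberately crude representation)."""
    return {word.strip(".,;:!?\"'()").lower() for word in text.split()} - {""}


def jaccard_distance(a: set[str], b: set[str]) -> float:
    """1 - |A & B| / |A | B|: 0.0 for identical word sets, 1.0 for disjoint ones."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def mean_pairwise_distance(arguments: list[str]) -> float:
    """Average Jaccard distance over all unordered pairs of arguments."""
    sets = [tokens(arg) for arg in arguments]
    pairs = list(combinations(sets, 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy inputs, invented for illustration only.
clustered = [
    "Remote work improves productivity and saves commuting time",
    "Remote work improves productivity and reduces commuting time",
    "Remote work boosts productivity and saves commuting time",
]
varied = [
    "Remote work improves productivity and saves commuting time",
    "My grandmother never trusted offices with open floor plans",
    "Cities lose lunch trade when towers sit empty",
]

print("clustered:", mean_pairwise_distance(clustered))
print("varied:   ", mean_pairwise_distance(varied))
```

A higher score means the arguments share fewer words. On these toy lists the clustered set scores lower than the varied one, which is the shape of the contrast Spero draws. A real comparison would need far more text and a better notion of similarity than shared vocabulary.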
What this says about fooling an AI text detector
The interview’s takeaway is simple: anyone trying to fool Pangram would need better arguments. Based on Spero’s comments, changing a few words may not be enough if the underlying document still follows the same narrow argumentative path.
That does not mean Pangram’s decisions are fully explainable. Spero’s own description makes clear that the classifier has limited interpretability. The company can surface clues, but it does not fully understand every pattern the model uses when it decides that a text looks suspicious.
This creates a useful tension. Pangram is presented as a detector that can identify structural traces left by language models, while also being a system whose internal reasoning is not completely open. Its strength, in Spero’s account, comes from recognizing patterns that are difficult for humans to list neatly.
The larger takeaway for AI writing
Spero’s comments shift the focus from whether an AI-written sentence sounds human to whether an AI-written document argues in a varied way. Grammar and logic may be strong, but sameness can still stand out.
That is the central point for anyone following AI text detection. The future of detecting LLM writing may depend less on spotting awkward wording and more on recognizing repeated forms of argument. Pangram’s CEO is effectively saying that the tell is not just what the model says, but how consistently it builds the case.
For a tool like Pangram, explaining a result is a challenge in its own right. If the classifier is a black box, users may see a prediction and some clues without getting a complete account of why the prediction happened. Spero’s interview makes that limitation part of the story rather than hiding it.