A controversial football security decision in the UK has become a case study in how an AI-generated error can travel from a research step into public policy. West Midlands Police has now acknowledged that Microsoft Copilot was involved in producing a false claim about a match that never happened.
The admission matters because the disputed information formed part of the background to a decision that kept Maccabi Tel Aviv fans away from an Aston Villa match. It also came only after the force had repeatedly denied that AI tools were used.
The decision that started the dispute
In October 2025, Birmingham’s Safety Advisory Group, or SAG, considered whether a football match between Aston Villa and Maccabi Tel Aviv could go ahead safely. West Midlands Police was a key member of the SAG and argued that the fixture could bring a risk of violence in Birmingham.
The wider context was already tense. The atmosphere was heightened in part by an October 2 terror attack on a synagogue in Manchester, in which several people were killed by an Islamist attacker.
Police recommended banning Maccabi Tel Aviv fans from the game. Their case pointed to claims about the behavior of those fans at a recent football match in Amsterdam, including allegations that they had been violent.
The match went ahead on November 6 without Maccabi Tel Aviv fans. But the decision did not close the issue. Instead, the ban became politically charged and remained under scrutiny for months.
Claims about Amsterdam began to unravel
The police account of Amsterdam was soon challenged. According to the BBC, police claimed that the Amsterdam football match involved “500-600 Maccabi fans [who] had targeted Muslim communities the night before the Amsterdam fixture, saying there had been ‘serious assaults including throwing random members of the public’ into a river. They also claimed that 5,000 officers were needed to deal with the unrest in Amsterdam, after previously saying that the figure was 1,200.”
Amsterdam police made clear that the West Midlands version of events overstated bad behavior by Maccabi fans. The BBC later obtained a letter from the Dutch inspector general confirming that the claims were inaccurate.
Those disputes were serious on their own. But one smaller detail became especially damaging to the credibility of the report: a claimed match between West Ham and Maccabi Tel Aviv.
The problem was simple. No such match occurred. That meant the report included a completely false example in a list of recent games with Maccabi Tel Aviv fans present.
From Google explanation to Copilot admission
As an inquiry developed, Craig Guildford, chief constable of West Midlands Police, appeared before Parliament in December 2025 and again in early January 2026. On both occasions, he denied that AI had been used.
In December, Guildford attributed the West Ham error to faulty “social media scraping.” In January, he gave a different explanation, saying the information came from a faulty Google search rather than an AI tool.
On January 6, he told Parliament: “We do not use AI,” and then explained: “On the West Ham side of things and how we gained that information, in producing the report, one of the officers would usually go to… a system, which football officers use all over the country, that has intelligence reports of previous games. They did not find any relevant information within the searches that they made for that. They basically Googled when the last time was. That is how the information came to be.”
That account did not hold. In a January 12 letter, Guildford acknowledged that the false West Ham detail had a different source: “I [recently] became aware that the erroneous result concerning the West Ham v Maccabi Tel Aviv match arose as result of a use of Microsoft Co Pilot.”
Guildford said he had not intended to mislead anyone. He added that “up until Friday afternoon, [I] understood that the West Ham match had only been identified through the use of Google.”
Why the AI error became a leadership crisis
The issue was not only that Microsoft Copilot produced an erroneous result. The deeper problem was that the error appeared in sensitive material connected to a security decision, and that the use of AI was denied before it was admitted.
Home Secretary Shabana Mahmood addressed the case in the House of Commons. She blamed the ban on “confirmation bias” by the police and said the Amsterdam stories used in the decision were “exaggerated or simply untrue.”
Mahmood also underlined the contradiction at the center of the controversy: Guildford had said “AI tools were not used to prepare intelligence reports,” yet the force was now saying that an “AI hallucination” was responsible for the false West Ham entry.
Her judgment was severe. She called the episode a “failure of leadership” and said Guildford “no longer has my confidence.”
Conservative Party leaders also called for Guildford to resign. MP Nick Timothy focused especially on the danger of using hallucination-prone AI tools for security decisions.
Writing on X, Timothy said: “More detail on the misuse of AI by the police,” adding: “They didn’t just deny it to the home affairs committee. They denied it in FOI requests. They said they have no AI policy. So officers are using a new, unreliable technology for sensitive purposes without training or rules.”
The lesson for AI in sensitive decisions
The facts in this case show a narrow but important chain of failure. A false match appeared in police material. The false match was later tied to Microsoft Copilot. Senior leadership first said AI was not used, then admitted that it was.
For public bodies, the practical issue is verification. AI systems can produce confident but wrong claims, and sensitive decisions require a clear record of where information came from, how it was checked, and who approved its use.
The controversy also shows why AI policy cannot be treated as an internal technical detail. When a tool influences policing, security, or public access to an event, the standards around training, documentation, and review become part of public accountability.
In this case, the most damaging error was not complex. It was a claimed football match that did not exist. But because it appeared in a high-stakes setting, and because the explanation changed under scrutiny, it became a much larger question about AI use, institutional trust, and leadership.