MIT Tech Review AI April 15, 2025 TERMINATOR

Why military AI is moving closer to command decisions

The Pentagon is expanding its use of generative AI beyond older computer vision systems and into surveillance analysis. The shift raises unresolved questions about human oversight, classification and how far AI should move up the military decision chain.

The story centers on generative AI moving into military surveillance, intelligence analysis and decision support near the kill chain, with unresolved oversight risks.

The US military’s use of artificial intelligence is entering a new stage. After years of using older AI tools such as computer vision to analyze drone imagery, the Pentagon is now pushing generative AI into work that includes surveillance, intelligence analysis and decision support.

The change is not theoretical. Two US Marines deployed in the Pacific last year used a chatbot-style generative AI interface to search intelligence while conducting training exercises from South Korea to the Philippines. Their role was to analyze surveillance and warn superiors about possible threats to the unit.

A New Phase For Military AI

The earlier phase of the Pentagon’s AI effort began in 2017, when systems such as Project Maven focused on computer vision for drone footage and target identification. The current phase is different because it relies on large language models that users engage through conversational interfaces similar to ChatGPT.

This shift began under the Biden administration, but the source article describes fresh urgency from Elon Musk’s DOGE and Secretary of Defense Pete Hegseth, who are pushing for AI-fueled efficiency. That urgency matters because generative AI is not being discussed only as an administrative tool. It is being tested in areas where errors can carry high geopolitical stakes.

The central tension is clear: supporters see AI as a way to improve accuracy and reduce civilian deaths, while many human rights groups argue that it could have the opposite effect. The debate becomes sharper as AI moves closer to the so-called kill chain, where systems may not only analyze military data but also suggest actions, including generating lists of targets.

The Human Review Problem

Defense-tech companies often point to the idea of a human in the loop. In simple terms, the AI performs certain tasks, while a person reviews the output before action is taken. The phrase is meant to signal control, accountability and a safeguard against both deadly mistakes and more routine failures.

But the safeguard depends on whether humans can realistically understand and check the system’s work. Heidy Khlaaf, chief AI scientist at the AI Now Institute and a former leader of safety audits for AI-powered systems, warns that review may be much harder than the phrase suggests.

"'Human in the loop' is not always a meaningful mitigation," she says.

Her concern is that an AI model may draw on thousands of data points to reach an output. If that happens, the human reviewer may be asked to verify a conclusion without being able to inspect the underlying reasoning in any practical way. As AI systems depend on more data, that review burden grows.

This creates a difficult question for military AI: if a human is formally responsible for checking the output but cannot meaningfully evaluate the data trail behind it, how strong is the safeguard? The source does not answer that question, but it makes clear that the issue remains open.

Classification Gets More Complicated

Generative AI also challenges older assumptions about classified information. In the Cold War era of US military intelligence, information could be gathered through covert means, written into reports by experts in Washington and stamped Top Secret, with access limited to people with proper clearances.

Big data already disrupted that model. Generative AI adds another layer because it can analyze large volumes of material and produce new summaries or conclusions. One specific problem is classification by compilation.

The idea is straightforward. Many unclassified documents may each contain separate details about a military system. Taken alone, those details may not reveal protected information. Combined, they could expose something that would otherwise be classified.

For years, it was reasonable to assume that no person would connect every relevant detail. Large language models are built for exactly that kind of pattern-finding. That makes it harder to decide whether a single document, a group of documents or an AI-generated analysis should be classified.
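The compilation problem can be illustrated with a toy sketch. Everything in it is hypothetical: the document names, the detail labels and the rule that the three details are sensitive only in combination are invented for illustration and do not describe any real classification procedure or tool.

```python
# Hypothetical illustration only: the documents, detail labels and the
# "protected combination" rule are invented for this sketch.
from itertools import combinations

# Each unclassified document reveals a few separate details about a system.
documents = {
    "press_release": {"range_km"},
    "budget_filing": {"unit_count"},
    "conference_talk": {"sensor_type"},
}

# Assumed rule: the details are sensitive only when all appear together.
PROTECTED_COMBINATION = {"range_km", "unit_count", "sensor_type"}


def reveals_protected(details):
    """Return True if the given details cover the protected combination."""
    return PROTECTED_COMBINATION <= set(details)


def smallest_revealing_sets(docs):
    """Find the smallest groups of documents whose combined details are sensitive."""
    names = sorted(docs)
    for size in range(1, len(names) + 1):
        hits = [
            group
            for group in combinations(names, size)
            if reveals_protected(set().union(*(docs[n] for n in group)))
        ]
        if hits:
            return hits
    return []


if __name__ == "__main__":
    for name, details in documents.items():
        print(name, "alone:", reveals_protected(details))
    print("revealing groups:", smallest_revealing_sets(documents))
```

In this sketch no single document trips the rule, but the three together do. A brute-force search like this also grows quickly with the number of documents, which is one way to see why the assumption that nobody would connect every detail once seemed safe.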

Chris Mouton, a senior engineer for RAND who recently tested how well suited generative AI is for intelligence and analysis, described the unresolved nature of the problem.

"I don’t think anyone’s come up with great answers for what the appropriate classification of all these products should be," he says.

Underclassifying information creates a security risk for the US, while overclassifying has drawn criticism from lawmakers. The defense giant Palantir is positioning itself to help by offering AI tools that assess whether a piece of data should be classified. It is also working with Microsoft on AI models that would train on classified data.

How Far Should AI Move Up The Chain?

The Pentagon’s adoption of AI has, in some ways, followed the pattern seen in consumer technology. When phone apps became better at recognizing people in photos, the military launched Project Maven to analyze drone footage. Now that large language models are entering work and personal life through interfaces such as ChatGPT, the Pentagon is tapping related models to analyze surveillance.

The next consumer trend is agentic AI, meaning models that can converse, analyze information and perform actions on a user’s behalf. Another trend is personalized AI, where models learn from private data to become more useful.

The source article says signs point to military AI following this trajectory too. A report published in March from Georgetown’s Center for Security and Emerging Technology found a surge in military adoption of AI to assist in decision-making. Its authors wrote that military commanders are interested in AI’s potential to improve decision-making, especially at the operational level of war.

In October 2024, the Biden administration released its national security memorandum on AI, which provided some safeguards for these scenarios. The memo has not been formally repealed by the Trump administration, but President Trump has indicated that keeping US AI competitive requires more innovation and less oversight.

That leaves a practical question at the center of the debate: how high up the military decision chain should generative AI go? The answer will shape whether these systems remain tools for analysis or become deeper participants in high-stakes, time-sensitive decisions.

The Questions That Still Matter

The current phase of military AI is not just about adopting a new software interface. It is about placing generative AI inside surveillance workflows, intelligence analysis and the broader structure of military decision-making.

Three questions now define the issue:

Can human reviewers meaningfully check AI systems that rely on thousands of data points?
How should officials classify AI-generated analysis built from growing volumes of data?
Where should the boundary sit between AI assistance and command-level judgment?

The source does not present final answers. It shows a military AI push that is already underway, with generative systems moving from experimental tools toward more consequential roles. The open questions are not side issues. They are the conditions that will determine whether this new phase strengthens decision-making or adds new risks at the worst possible moments.