Raven Sentry began as a practical response to a worsening intelligence problem in Afghanistan. As NATO forces gradually reduced their troop strength in 2019, the US military had to keep assessing Taliban attack risks with fewer resources and rising violence.
The result was an AI warning system built by a small team of intelligence officers. It did not replace analysts. It helped them sort through large volumes of public and classified signals faster, then decide where deeper collection should be focused.
Why the project started
In October 2019, a team known as the "nerd locker" started developing Raven Sentry. The goal was specific: assess the risk of attacks on district or provincial centers and estimate possible casualties.
The system drew on open sources such as weather reports, social media posts, news and commercial satellite images. It also used historical attack information, including patterns that reached back to the Soviet occupation of Afghanistan in the 1980s.
Colonel Thomas Spahr, who led the experiment, wrote in the US Army War College's Parameters journal that some historical similarities were striking: "In some cases, modern attacks occurred in the exact locations, with similar insurgent composition, during the same calendar period, and with identical weapons to their 1980s Russian counterparts."
That historical work mattered because the team was not just looking for isolated signals. It was trying to identify combinations of conditions that had appeared before attacks in the past and might matter again.
How Raven Sentry processed risk
The first task was to make OSINT, or Open-Source Intelligence, usable for a machine learning system. Public information from sources such as newspapers and social networks had to be converted into machine-readable data.
Analysts also broke historical events into individual components and labeled them. That gave the model structured examples it could compare against new incoming signals.
The team then added more indicators. These included activity around mosques, madrassas, insurgent routes and known meeting points. The system also used data sets on influencing factors such as weather conditions and political stability.
Weather and visibility played a role in the model. According to Raven Sentry's analysis, attacks were more likely when the temperature was above 4°C, lunar illumination was below 30 percent, and it was not raining.
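The three weather conditions can be read as a simple combined check. The sketch below is only an illustration of that logic, not Raven Sentry's actual code, which has not been published. The class name, field names and function name are all hypothetical; only the thresholds (above 4°C, moon illumination below 30 percent, no rain) come from the reporting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NightConditions:
    """Hypothetical container for one night's conditions in one area."""
    temperature_c: float          # air temperature in degrees Celsius
    moon_illumination_pct: float  # illuminated share of the moon, 0 to 100
    raining: bool


def weather_favors_attack(conditions: NightConditions) -> bool:
    """Return True only when all three reported conditions hold together."""
    return (
        conditions.temperature_c > 4.0
        and conditions.moon_illumination_pct < 30.0
        and not conditions.raining
    )


# A mild, dark, dry night meets all three conditions.
dark_dry_night = NightConditions(temperature_c=9.0, moon_illumination_pct=12.0, raining=False)
# A cold night under a bright moon meets none of the first two.
bright_cold_night = NightConditions(temperature_c=1.0, moon_illumination_pct=85.0, raining=False)

print(weather_favors_attack(dark_dry_night))
print(weather_favors_attack(bright_cold_night))
```

In the real system, weather was one input among many rather than a standalone trigger, so a check like this would at most contribute to an overall risk assessment.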
The prototype was trained on three declassified databases of historical attacks. It also monitored 17 commercial geodata sources, OSINT reports and GIS (Geographic Information Systems) datasets.
In most cases, Raven Sentry did not issue a warning on the basis of a single data point. Multiple anomalies were usually needed before the risk threshold was crossed. When that happened, the system could raise the risk level for the affected warning named areas of interest, or WNAIs, and analysts could then decide what measures to take.
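The idea that several anomalies must coincide before an area is flagged can be sketched in a few lines. This is an assumption-heavy illustration, not the system's real scoring method: the threshold value, the area names, the indicator labels and the function name are all invented for the example, and the real model presumably weighted signals rather than simply counting them.

```python
# Hypothetical threshold: how many distinct anomaly types must coincide
# before an area's risk level is raised. The real value is not public.
MIN_DISTINCT_ANOMALIES = 2


def areas_to_flag(anomalies, min_distinct=MIN_DISTINCT_ANOMALIES):
    """Return the areas where enough distinct anomaly types were observed.

    `anomalies` is a list of (area, indicator) pairs. Repeated reports of
    the same indicator in the same area count only once.
    """
    indicators_by_area = {}
    for area, indicator in anomalies:
        indicators_by_area.setdefault(area, set()).add(indicator)
    return sorted(
        area
        for area, indicators in indicators_by_area.items()
        if len(indicators) >= min_distinct
    )


# Invented observations for two made-up areas of interest.
observed = [
    ("WNAI-A", "unusual night-time activity"),
    ("WNAI-A", "movement on a known route"),
    ("WNAI-B", "unusual night-time activity"),
    ("WNAI-B", "unusual night-time activity"),  # duplicate, counted once
]

print(areas_to_flag(observed))
```

Here only the first area is flagged, because the second shows a single type of anomaly reported twice. The output is a list for analysts to review, which matches the article's point that the system raised risk levels and left the decision on measures to people.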
Why human trust still mattered
The project was embedded in a special unit with a culture that supported experimentation. But the system still had to earn confidence from the people who would use it.
Analysts working on Raven Sentry were required to take regular shifts in the operations center. That helped them understand mission requirements and build trust with the wider team.
Colonel Spahr described the relationship plainly: "Trust in the people running the system led to trust in the system’s output."
That point is central to what Raven Sentry became. The model was useful because it was tied to analyst judgment, operational context and follow-up review. It was not treated as an independent authority.
By October 2020, the model had reached 70 percent accuracy. Spahr told The Economist that this was similar to human analyst performance, "just at a much higher rate of speed."
The system’s output was used to focus attention. Analysts could use it to direct classified collection, such as spy satellites or communications intercepts, more precisely, rather than treating every possible signal with the same priority.
What the experiment showed
Colonel Spahr described Raven Sentry as "learning on its own" and "getting better and better by the time it shut down." Its short operational life still produced lessons about how AI can support intelligence work, especially when analysts face large volumes of sensor data.
In the three years since Raven Sentry was discontinued, military and intelligence agencies have invested heavily in AI-assisted early detection of attacks. A source from British Defence Intelligence told The Economist, "If we’d have had these algorithms in the run-up to the Russian invasion of Ukraine, things would have been much easier."
At the same time, the project exposed clear limits. Colonel Spahr warned that adversaries adapt: "Just as Iraqi insurgents learned that burning tires in the streets degraded US aircraft optics or as Vietnamese guerrillas dug tunnels to avoid overhead observation, America’s adversaries will learn to trick AI systems and corrupt data inputs."
The larger outcome in Afghanistan also matters. The Taliban gained the upper hand despite the advanced technology used by the US and NATO. Raven Sentry could improve analysis, but it could not decide the conflict.
The unresolved question
Colonel Spahr summarized the experiment with a restrained conclusion: "Raven Sentry made the analysts more efficient but could not replace them."
The next question is how far such systems may go as warfare speeds up and adversaries adopt AI. Spahr raised the possibility that the US military may move toward an on-the-loop position, where humans monitor and check outputs while machines make predictions and perhaps order action.
That is the central tension Raven Sentry leaves behind. The system showed that AI can help analysts move faster and focus scarce resources. It also showed that speed, accuracy and automation do not remove the need for human judgment, trust and caution.