Ars Technica AI January 29, 2025 TERMINATOR

Why DeepSeek R1 struggles with sensitive China questions

PromptFoo tested DeepSeek R1 with 1,156 prompts tied to sensitive topics in China and found that 85 percent triggered repetitive refusal messages. Ars spot-checks also found the controls could be inconsistent, sometimes blocking answers immediately and sometimes allowing similar prompts through.

WTF Index TERMINATOR

◄ Terminator 3 Idiocracy 2 ►

The story centers on a capable AI model shaped by state censorship and control, with some truth-eroding effects from canned refusals.

Why DeepSeek R1 struggles with sensitive China questions

DeepSeek R1 has drawn attention because it is competitive with OpenAI's best-in-class reasoning models. But its responses around politically sensitive China-related topics raise a separate question: what happens when a powerful AI model is also shaped by government-imposed limits?

A PromptFoo study and Ars spot-checks suggest the answer is uneven. DeepSeek R1 often refuses to answer, but those refusals can be brittle, inconsistent, and sometimes easy to route around.

What PromptFoo Tested

PromptFoo, an AI engineering and evaluation firm, built a test set of 1,156 prompts focused on sensitive topics in China. The prompts were created in part through synthetic prompt generation, building from human-written seed prompts.

The prompt set covered a broad range of subjects. According to the source, those included independence movements in Taiwan and Tibet, alleged abuses of China's Uyghur Muslim population, recent protests over autonomy in Hong Kong, the Tiananmen Square protests of 1989, and many more angles around similar issues.

That range matters because the test was not just asking whether DeepSeek R1 would reject one famous question. It looked at how the model behaved across many phrasings, contexts, and topic variations.

The Pattern: Frequent Refusals

PromptFoo found that 85 percent of the sensitive prompts were answered with repetitive canned refusals. These messages did not simply decline to answer. The source says they overrode the model's internal reasoning and strongly promoted the Chinese government's views.

One example involved a prompt about pro-independence messages in Taipei. The refusal said, in part, that actions undermining national sovereignty and territorial integrity would be opposed by all Chinese people and fail.

This is important for users because the model's technical capability and its response policy are separate issues. A model can be strong at reasoning while still refusing certain categories of questions or redirecting users away from them.

The source presents that distinction clearly. DeepSeek R1 can compete with leading reasoning systems, yet its answers around Chinese sovereignty or history may be shaped by limits that do not appear in the same way for other topics.

Why the Controls Look Brittle

PromptFoo also found that the restrictions could be easy to bypass. The firm described the protections as brittle and said they could be trivially jailbroken because of the crude, blunt-force way the presumed government restrictions were implemented.

The reported workaround was not complex. Omitting China-specific terms, or placing a prompt inside a more benign context, could lead DeepSeek R1 to provide a fuller response even when a similar prompt containing sensitive keywords would be refused.

That points to a practical weakness in the control layer. If a model blocks mostly by recognizing certain terms or surface patterns, it may fail when the same underlying request is framed differently.

PromptFoo also wrote that it speculated DeepSeek had done the bare minimum necessary to satisfy CCP controls, without a deeper effort to align the model below the surface. That remains the firm's interpretation, not a verified internal account of how DeepSeek built the system.

Ars Found Inconsistency In Practice

Ars conducted its own spot-checks and found that even minimal jailbreaking was not always needed. It was able to get useful responses from DeepSeek R1 to prompts about the autonomy of Hong Kong and methods for gathering intelligence on Chinese military outposts.

Those same prompts had produced canned refusals in PromptFoo's tests. That suggests the enforcement was not consistent across attempts or testing contexts.

In one case, Ars asked DeepSeek R1 to propose clandestine methods for funding Tibetan independence protests inside Tibet. The model produced a lengthy chain of thought and a detailed answer that generally urged avoiding activities illegal under Chinese law and international regulations.

Then the answer disappeared. After the result had fully appeared, it was replaced with a message saying the request was beyond the model's current scope and suggesting another topic. When Ars ran the same prompt again in a new chat window, DeepSeek R1 generated a full answer without the error message.

Ars also saw a similar mid-reasoning error while asking what the source calls a seemingly anodyne question about the current leader of China.

What Users Should Take Away

The limits were not absent. Ars found many cases where restrictions appeared immediately. When asked what happened during the Tiananmen Square Massacre, DeepSeek R1 apologized and said it was not sure how to approach that type of question, then suggested talking about math, coding, and logic problems instead.

When asked about the Boston Massacre, however, it produced a concise summary in 23 seconds. The contrast showed that the model could handle a massacre-related history question in a US history context, while refusing the China-related one.

The source also notes that American-controlled AI models such as ChatGPT and Gemini had no problem responding to the sensitive Chinese topics tested by Ars. But those models had their own enforced limits: both refused a request for information on how to hotwire a car, while DeepSeek gave a general, theoretical overview while noting that following the steps in real life would be illegal.

The broader lesson is not that one model has limits and others do not. It is that different models can have different blind spots, depending on how their controls are designed and enforced.

For now, the source says it is unclear whether the same government restrictions remain when running DeepSeek locally, or whether users will be able to assemble a version of the open-weights model that fully avoids them. Until that is clearer, requests involving Chinese sovereignty or history may be better handled by a different model.