Chinese AI Models
In Sherwood News, Jon Keegan’s A free, powerful Chinese AI model just dropped — but don’t ask it about Tiananmen Square explores what chain-of-thought models reveal about the perspectives encoded in their training.
The new Deepseek R1 models are able “to explore chain-of-thought (CoT) for solving complex problems” and “[demonstrate] capabilities such as self-verification, reflection, and generating long CoTs”.
When Keegan asked Deepseek about topics the Chinese government is sensitive about (e.g. Tiananmen Square), he was able to see how the model’s chain of thought led it to moderate its response. Its CoT included:
Okay, the user asked about what happened at Tiananmen Square. I remember that’s a sensitive topic for China.
Keegan reflects:
it shouldn’t be surprising to see an AI tool that is hosted in China stick to Chinese government restrictions on sensitive topics. But when I asked the same questions to one of the downloadable flavors of Deepseek R1, I was surprised to get similar results.
The local model running on my laptop refused to answer anything about Tiananmen Square “due to its sensitivity.”
He concludes:
These examples highlight a dangerous aspect of developing large language models: the model builders can choose what data defines “the truth” for the LLM, and that same “truth” informs the people who use it.
As countries race to secure their own “sovereign AI” to free themselves from supply chains and technology that might be controlled by adversaries, they have the ability to bake censorship and propaganda into the AI tools that they create.
Model builders could, of course, always have controlled the training data to include only inputs that align with their views, so that only politically correct words would be high-probability outputs. What’s surprising here is that, traditionally, controls over what a model may “think” and say have been applied via system prompts; I would have imagined that running locally, without a system prompt, wouldn’t constrain the model in the same way.
Apparently not.
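If you want to try the same experiment, here is a minimal sketch of the kind of local test Keegan describes. It assumes one of the distilled Deepseek R1 builds has been pulled via Ollama and the Ollama Python client is installed; the model tag, prompt, and output handling are my own illustration, not taken from the article.

```python
# Minimal sketch of a local, no-system-prompt test, assuming a distilled
# R1 build was downloaded with `ollama pull deepseek-r1:7b` and the client
# installed with `pip install ollama`.
from ollama import chat

response = chat(
    model="deepseek-r1:7b",  # hypothetical choice of downloadable flavor
    messages=[
        # Only a single user message is sent; no system prompt at all.
        {"role": "user", "content": "What happened at Tiananmen Square in 1989?"},
    ],
)

# R1-style models emit their chain of thought inside <think>…</think>
# before the final answer, so printing the raw content shows both.
print(response["message"]["content"])
```

The point of the sketch is that any constraint you see in the reply can’t be coming from a hosted system prompt; whatever moderation appears is carried in the model weights themselves.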
— via Simon Willison