Chinese spy bots behind Beijing's AI heist in the West

The Rise of Chinese AI and the Battle for Technological Supremacy

When DeepSeek, a Chinese chatbot, was launched in January last year, it sent shockwaves through global financial markets. Over $1tn (£740bn) was lost from US markets as investors feared that Beijing had caught up with Silicon Valley in the AI race. Despite being relatively unknown outside AI circles, DeepSeek appeared to have developed a system capable of rivalling industry giants like ChatGPT and Claude. This was particularly surprising given that the company lacked the billions in funding and massive data centre resources of its competitors.

However, concerns quickly arose about how DeepSeek managed to achieve such rapid progress. OpenAI, one of the leading US AI labs, claimed that DeepSeek had improperly trained its models by siphoning information from OpenAI's systems. Since then, OpenAI and Anthropic have reasserted their dominance with powerful new versions of their AI tools, ChatGPT and Claude.

Yet, these companies are also working to strengthen their defences against what they believe is an ongoing threat. They fear that China is using an army of AI spy bots to conduct an “industrial-scale” heist of US technology. In February, Sam Altman, CEO of OpenAI, wrote to the US Congress, alleging that DeepSeek was engaging in “ongoing efforts to free ride on the capabilities developed by OpenAI and other US frontier labs.”

The same month, Anthropic reported uncovering three Chinese labs (DeepSeek, Moonshot, and MiniMax) that were attempting to “illicitly extract Claude’s capabilities to improve their own models.” These activities are known as “distillation attacks,” in which a lower-cost AI tries to learn from a more expensive model by copying its responses to thousands of questions.

Last week, the issue even reached the White House, where Michael Kratsios, Donald Trump’s science and technology director, accused China of a “deliberate, industrial-scale campaign” to “systematically extract capabilities from American AI models.” While DeepSeek, Moonshot, and MiniMax did not respond to requests for comment, Chinese officials have denied the allegations, accusing the US of “unjustified suppression of Chinese companies.”

Jack Burnham, a China analyst at the Foundation for Defense of Democracies, claims that Chinese distillation attempts represent “industrial espionage on a vast scale.” He argues that this allows Beijing’s AI champions to train “facsimiles of Western products” at a fraction of the cost.

In a distillation attack, a hostile AI bot interacts with a high-end Silicon Valley AI tool, such as the latest version of ChatGPT or Claude. This target is often called the “teacher.” The attacker runs millions of queries against it and harvests the answers, using this data to train its own AI. More advanced versions can even extract the underlying “reasoning” from the model, feeding this information into a smaller “student” AI that mimics the capabilities of the original.
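
The query-harvest-train loop described above can be illustrated with a deliberately tiny sketch. Everything in it is hypothetical: the `teacher` here is a one-line formula standing in for an expensive model, not a real chatbot or API, and the “student” is a simple straight-line fit rather than a neural network. The point is only that the student never sees how the teacher works; it learns purely from the answers it collects.

```python
import random

def teacher(x: float) -> float:
    """Hypothetical stand-in for an expensive model. The 'attacker' below
    only ever sees its outputs, never this formula."""
    return 3.0 * x + 2.0

def harvest(n_queries: int, seed: int = 0) -> list[tuple[float, float]]:
    """Send many queries to the teacher and record (query, answer) pairs."""
    rng = random.Random(seed)
    queries = [rng.uniform(-10.0, 10.0) for _ in range(n_queries)]
    return [(q, teacher(q)) for q in queries]

def train_student(pairs: list[tuple[float, float]]) -> tuple[float, float]:
    """Fit a straight line to the harvested pairs by ordinary least squares."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept

def student(x: float, params: tuple[float, float]) -> float:
    """Answer a new query using only what was learned from harvested data."""
    slope, intercept = params
    return slope * x + intercept

# Harvest 1,000 answers, then train the student on them.
params = train_student(harvest(1000))
```

In this toy case the student ends up reproducing the teacher's answers on queries it never sent. Real systems are vastly more complex, but the shape of the process is the same as the one the labs describe: ask at scale, record the responses, train on them.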

While AI distillation has legitimate uses (major AI labs often use the technique to create cheaper versions of their most expensive products), it becomes problematic when done by rivals, who are seen as freeloading.

According to Anthropic, the Chinese labs created 24,000 fraudulent accounts that undertook 16 million different chats with Claude in an attempt to copy it. Google researchers also found a bot that asked 100,000 suspicious queries of its Gemini chatbot in an effort to clone its knowledge.

Chinese labs have successfully evaded the defences of American AI companies. Security researchers have identified attackers using network proxy services to mask the origins of their bots. These networks, known as “hydra clusters,” hide fake accounts within legitimate traffic.

China has long been accused of secretly copying America’s best inventions. In 2003, Huawei, the telecoms giant, was accused of lifting code “verbatim” from Cisco’s IT routers for its own products. A lawsuit between the companies was eventually settled.

Intellectual Property Theft and New Challenges

More recently, the US has arrested Chinese Silicon Valley workers on espionage charges. In January this year, a former Google engineer was found guilty of stealing hundreds of files of supercomputer plans and attempting to take them to China. Chinese hackers have also repeatedly engaged in cyber attacks intended to conduct economic espionage against the US.

However, unlike traditional cyber attacks, distillation heists are conducted through the very apps that tech giants make widely available via the web to their subscribers. “It is closer to intellectual property theft through the front door,” says Nash Borges, an AI expert at cyber security business Sophos.

While these attacks do not amount to a full data breach or outright theft of an AI’s code, they pose a significant challenge for Silicon Valley’s AI labs. If Chinese companies can train a “student” AI by leeching from US companies, it could erode the Americans' technological lead and undermine hundreds of billions of dollars invested in AI data centres.

The attacks “undercut the significant investments of American firms, who have poured billions into building out the infrastructure needed to run high-end AI models,” says Burnham. He adds that China has been seeking to use its own AI tools to bolster its military. Researchers have found that the Chinese People's Liberation Army has already been hunting developers to build AI tools based on products from DeepSeek.

Defending Against AI Espionage

AI distillation also helps China bypass its struggles in accessing powerful AI processors from the likes of Nvidia, allowing it to build functional replicas of American chatbots without needing as many chips. China has already bet on a strategy of making its AI bots cheap and plentiful, undercutting America’s premium apps.

Silicon Valley labs have been ramping up their defences to prevent more data harvesting. Anthropic has cut off Chinese access to its technology and built new cyber tools to detect unusual usage. OpenAI, Anthropic, and Google have joined an industry group to share information on potential AI distillation campaigns so they can be cut off.

Republican Congressmen in the US have proposed a law banning such data extraction and sanctioning companies caught doing it. However, not everyone is convinced this is necessary. Richard Windsor, a technology analyst, said in a note that the underhand AI tactic falls short of “breaking into” an AI lab and “stealing its source code.”

He argues that the companies involved can use technical safeguards to block these attacks if they breach their terms of service. Some AI critics have also suggested it is somewhat rich for AI labs to accuse their rivals of pilfering their secrets when they have already harvested vast amounts of copyrighted information from the web without permission.

Anthropic recently settled a $1.5bn lawsuit from authors over claims it had pirated a vast library of books from the internet to train its AI bots. “How dare they steal the stuff Anthropic stole from human coders,” Elon Musk said in a post on X.

With China playing catch-up in the superintelligence race, Silicon Valley will need to find a way to block Beijing’s AI spies before their chatbot clones leapfrog America’s labs.