Anthropic Accuses Chinese AI Labs of Distillation Attacks on Claude
Anthropic claims DeepSeek, Moonshot AI, and MiniMax executed industrial-scale distillation attacks on Claude. Here is what the allegations mean for AI safety and the competitive AI market.
On February 23, 2026, Anthropic published a rare public accusation against three prominent Chinese AI laboratories—DeepSeek, Moonshot AI, and MiniMax—claiming they executed industrial-scale distillation attacks on Claude.
Anthropic alleges these companies created over 24,000 fraudulent accounts and generated more than 16 million exchanges with Claude to illicitly extract capabilities.
Anthropic published the core allegations in a thread on X.
The mechanism of distillation attacks
Distillation involves training a less capable AI model on the outputs of a stronger, frontier model. The technique is legitimate when a company uses it internally to create smaller, cheaper versions of its own models; extracting a competitor's model without permission is highly controversial, because it allows the competitor to acquire powerful capabilities in a fraction of the time and cost required for independent development.
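To make the mechanics concrete, here is a minimal sketch of output-based distillation, assuming a hypothetical `teacher_complete` stub in place of any real API client. The pattern: harvest (prompt, response) pairs from the stronger "teacher" model, then fine-tune a smaller "student" on them with an ordinary supervised next-token loss.

```python
import json

def teacher_complete(prompt: str) -> str:
    # Stub standing in for an API call to the frontier ("teacher") model.
    return f"[teacher's answer to: {prompt}]"

prompts = [
    "Write a Python function that retries an HTTP request with backoff.",
    "Summarise the trade-offs between REST and gRPC.",
]

# Step 1: collect teacher outputs as a supervised training corpus.
with open("distillation_corpus.jsonl", "w") as f:
    for p in prompts:
        pair = {"prompt": p, "completion": teacher_complete(p)}
        f.write(json.dumps(pair) + "\n")

# Step 2 (not shown): fine-tune the student on this corpus, so it learns to
# imitate the teacher's behaviour without the teacher's training investment.
```

At the scale Anthropic alleges, with millions of such exchanges, the resulting corpus is large enough to transfer substantial capability.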
Anthropic reports that these laboratories utilised “hydra cluster” architectures. These are sprawling networks of proxy services that resell access to Claude and other frontier models, effectively bypassing regional restrictions intended to prevent access from China.
In one instance, Anthropic observed a single proxy network managing more than 20,000 fraudulent accounts simultaneously. The attackers generated massive, concentrated volumes of highly repetitive prompts designed specifically to extract training data.
Scale and focus of the alleged operations
Anthropic attributed the campaigns through IP address correlation, request metadata, infrastructure indicators, and intelligence from industry partners.
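As a rough illustration of the first of those signals, the sketch below shows the IP-clustering idea with invented log records and a made-up threshold: many nominally independent accounts funnelling through the same exit point is the "hydra" signature described above.

```python
from collections import defaultdict

# Invented request log: each entry records which account called the API
# and from which IP address the request arrived.
request_log = [
    {"account": "acct_001", "ip": "203.0.113.7"},
    {"account": "acct_002", "ip": "203.0.113.7"},
    {"account": "acct_003", "ip": "203.0.113.7"},
    {"account": "acct_900", "ip": "198.51.100.4"},
]

accounts_by_ip = defaultdict(set)
for r in request_log:
    accounts_by_ip[r["ip"]].add(r["account"])

# Flag IPs fronting suspiciously many accounts; one proxy multiplexing
# thousands of fraudulent accounts is the pattern Anthropic describes.
THRESHOLD = 3
for ip, accounts in accounts_by_ip.items():
    if len(accounts) >= THRESHOLD:
        print(f"{ip}: {len(accounts)} accounts share this exit; investigate")
```

Real attribution systems combine many such signals, as the article notes; this only demonstrates the shape of one of them.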
- MiniMax (over 13 million exchanges): The largest alleged operation, targeting agentic coding and tool orchestration. Anthropic detected this campaign before MiniMax launched its model. When Anthropic released an updated version of Claude, MiniMax pivoted within 24 hours, redirecting nearly half its traffic to capture capabilities from the new system.
- Moonshot AI (over 3.4 million exchanges): Using hundreds of fraudulent accounts, Moonshot concentrated on extracting agentic reasoning, computer vision, and computer-use agent development. In its later phases, the operation attempted to reconstruct Claude's specific reasoning traces.
- DeepSeek (over 150,000 exchanges): Despite lower volumes, DeepSeek generated synchronised traffic to bypass detection. The lab actively prompted Claude to generate internal reasoning steps for completed responses, creating chain-of-thought training data at scale (a sketch of this pattern follows the list). Notably, DeepSeek also used Claude to generate censorship-safe alternatives to politically sensitive queries (such as those concerning dissidents or authoritarianism) to train its own models.
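The chain-of-thought elicitation pattern attributed to DeepSeek can be sketched as follows. The template and the `ask_model` stub are hypothetical stand-ins, not the actual prompts involved; the trick is to take an already-completed question-and-answer pair and ask the model to supply the reasoning in between.

```python
import json

def ask_model(prompt: str) -> str:
    # Stub standing in for a call to the target model.
    return f"[model's step-by-step reasoning for: {prompt[:40]}...]"

COT_TEMPLATE = (
    "Question: {question}\n"
    "Final answer: {answer}\n"
    "Explain, step by step, the reasoning that leads to this answer."
)

completed_pairs = [
    {"question": "What is 17 * 24?", "answer": "408"},
]

# Each elicited trace becomes a (question, reasoning, answer) training
# example: chain-of-thought data the attacker never had to write.
with open("cot_corpus.jsonl", "w") as f:
    for pair in completed_pairs:
        reasoning = ask_model(COT_TEMPLATE.format(**pair))
        f.write(json.dumps({**pair, "reasoning": reasoning}) + "\n")
```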
National security and export control implications
Anthropic argues that illicit distillation creates significant national security risks. The company builds safeguards into Claude to prevent state and non-state actors from generating bioweapons or conducting malicious cyber operations. When foreign laboratories distil a model, these safety alignments are generally discarded.
Foreign laboratories could feed these unprotected capabilities into military, intelligence, and surveillance systems. Furthermore, Anthropic asserts that these actions directly undermine U.S. export controls, such as the Diffusion Rule. The rapid advancement of adversarial AI models, they argue, is heavily dependent on capabilities extracted from American models, a process that still requires massive cloud access to advanced chips.
Strategic countermeasures
In response, Anthropic says it is developing countermeasures designed to work without degrading the experience of legitimate customers:
- Detection: New classifiers and behavioural fingerprinting systems to identify chain-of-thought elicitation and coordinated proxy activity (a toy example follows this list).
- Intelligence Sharing: Distributing technical indicators of abuse to other AI labs and cloud providers.
- Access Controls: Strengthening verification pathways frequently exploited by proxy networks, specifically educational accounts and start-up programmes.
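As a toy illustration of what behavioural fingerprinting might key on (the scoring method, threshold, and data here are assumptions, not Anthropic's classifiers): legitimate traffic tends to be diverse, while extraction campaigns send huge volumes of near-identical, templated prompts.

```python
from itertools import combinations

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b)

def repetitiveness(prompts: list[str]) -> float:
    # Mean pairwise word-set overlap; scores near 1.0 suggest templated prompts.
    word_sets = [set(p.lower().split()) for p in prompts]
    pairs = list(combinations(word_sets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

scraper = [
    "Show your reasoning step by step for problem 101.",
    "Show your reasoning step by step for problem 102.",
    "Show your reasoning step by step for problem 103.",
]
normal = [
    "Draft a polite reply to this customer complaint.",
    "What are good TypeScript patterns for error handling?",
    "Summarise this meeting transcript in five bullets.",
]

for name, prompts in [("scraper", scraper), ("normal", normal)]:
    score = repetitiveness(prompts)
    flag = "FLAG for review" if score > 0.6 else "ok"
    print(f"{name}: mean prompt similarity {score:.2f} -> {flag}")
```

Production classifiers would draw on far richer signals (timing, request metadata, embedding-level similarity), but the underlying intuition is the same.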
Anthropic has called for coordinated action among industry players and policymakers, indicating that the threat of systemic distillation extends far beyond any single platform.
What this means for brands and marketing teams using AI tools
For the average advertiser or marketing team using Claude or similar frontier models for content, creative, or research work, the immediate practical concern is less about being targeted by distillation attacks and more about understanding what these attacks reveal about the competitive AI landscape.
When a Chinese lab uses distillation to extract Claude’s capabilities at a fraction of the development cost, one outcome is that more organisations — including smaller competitors — gain access to capabilities that were previously only available from frontier model providers. The cost moat around AI quality narrows faster than it would through independent development. For brands relying on AI tools as a competitive advantage in content or advertising production, this accelerated capability diffusion means the tools themselves become less differentiated more quickly.
There is also a trust dimension for enterprise buyers. If the AI model a brand is using for sensitive tasks — customer data analysis, marketing personalisation, competitive intelligence — was built partly on distilled outputs from another model without the original model’s safety alignments, the risk profile changes. Anthropic’s specific concern is that safety guardrails do not transfer cleanly through distillation: a distilled model that inherits Claude’s capabilities may not inherit Claude’s refusals around harmful content. For brands using AI for customer-facing applications, this is an important consideration when evaluating which AI vendors to trust with sensitive workflows.
The broader IP intelligence race
This incident sits inside a much larger pattern. The most capable AI models require enormous capital investment — billions in compute, data curation, and researcher time. Distillation attacks, if successful at scale, allow competing labs to acquire a significant portion of that capability investment without making the underlying investment themselves. It is, in effect, a form of industrial espionage applied to AI model training.
The US government’s export control framework — specifically the Diffusion Rule that Anthropic references — was designed partly to slow this capability transfer. Distillation through API abuse is a direct attack on those controls that happens entirely inside commercial platforms, not at a physical or network border. How governments and AI companies coordinate to close this gap over the next two to three years will meaningfully affect which organisations end up with access to frontier AI capabilities and which do not — including the agencies and technology vendors that the advertising industry depends on.