Anthropic's fight over model distillation in China: What is confirmed, what is claimed, and what comes next
Anthropic says three Chinese AI labs used fraudulent access to Claude at scale to improve their own models. That is a serious allegation with technical, commercial, and policy consequences. Right now, the public evidence is mostly company-reported, so the key is to separate what is confirmed from what is still a claim.
This goes beyond routine platform abuse. If true, it is both a governance issue and a national security concern; even if parts are overstated, the allegations could still shape export-control policy.
This breakdown separates confirmed facts from unresolved claims, then closes with practical steps for platform and security teams.
TL;DR for busy readers
- Anthropic alleges three Chinese AI labs used fraudulent access to distill Claude capabilities.
- The most detailed evidence is company-reported, with Reuters providing external reporting context.
- The issue matters across security, competition, and export-control policy.
- Key unknowns remain, including independent forensic validation and legal outcomes.
- Teams should respond with practical abuse detection and precise, claims-safe communication.
What Anthropic says happened
In an official post, Anthropic says DeepSeek, Moonshot, and MiniMax ran industrial-scale extraction campaigns against Claude.
Key claims from Anthropic:
- More than 16 million Claude exchanges in total
- Roughly 24,000 fraudulent accounts
- Claimed attribution to specific labs with high confidence
- Focus on high-value capabilities: reasoning, tool use, coding
Anthropic describes this as illicit "distillation" at scale.
Important nuance: Anthropic also says distillation itself is a legitimate and common training technique when used on your own models. The dispute is about unauthorized extraction from a competitor and alleged terms violations.
Distillation volume at a glance (Anthropic claim data)
How to read this data:
- These numbers are from Anthropic's own reporting.
- Reuters reported the allegations, but did not independently audit technical telemetry.
- Treat these counts as credible claims under investigation, not adjudicated fact.
| Lab | Reported exchanges | Primary claimed targets |
|---|---|---|
| DeepSeek | >150,000 | Reasoning quality, reward-model style grading, policy-sensitive query rewrites |
| Moonshot | >3.4 million | Agentic reasoning, tool use, coding, data analysis, computer use, vision |
| MiniMax | >13 million | Agentic coding, orchestration, tool use |
Chart: Reported Claude exchanges by lab (Anthropic claims): DeepSeek 0.15M+, Moonshot 3.4M+, MiniMax 13M+. Source: Anthropic, "Detecting and preventing distillation attacks" (2026-02-23). Not independently audited.
How these campaigns allegedly worked
Anthropic's description points to a repeatable abuse pattern:
- Create or buy access through large batches of fraudulent accounts.
- Route requests through proxy infrastructure to reduce traceability.
- Send high-volume prompt sets focused on reasoning, coding, and tool use.
- Use outputs for post-training and reinforcement workflows.
- Pivot fast when a new model release exposes better capabilities.
Alleged campaign flow (based on Anthropic reporting): fraudulent account setup → proxy routing and masking → capability extraction prompts → distillation / RL post-training → rapid pivot to new model. Source basis: Anthropic, "Detecting and preventing distillation attacks". Diagram is an editorial simplification.
What is confirmed versus what is still a claim
Confirmed
- Anthropic published the allegations on its official site.
- Anthropic says it detected large-scale abusive patterns and tightened controls.
- Anthropic's restrictions update prohibits entities more than 50% owned, directly or indirectly, by companies headquartered in unsupported regions.
- Anthropic ties this to broader policy arguments around chip export controls.
- Reuters reported the allegations and said the named companies did not immediately respond.
Claimed by Anthropic, not independently verified in full public detail
- Exact scale numbers by lab
- Specific attribution methodology outcomes for each lab
- Degree of capability transfer achieved by each campaign
Unknown right now
- Any full third-party forensic validation
- Any legal findings or enforcement outcomes tied to these campaigns
- Any published technical rebuttal from the named labs
Why this matters beyond one company
There are three layers here.
- Competitive integrity: Frontier model outputs are expensive to produce. If a competitor can copy high-value behavior through unauthorized extraction, training timelines and cost structures change fast.
- Safety transfer risk: Anthropic argues that distilled models may not carry over safeguard behavior. That could matter if advanced capabilities spread with weaker abuse controls.
- Policy leverage: Anthropic uses this incident to support stricter export controls. That puts one incident class into a larger geopolitical debate about compute access and strategic advantage.
The policy and market tension
This is where the debate often splits.
Anthropic's policy case is straightforward: if large-scale illicit distillation is real, then export controls and access restrictions need stronger enforcement.
Critics raise a different concern: broad restrictions can become blunt tools. They may hurt legitimate research paths, reduce transparency, and further fragment global AI ecosystems.
Two realities can coexist here:
- Unauthorized extraction is a real risk.
- Overbroad policy responses can create new problems.
If you run a model or API platform: practical playbook
This incident is a strong template for stress-testing your own defenses.
Next 7 days checklist
- Define your top 3 abuse signatures for coordinated extraction.
- Set alert thresholds for high-volume, capability-focused prompt clusters.
- Run one red-team simulation for output-extraction behavior.
- Assign an incident owner and escalation path for suspected distillation campaigns.
- Draft one claims-safe incident communication template for legal and PR review.
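One checklist item above, alerting on high-volume, capability-focused prompt clusters, can be sketched as a simple per-account threshold rule. This is a minimal illustration, not a production detector: the category tags, volume floor, and share cutoff are all hypothetical values you would tune against your own traffic.

```python
from collections import Counter

# Illustrative tags for capability-extraction themes; your own prompt
# classifier would define these.
CAPABILITY_TAGS = {"reasoning", "coding", "tool_use"}

def flag_extraction_accounts(events, min_volume=1000, min_share=0.8):
    """Flag accounts whose traffic is both high-volume and concentrated
    on capability-extraction themes.

    events: iterable of (account_id, prompt_tag) pairs.
    Thresholds are illustrative assumptions.
    """
    totals = Counter()
    capability_hits = Counter()
    for account_id, tag in events:
        totals[account_id] += 1
        if tag in CAPABILITY_TAGS:
            capability_hits[account_id] += 1
    flagged = []
    for account_id, total in totals.items():
        share = capability_hits[account_id] / total
        if total >= min_volume and share >= min_share:
            flagged.append(account_id)
    return sorted(flagged)
```

A rule like this only catches concentration within a single account; the coordinated, multi-account case needs the fingerprint-style grouping discussed later in the playbook.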
Severity rubric (quick triage)
- Low: isolated suspicious prompts, low coordination signals.
- Medium: repeated patterns across multiple accounts, limited scale.
- High: clear multi-account coordination targeting model capabilities.
- Critical: industrial-scale abuse with rapid pivoting to new model releases.
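The rubric above maps naturally to a small triage function. The signal names and thresholds below are hypothetical placeholders, meant only to show how the four levels could be encoded; real cutoffs depend on your traffic baselines.

```python
def triage_severity(account_count, coordination_score, capability_focused, rapid_pivot):
    """Map abuse signals to the four rubric levels.

    account_count: suspicious accounts linked to the pattern.
    coordination_score: 0..1 estimate of multi-account coordination.
    capability_focused: prompts target model capabilities.
    rapid_pivot: pattern re-targeted a new model release quickly.
    All thresholds are illustrative assumptions.
    """
    if account_count > 100 and coordination_score >= 0.8 and rapid_pivot:
        return "critical"  # industrial scale with rapid pivoting
    if account_count > 10 and coordination_score >= 0.6 and capability_focused:
        return "high"      # clear multi-account capability targeting
    if account_count > 1 and coordination_score >= 0.3:
        return "medium"    # repeated patterns, limited scale
    return "low"           # isolated suspicious prompts
```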
KPIs to track
- Time-to-detect coordinated account behavior.
- False-positive rate for abuse classifiers.
- Suspicious traffic blocked pre-inference.
- Time-to-mitigate after first confirmed signal.
- Time-to-adapt controls after a major model release.
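Two of these KPIs reduce to simple timestamp arithmetic once you log the right events. A minimal sketch, assuming you record the first abusive request, the first alert, and the first confirmed mitigation (field names are hypothetical):

```python
from datetime import datetime, timedelta

def detection_kpis(first_abuse, first_alert, first_mitigation):
    """Return time-to-detect and time-to-mitigate as timedeltas."""
    return {
        "time_to_detect": first_alert - first_abuse,
        "time_to_mitigate": first_mitigation - first_alert,
    }

def false_positive_rate(false_positives, true_positives):
    """Share of abuse alerts that turned out to be benign."""
    total = false_positives + true_positives
    return false_positives / total if total else 0.0
```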
Baseline operating guidance
- Track coordinated behavioral fingerprints, not only account-level abuse.
- Watch for repeated capability-focused prompt clusters at scale.
- Harden account verification routes that attackers can industrialize.
- Share technical indicators with peers when abuse patterns cross providers.
- Be cautious with policy conclusions until independent validation is available.
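The first guidance item, tracking coordinated behavioral fingerprints rather than single accounts, can be sketched as pairwise template overlap: accounts that reuse many of the same normalized prompt templates get grouped even when each looks benign alone. The normalization and overlap threshold here are illustrative assumptions.

```python
from collections import defaultdict
import hashlib

def template_hash(prompt):
    """Hash a whitespace- and case-normalized prompt as a crude template id."""
    normalized = " ".join(prompt.lower().split())
    return hashlib.sha256(normalized.encode()).hexdigest()[:16]

def coordinated_pairs(account_prompts, min_shared=3):
    """Return account pairs sharing at least min_shared prompt templates.

    account_prompts: dict of account_id -> list of prompt strings.
    """
    by_hash = defaultdict(set)
    for account, prompts in account_prompts.items():
        for p in prompts:
            by_hash[template_hash(p)].add(account)
    pair_counts = defaultdict(int)
    for accounts in by_hash.values():
        ordered = sorted(accounts)
        for i in range(len(ordered)):
            for j in range(i + 1, len(ordered)):
                pair_counts[(ordered[i], ordered[j])] += 1
    return {pair for pair, n in pair_counts.items() if n >= min_shared}
```

In practice you would fold in infrastructure signals (shared proxies, payment methods, signup patterns) rather than prompt text alone, but the grouping idea is the same.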
If you are in policy or compliance: decision prompts
- What independent evidence threshold should trigger policy advocacy in your org?
- Which controls are reversible if applied too broadly?
- How will you separate fraud enforcement from broad geographic overreach?
- What transparency standard should accompany public attributions?
Open questions to watch
- Will independent third-party technical validation emerge?
- Will any legal or regulatory actions confirm or challenge key allegations?
- Will named labs issue detailed technical rebuttals?
- Will cross-provider intelligence sharing improve attribution confidence?
Confidence callout
- High confidence: Anthropic published these claims and policy framing in official posts.
- High confidence: Reuters reported the same allegations and noted that the named firms did not immediately respond at publication time.
- Medium confidence: Full operational scope and attribution precision, because the public evidence set is still mostly company-reported.
- Low confidence: Sweeping geopolitical conclusions from this event alone.
Bottom line
This story is bigger than one vendor conflict. It is a live test of how frontier labs defend model behavior, how attribution claims are evaluated, and how fast policy reacts before independent verification is complete.
The right takeaway is straightforward: take the allegations seriously, stay precise about evidence, and update your view as third-party validation emerges.
Sources
Primary and high-signal sources:
- Anthropic, "Detecting and preventing distillation attacks" (official): https://www.anthropic.com/news/detecting-and-preventing-distillation-attacks
- Anthropic, "Updating restrictions of sales to unsupported regions" (official): https://www.anthropic.com/news/updating-restrictions-of-sales-to-unsupported-regions
- Anthropic, "Supported countries and regions" (official): https://www.anthropic.com/supported-countries
- Anthropic, "A statement from Dario Amodei on Anthropic's commitment to American AI leadership" (official): https://www.anthropic.com/news/statement-dario-amodei-american-ai-leadership
- Anthropic, "AI Export Controls Framework Response" (official): https://www.anthropic.com/news/securing-america-s-compute-advantage-anthropic-s-position-on-the-diffusion-rule
- Reuters, Feb 23, 2026 coverage: https://www.reuters.com/world/china/chinese-companies-used-claude-improve-own-models-anthropic-says-2026-02-23/
- Reuters, Feb 12, 2026 OpenAI/DeepSeek memo context: https://www.reuters.com/world/china/openai-accuses-deepseek-distilling-us-models-gain-advantage-bloomberg-news-2026-02-12/