Anthropic's fight over model distillation in China: What is confirmed, what is claimed, and what comes next


Anthropic says three Chinese AI labs used fraudulent access to Claude at scale to improve their own models. That is a serious allegation with technical, commercial, and policy consequences. Right now, the public evidence is mostly company-reported, so the key is to separate what is confirmed from what is still a claim.

This goes beyond routine platform abuse. If true, the allegations raise both a governance issue and a national security concern. Even if parts are overstated, they could still shape export-control policy.

This breakdown separates confirmed facts from unresolved claims, then closes with practical steps for platform and security teams.

TL;DR for busy readers

  • Anthropic alleges three Chinese AI labs used fraudulent access to distill Claude capabilities.
  • The most detailed evidence is company-reported, with Reuters providing external reporting context.
  • The issue matters across security, competition, and export-control policy.
  • Key unknowns remain, including independent forensic validation and legal outcomes.
  • Teams should respond with practical abuse detection and precise, claims-safe communication.

What Anthropic says happened

In an official post, Anthropic says DeepSeek, Moonshot, and MiniMax ran industrial-scale extraction campaigns against Claude.

Key claims from Anthropic:

  • More than 16 million Claude exchanges in total
  • Roughly 24,000 fraudulent accounts
  • Claimed attribution to specific labs with high confidence
  • Focus on high-value capabilities: reasoning, tool use, coding

Anthropic describes this as illicit "distillation" at scale.

Important nuance: Anthropic also says distillation itself is a legitimate and common training technique when used on your own models. The dispute is about unauthorized extraction from a competitor and alleged terms violations.

Distillation volume at a glance (Anthropic claim data)

How to read this data:

  • These numbers are from Anthropic's own reporting.
  • Reuters reported the allegations, but did not independently audit technical telemetry.
  • Treat these counts as credible claims under investigation, not adjudicated fact.
| Lab | Reported exchanges | Primary claimed targets |
| --- | --- | --- |
| DeepSeek | >150,000 | Reasoning quality, reward-model-style grading, policy-sensitive query rewrites |
| Moonshot | >3.4 million | Agentic reasoning, tool use, coding, data analysis, computer use, vision |
| MiniMax | >13 million | Agentic coding, orchestration, tool use |

[Chart] Reported Claude exchanges by lab (Anthropic claims): DeepSeek 0.15M+, Moonshot 3.4M+, MiniMax 13M+. Source: Anthropic, "Detecting and preventing distillation attacks" (2026-02-23). Not independently audited.

How these campaigns allegedly worked

Anthropic's description points to a repeatable abuse pattern:

  1. Create or buy access through large batches of fraudulent accounts.
  2. Route requests through proxy infrastructure to reduce traceability.
  3. Send high-volume prompt sets focused on reasoning, coding, and tool use.
  4. Use outputs for post-training and reinforcement workflows.
  5. Pivot fast when a new model release exposes better capabilities.

[Diagram] Alleged campaign flow (based on Anthropic reporting): fraudulent account setup → proxy routing and masking → capability-extraction prompts → distillation / RL post-training → rapid pivot to new model. Source basis: Anthropic, "Detecting and preventing distillation attacks". Diagram is an editorial simplification.

What is confirmed versus what is still a claim

Confirmed

  • Anthropic published the allegations on its official site.
  • Anthropic says it detected large-scale abusive patterns and tightened controls.
  • Anthropic's restrictions update says entities more than 50% owned, directly or indirectly, by companies headquartered in unsupported regions are prohibited.
  • Anthropic ties this to broader policy arguments around chip export controls.
  • Reuters reported the allegations and said the named companies did not immediately respond.

Claimed by Anthropic, not independently verified in full public detail

  • Exact scale numbers by lab
  • Specific attribution methodology outcomes for each lab
  • Degree of capability transfer achieved by each campaign

Unknown right now

  • Any full third-party forensic validation
  • Any legal findings or enforcement outcomes tied to these campaigns
  • Any published technical rebuttal from the named labs

Why this matters beyond one company

There are three layers here.

  1. Competitive integrity. Frontier model outputs are expensive to produce. If a competitor can copy high-value behavior through unauthorized extraction, training timelines and cost structures change fast.
  2. Safety transfer risk. Anthropic argues that distilled models may not carry over safeguard behavior. That could matter if advanced capabilities spread with weaker abuse controls.
  3. Policy leverage. Anthropic uses this incident to support stricter export controls. That puts one incident class into a larger geopolitical debate about compute access and strategic advantage.

The policy and market tension

This is where the debate often splits.

Anthropic's policy case is straightforward: if large-scale illicit distillation is real, then export controls and access restrictions need stronger enforcement.

Critics raise a different concern: broad restrictions can become blunt tools. They may hurt legitimate research paths, reduce transparency, and further fragment global AI ecosystems.

Two realities can coexist here:

  • Unauthorized extraction is a real risk.
  • Overbroad policy responses can create new problems.

If you run a model or API platform: practical playbook

This incident is a useful stress-test template.

Next 7 days checklist

  1. Define your top 3 abuse signatures for coordinated extraction.
  2. Set alert thresholds for high-volume, capability-focused prompt clusters.
  3. Run one red-team simulation for output-extraction behavior.
  4. Assign an incident owner and escalation path for suspected distillation campaigns.
  5. Draft one claims-safe incident communication template for legal and PR review.
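As a concrete starting point for items 1 and 2, here is a minimal sketch of a volume-plus-focus filter over request logs. The marker phrases, thresholds, and the `(account_id, prompt_text)` event shape are all illustrative assumptions, not a production detector.

```python
from collections import Counter

# Hypothetical marker phrases for capability-probing prompts (illustrative only).
CAPABILITY_MARKERS = ("step by step", "write a function", "use the tool", "chain of thought")

def flag_extraction_candidates(events, volume_threshold=500, focus_ratio=0.6):
    """Flag accounts with high volume AND a high share of capability-focused prompts.

    events: iterable of (account_id, prompt_text) tuples.
    Thresholds are placeholder assumptions to be tuned against real traffic.
    """
    totals, hits = Counter(), Counter()
    for account, prompt in events:
        totals[account] += 1
        if any(marker in prompt.lower() for marker in CAPABILITY_MARKERS):
            hits[account] += 1
    return [
        acct for acct, n in totals.items()
        if n >= volume_threshold and hits[acct] / n >= focus_ratio
    ]
```

Requiring both volume and focus keeps ordinary heavy users (high volume, mixed prompts) out of the alert queue.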

Severity rubric (quick triage)

  • Low: isolated suspicious prompts, low coordination signals.
  • Medium: repeated patterns across multiple accounts, limited scale.
  • High: clear multi-account coordination targeting model capabilities.
  • Critical: industrial-scale abuse with rapid pivoting to new model releases.
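The rubric above can be encoded as a simple triage function so on-call responders apply it consistently. Every numeric threshold below is an illustrative assumption; the article defines no cutoffs.

```python
def triage_severity(account_count, coordination_score, targets_capabilities, rapid_pivot):
    """Map observed signals onto the four-level rubric.

    coordination_score: assumed 0.0-1.0 signal of multi-account coordination.
    All thresholds are placeholder assumptions for illustration.
    """
    # Critical: industrial-scale abuse that pivots quickly to new releases.
    if targets_capabilities and rapid_pivot and account_count >= 1000:
        return "critical"
    # High: clear multi-account coordination targeting model capabilities.
    if targets_capabilities and account_count > 1 and coordination_score >= 0.7:
        return "high"
    # Medium: repeated patterns across multiple accounts, limited scale.
    if account_count > 1 and coordination_score >= 0.3:
        return "medium"
    # Low: isolated suspicious prompts, weak coordination signals.
    return "low"
```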

KPIs to track

  • Time-to-detect coordinated account behavior.
  • False-positive rate for abuse classifiers.
  • Suspicious traffic blocked pre-inference.
  • Time-to-mitigate after first confirmed signal.
  • Time-to-adapt controls after a major model release.
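Two of these KPIs can be computed directly from flagged-request records. A minimal sketch follows; the record fields (`flagged`, `confirmed`, `blocked_pre_inference`) are hypothetical names, not any real schema.

```python
def abuse_kpis(records):
    """Compute false-positive rate and pre-inference block rate.

    records: list of dicts with assumed boolean fields
    'flagged', 'confirmed', 'blocked_pre_inference'.
    """
    flagged = [r for r in records if r["flagged"]]
    confirmed = [r for r in flagged if r["confirmed"]]
    # Share of flags that were not confirmed as abuse.
    fp_rate = 1 - len(confirmed) / len(flagged) if flagged else 0.0
    # Share of confirmed abuse stopped before any model output was produced.
    blocked = sum(1 for r in confirmed if r["blocked_pre_inference"])
    block_rate = blocked / len(confirmed) if confirmed else 0.0
    return {"false_positive_rate": fp_rate, "pre_inference_block_rate": block_rate}
```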

Baseline operating guidance

  1. Track coordinated behavioral fingerprints, not only account-level abuse.
  2. Watch for repeated capability-focused prompt clusters at scale.
  3. Harden account verification routes that attackers can industrialize.
  4. Share technical indicators with peers when abuse patterns cross providers.
  5. Be cautious with policy conclusions until independent validation is available.
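Point 1 above, coordinated behavioral fingerprints, can be sketched by hashing normalized prompts and surfacing fingerprints that recur across many distinct accounts. The normalization, digest truncation, and account threshold are illustrative assumptions.

```python
import hashlib
from collections import defaultdict

def shared_fingerprints(events, min_accounts=5):
    """Return prompt fingerprints seen across at least min_accounts accounts.

    events: iterable of (account_id, prompt_text) tuples.
    A fingerprint shared by many accounts suggests coordination, since
    independent users rarely send byte-identical normalized prompts.
    """
    seen = defaultdict(set)
    for account, prompt in events:
        # Case-fold and collapse whitespace so trivial variants collide.
        normalized = " ".join(prompt.lower().split())
        digest = hashlib.sha256(normalized.encode()).hexdigest()[:16]
        seen[digest].add(account)
    return {d: sorted(accts) for d, accts in seen.items() if len(accts) >= min_accounts}
```

Exact-match hashing is deliberately cheap; a real system would likely add fuzzier similarity measures on top.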

If you are in policy or compliance: decision prompts

  • What independent evidence threshold should trigger policy advocacy in your org?
  • Which controls are reversible if applied too broadly?
  • How will you separate fraud enforcement from broad geographic overreach?
  • What transparency standard should accompany public attributions?

Open questions to watch

  • Will independent third-party technical validation emerge?
  • Will any legal or regulatory actions confirm or challenge key allegations?
  • Will named labs issue detailed technical rebuttals?
  • Will cross-provider intelligence sharing improve attribution confidence?

Confidence callout

  • High confidence: Anthropic published these claims and policy framing in official posts.
  • High confidence: Reuters reported the same allegations and the lack of an immediate response from the named firms at publication time.
  • Medium confidence: Full operational scope and attribution precision, because the public evidence set is still mostly company-reported.
  • Low confidence: Sweeping geopolitical conclusions from this event alone.

Bottom line

This story is bigger than one vendor conflict. It is a live test of how frontier labs defend model behavior, how attribution claims are evaluated, and how fast policy reacts before independent verification is complete.

The right takeaway is straightforward: take the allegations seriously, stay precise about evidence, and update your view as third-party validation emerges.

Sources

Primary and high-signal sources:

  1. Anthropic, "Detecting and preventing distillation attacks" (official): https://www.anthropic.com/news/detecting-and-preventing-distillation-attacks
  2. Anthropic, "Updating restrictions of sales to unsupported regions" (official): https://www.anthropic.com/news/updating-restrictions-of-sales-to-unsupported-regions
  3. Anthropic, "Supported countries and regions" (official): https://www.anthropic.com/supported-countries
  4. Anthropic, "A statement from Dario Amodei on Anthropic's commitment to American AI leadership" (official): https://www.anthropic.com/news/statement-dario-amodei-american-ai-leadership
  5. Anthropic, "AI Export Controls Framework Response" (official): https://www.anthropic.com/news/securing-america-s-compute-advantage-anthropic-s-position-on-the-diffusion-rule
  6. Reuters, Feb 23, 2026 coverage: https://www.reuters.com/world/china/chinese-companies-used-claude-improve-own-models-anthropic-says-2026-02-23/
  7. Reuters, Feb 12, 2026 OpenAI/DeepSeek memo context: https://www.reuters.com/world/china/openai-accuses-deepseek-distilling-us-models-gain-advantage-bloomberg-news-2026-02-12/