Anthropic Exposes Industrial-Scale AI Theft: 16 Million Claude Queries Used by Chinese Firms

Anthropic has uncovered coordinated campaigns by three Chinese AI labs to extract Claude's capabilities through fraudulent means. The scale reveals a systematic effort to bypass safety measures and regional restrictions while stealing proprietary AI knowledge.

The Scale of AI Model Theft

According to Anthropic, DeepSeek, Moonshot AI, and MiniMax generated over 16 million exchanges with Claude through approximately 24,000 fraudulent accounts. Each lab targeted specific capabilities:

MiniMax led with 13 million exchanges focusing on agentic coding and tool orchestration
Moonshot AI conducted 3.4 million exchanges targeting agentic reasoning, tool use, and computer vision
DeepSeek performed over 150,000 exchanges to extract reasoning capabilities

The coordinated nature becomes clear through shared infrastructure. As reported by TheHackerNews, the labs employed "hydra cluster" architectures—sprawling networks of fraudulent accounts distributed across APIs and cloud platforms. One proxy network managed over 20,000 fraudulent accounts simultaneously.

How Distillation Attacks Work

Knowledge distillation traditionally serves legitimate purposes. AI companies routinely distill their own models to create smaller, efficient versions. The technique involves training a weaker model on outputs from a more capable system.

These attacks weaponize the process. DeepSeek's approach focused on chain-of-thought elicitation. The lab crafted prompts asking Claude to "articulate internal reasoning behind completed responses step by step," generating high-quality training data at scale.

Common mistake: Treating API rate limits as sufficient protection. Sophisticated attackers distribute requests across thousands of accounts and proxy networks, evading standard detection methods.

Detection and Attribution Methods

Anthropic attributed each campaign "with high confidence" through multiple indicators:

IP address correlation revealing shared infrastructure
Request metadata showing coordinated patterns
Payment methods linking fraudulent accounts
Industry partner corroboration observing the same actors

Key point: MiniMax's behavior demonstrated real-time adaptation. When Anthropic released a new Claude model, MiniMax "pivoted within 24 hours, redirecting nearly half their traffic" to the updated version.

National Security Implications

The removal of safety guardrails poses the gravest concern. Anthropic warns that "models built through illicit distillation are unlikely to retain safeguards," potentially enabling:

Offensive cyber operations
Disinformation campaigns at scale
Mass surveillance systems
Military applications without ethical constraints

CNBC reports that Anthropic flagged evidence of distillation by Chinese firms since early last year, coinciding with DeepSeek's first model launch.

Industry Response and Countermeasures

Anthropic deployed multiple defensive strategies:

New behavioral detection systems identifying anomalous usage patterns
Strengthened account verification processes
Intelligence sharing with industry peers and authorities
Product and API-level safeguards reducing distillation effectiveness

Proven approach: Correlating traffic patterns across providers. Multiple AI companies now share threat intelligence to identify coordinated campaigns targeting multiple models simultaneously.

The cost advantage drives these attacks. Distillation allows rival firms "to acquire powerful capabilities from other labs in a fraction of the time, and at a fraction of the cost" compared to independent development.

Looking Forward

Regional access restrictions exist for reasons beyond commercial competition. All three targeted labs operate from China, where Anthropic prohibits service use due to "legal, regulatory, and security risks."

In practice, preventing complete model extraction remains impossible. The fundamental nature of AI APIs—providing useful outputs—creates inherent vulnerability. Defense focuses on raising costs and reducing quality of extracted capabilities.

If it works — it is correct. The industry now recognizes distillation attacks as a persistent threat requiring continuous adaptation. AI labs must balance accessibility for legitimate users against sophisticated extraction campaigns.

Frequently Asked Questions

How do companies use fraudulent accounts and proxy services to bypass API detection systems at scale?

Attackers employ "hydra cluster" architectures with thousands of accounts distributed across multiple cloud platforms and proxy networks. One detected network managed over 20,000 fraudulent accounts simultaneously, routing requests through commercial proxy services to mask origins and evade rate limits.

What specific prompts and techniques are used to extract reasoning traces and agentic capabilities from frontier models?

DeepSeek specialized in chain-of-thought elicitation, crafting prompts that asked Claude to "articulate internal reasoning behind completed responses step by step." Moonshot later attempted "to extract and reconstruct Claude's reasoning traces" while targeting agentic reasoning, tool use, and computer vision capabilities.

How can API providers distinguish between legitimate high-volume usage and coordinated distillation attacks in their traffic patterns?

Detection relies on multiple signals: synchronized traffic patterns, shared payment methods, coordinated timing across accounts, and rapid pivoting behavior when new models release. MiniMax's immediate traffic redirection within 24 hours of new model releases exemplified suspicious adaptation patterns.

What are the technical indicators that reveal a distillation campaign is underway, and how are they shared between AI labs?

Key indicators include IP address correlation, request metadata patterns, infrastructure fingerprints, and behavioral anomalies. AI companies now share threat intelligence through industry partnerships, allowing corroboration of actors targeting multiple providers simultaneously.

How quickly can competitors pivot their distillation strategy when a target model releases a new version?

MiniMax demonstrated remarkably rapid adaptation—within 24 hours of a new Claude model release, they redirected nearly half their traffic to target the updated version. This speed indicates sophisticated monitoring systems and automated infrastructure capable of immediate strategic shifts.

Sources: