Anthropic alleges large-scale distillation campaigns targeting Claude

Anthropic has accused three Chinese AI developers of running large-scale campaigns to illicitly extract capabilities from its Claude model to improve their own systems. The company claims DeepSeek, Moonshot, and MiniMax used a distillation technique, where a less capable model is trained on the outputs of a more advanced one.
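To make the technique concrete, here is a minimal, deliberately simplified sketch of distillation: a "student" model is fitted purely to the outputs of a "teacher", with no access to the teacher's internals. The toy teacher function and the linear student below are illustrative stand-ins; in the campaigns described, the teacher would be an API-served LLM and the outputs would be text responses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "teacher": a fixed non-linear function standing in for a
# capable model. In real distillation this would be a served LLM whose
# responses are collected at scale.
def teacher(x):
    return np.tanh(2.0 * x)

# Collect (input, teacher output) pairs -- the only thing the student sees.
x = rng.uniform(-1, 1, size=(1000, 1))
y_teacher = teacher(x)

# "Student": a small linear model fitted by least squares to imitate the
# teacher's outputs. No access to the teacher's weights, only its answers.
X = np.hstack([x, np.ones_like(x)])          # add a bias column
w, *_ = np.linalg.lstsq(X, y_teacher, rcond=None)

def student(x_new):
    return np.hstack([x_new, np.ones_like(x_new)]) @ w
```

The point of the sketch is the data flow, not the model class: capability transfers through query-response pairs alone, which is why providers police large-scale automated harvesting of outputs rather than just model-weight exfiltration.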

More than 16 million interactions were generated with Claude through around 24,000 fraudulent accounts, in violation of Anthropic’s terms of service and regional access restrictions.

Anthropic said it does not offer commercial access to Claude in China, nor to subsidiaries of these companies operating outside the country.

How Claude’s capabilities were extracted at scale

Anthropic said the three distillation campaigns followed a similar playbook: using fraudulent accounts and proxy services to access Claude at scale while evading detection, and targeting Claude’s agentic reasoning, tool use, and coding capabilities.

The DeepSeek campaign involved over 150,000 exchanges focused on extracting reasoning capabilities across diverse tasks. The activity generated synchronized traffic across accounts; identical patterns, shared payment methods, and coordinated timing suggested load balancing to increase throughput, improve reliability, and avoid detection.
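The coordinated-timing signal described above can be illustrated with a short sketch. This is not Anthropic's detection method, just one plausible heuristic: accounts whose request timestamps fall into near-identical activity windows are flagged as potentially coordinated. The function names and the 80% threshold are assumptions for illustration.

```python
def timing_overlap(ts_a, ts_b, bucket=60):
    """Jaccard overlap of one-minute activity buckets for two accounts.

    Near-identical request timing across nominally unrelated accounts is
    one signal of load-balanced, coordinated traffic.
    """
    a = {int(t // bucket) for t in ts_a}
    b = {int(t // bucket) for t in ts_b}
    return len(a & b) / len(a | b) if a | b else 0.0

def flag_coordinated(accounts, threshold=0.8):
    """Return pairs of account names whose activity overlaps suspiciously.

    `accounts` maps an account name to a list of Unix timestamps.
    """
    names = list(accounts)
    flagged = []
    for i, x in enumerate(names):
        for y in names[i + 1:]:
            if timing_overlap(accounts[x], accounts[y]) >= threshold:
                flagged.append((x, y))
    return flagged
```

Real systems would combine many such signals (payment methods, prompt similarity, IP reuse) rather than relying on timing alone.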

Moonshot AI’s activity involved over 3.4 million exchanges targeting agentic reasoning and tool use, coding and data analysis, computer-use agent development, and computer vision to reconstruct Claude’s reasoning traces. MiniMax was the largest of the three, involving more than 13 million exchanges, and was squarely targeted at agentic coding, tool use, and orchestration. Anthropic said that when the campaign was detected while still active, MiniMax redirected nearly half of its traffic to Claude’s newly released model within 24 hours.

To carry out the campaigns, Anthropic said, the companies relied on commercial proxy services that resell access to Claude and other frontier AI models at scale, a setup it refers to as hydra cluster architectures.

Back to the basics of AI model training

Industry experts note that the allegations raise a broader and unresolved question around how AI systems are trained. Most large language models, including leading commercial systems, are themselves trained on vast amounts of publicly available internet data, often without explicit consent from original authors.

“Just as many of the foundation models have been built by indexing the vastness of the internet, often without the explicit consent of creators or piggybacking on other search engines’ content, the newer entrants are in many instances going through the same routes of distillation and optimization,” said Neil Shah, vice president at Counterpoint Research. He added that there is a fundamental, largely legally undefined disagreement over who owns synthetic data and whether it may be used for training, especially for open models.

Export controls and national security

Anthropic has framed the alleged distillation campaigns partly through a national security lens, arguing that illicitly distilled models could undermine US efforts to control the spread of advanced AI capabilities, especially if influenced by the Chinese Communist Party. However, experts note that current US export controls are largely focused on hardware, and not on large language models.

“It is critical to separate hardware restrictions from service access. US export controls have concentrated primarily on advanced semiconductors, high-performance computing infrastructure, and, in certain regulatory moments, specific categories of advanced AI model weights. There is no universal prohibition on offering API access to large language models in China,” explained Sanchit Vir Gogia, CEO and chief analyst at Greyhound Research.

However, this does not mean developers are insulated. Gogia added that the Bureau of Industry and Security continues to refine licensing frameworks related to advanced computing commodities and high-capability systems. Also, if a company knowingly supports training activity for restricted entities, especially those tied to military or strategic objectives, exposure becomes plausible even without hardware shipment.

To safeguard themselves, many US AI providers already restrict availability in China through business policy and compliance posture, even beyond what is strictly required.

“For developers, the risk is indirect but real: if your product routes access to restricted geographies or entities, facilitates prohibited end uses, or helps others evade provider geo-restrictions, you can trigger account termination, contractual liability, and potentially regulatory scrutiny depending on who the end user is and what the system enables,” said Jaju, global partner and senior managing director – India at Ankura Consulting.

Implications for teams building with LLMs

For developers building or training models using large language models, the Anthropic allegations highlight a growing grey area. Developers commonly use LLM APIs for application development, testing, or evaluation. But providers are scrutinizing large-scale, automated use of model outputs to train competing systems.

For instance, Anthropic is responding by investing in defensive techniques. For detection, the company has built several classifiers and behavioural fingerprinting systems designed to identify distillation attack patterns in API traffic. It has also strengthened verification for educational accounts, security research programs, and startup organizations, citing them as the pathways most commonly exploited for setting up fraudulent accounts. The company is also implementing product, API, and model-level safeguards designed to reduce the efficacy of model outputs for illicit distillation, without degrading the experience for legitimate customers.
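Anthropic has not published how its classifiers work, but the general idea of behavioural fingerprinting can be sketched with a toy risk score: a high volume of near-identical, templated prompts is treated as a weak signal of automated output harvesting. Everything below, including the normalization rules, weights, and saturation point, is an illustrative assumption, not the company's method.

```python
import re
from collections import Counter

def template_of(prompt):
    # Crude normalization: collapse numbers and quoted spans so prompts
    # generated from the same template map to the same key.
    p = re.sub(r'"[^"]*"', '"..."', prompt)
    return re.sub(r"\d+", "N", p)

def distillation_score(prompts, volume_weight=0.5):
    """Toy risk score in [0, 1] for a batch of prompts from one account.

    Combines request volume (saturating at 10k) with the fraction of
    prompts that share the single most common template.
    """
    if not prompts:
        return 0.0
    counts = Counter(template_of(p) for p in prompts)
    repetition = counts.most_common(1)[0][1] / len(prompts)
    volume = min(len(prompts) / 10_000, 1.0)
    return volume_weight * volume + (1 - volume_weight) * repetition
```

A production classifier would be a trained model over far richer features; the sketch only shows why templated, high-volume traffic stands out against organic use.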

Developers, too, should ensure their model training stays safe, compliant, and defensible.

Jaju said that, to start with, developers should review API and service terms, and assume that training on outputs is not permitted unless explicitly allowed. They should maintain a clear record of where every training example came from, with the applicable licence or terms attached. Operational logs should be kept separate from training datasets, with set retention limits.
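The record-keeping Jaju describes can be made concrete with a small provenance schema. The field names below are illustrative, not a standard; the idea is simply that each training example carries its source, the terms relied on, and a collection date that supports retention limits.

```python
from dataclasses import dataclass, asdict
from datetime import date
import json

@dataclass
class TrainingItemRecord:
    """One provenance entry per training example (illustrative schema)."""
    item_id: str
    source: str                # where the example came from (URL, dataset, API)
    license_or_terms: str      # licence text or a pointer to the terms relied on
    outputs_training_ok: bool  # do the provider's terms permit training on outputs?
    collected_on: str          # ISO date, enables retention-limit enforcement

records = [
    TrainingItemRecord(
        item_id="ex-0001",
        source="internal-support-tickets",
        license_or_terms="company-owned data, DPA reviewed",
        outputs_training_ok=True,
        collected_on=date(2026, 1, 15).isoformat(),
    )
]

# Persist the manifest alongside the dataset, separate from operational logs,
# so the training pipeline can be explained on demand.
manifest = json.dumps([asdict(r) for r in records], indent=2)
```

Keeping this manifest apart from operational logs, each with its own retention window, is what makes the "documentation without caveats" standard achievable later.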

“Geopolitical diligence cannot be an afterthought. Restricted party screening, export compliance reviews, and region-specific access controls are increasingly part of AI governance, especially for enterprises operating across borders,” added Gogia.

Experts say that if a regulator or acquirer asks developers to explain their training pipeline, they should be able to do so with documentation and without caveats.
