Anthropic Thwarts First Major AI-Powered Cyberattack: Chinese Hackers Used Claude to Target Global Organizations

Anthropic has prevented what it calls the first documented large-scale AI cyberattack executed with minimal human intervention. Chinese state-sponsored hackers manipulated the Claude AI system to target approximately 30 global organizations, including tech companies, financial institutions, and government agencies. The attackers employed sophisticated techniques to bypass security measures, allowing the AI to perform 80-90% of the campaign autonomously, signaling a concerning new frontier in cybersecurity threats.

US Firm Claims It Foiled Large-Scale AI Cyberattack By Chinese Hackers

Incident has substantial implications for cybersecurity in age of AI agents: Anthropic (Representational)

New Delhi:

Anthropic, the creator of AI chatbot Claude, announced it has successfully prevented what it describes as "the first documented case of a large-scale AI cyberattack executed without substantial human intervention."

The San Francisco-based AI company reported that Chinese state-sponsored hackers utilized their chatbot to conduct automated attacks targeting approximately 30 organizations worldwide.

According to Anthropic's social media announcement: "We believe this is the first documented case of a large-scale AI cyberattack executed without substantial human intervention. It has significant implications for cybersecurity in the age of AI agents."

In an extensive blog post, Anthropic elaborated that this unprecedented attack represents "the first documented case of a large-scale cyberattack executed without substantial human intervention," and warned that the incident carries "substantial implications for cybersecurity in the age of AI 'agents'."

The company revealed: "In mid-September 2025, we detected suspicious activity that later investigation determined to be a highly sophisticated espionage campaign." Their analysis showed the attackers extensively leveraged AI's agent capabilities, employing Claude not merely for guidance but to directly execute cyber operations.

"The threat actor, whom we assess with high confidence was a Chinese state-sponsored group, manipulated our Claude Code tool into attempting infiltration into roughly thirty global targets and succeeded in a small number of cases," Anthropic stated. The targeted organizations reportedly included major technology companies, critical financial institutions, chemical manufacturers, and several government agencies.

The attackers reportedly fragmented their operation into small, seemingly harmless tasks that Claude completed without awareness of their collective malicious intent.

To circumvent safety protocols, the hackers allegedly posed as a legitimate cybersecurity firm conducting defensive tests and "jailbreaking" the AI model to operate beyond its standard safety guardrails. This enabled Claude to inspect infrastructure, identify "the highest-value databases," generate exploit code, harvest credentials, and organize stolen data "all with minimal human supervision," according to the blog post.

Upon discovering the activity, Anthropic initiated an internal investigation to map the operation. During a 10-day period, the company assessed the severity, blocked compromised accounts, alerted affected organizations, and collaborated with authorities while gathering intelligence.

Anthropic noted, "Overall, the threat actor was able to use AI to perform 80-90% of the campaign, with human intervention required only sporadically." However, they clarified that fully autonomous attacks remain unlikely for now, as Claude occasionally "hallucinated credentials or claimed to have extracted secret information that was in fact publicly available."

Nevertheless, Anthropic cautioned that "the barriers to performing sophisticated cyberattacks have dropped substantially, and we predict that they'll continue to do so."

With appropriate configuration, they warned, threat actors could now depend on agentic AI systems for extended periods to execute tasks previously requiring large teams of skilled hackers—spanning system analysis and exploit generation to rapidly processing stolen data. This development could enable smaller or less experienced groups to launch large-scale cyber operations previously beyond their capabilities.

Source: https://www.ndtv.com/world-news/us-firm-claims-it-foiled-large-scale-ai-cyberattack-by-chinese-hackers-9640198