Third-Party Claude Skills Could Backdoor Your Code — Here’s How

2 minute read

What Are Skills?

Claude Skills are markdown instruction files (SKILL.md) that tell Claude how to perform specific tasks — which libraries to import, what code patterns to follow, what templates to use, and even which bash commands to run. Anthropic ships vetted first-party skills, but the architecture also supports third-party and user-provided skills. When Claude reads a skill file, it follows its instructions with high trust. That trust is exactly what makes malicious skills so dangerous.

This Already Happened: The ClawHavoc Campaign

In February 2026, researchers at Koi Security audited ClawHub — the skill marketplace for the OpenClaw AI agent — and found 341 malicious skills out of 2,857 total. 335 of them were traced to a single coordinated campaign now known as ClawHavoc (eSecurity Planet). By March 2026, the number had grown to over 1,184 confirmed malicious skills across 10,700+ packages (Admin By Request).

Snyk’s ToxicSkills audit scanned 3,984 skills and found prompt injection in 36% of them, plus 76 confirmed malicious payloads designed for credential theft, backdoor installation, and data exfiltration (Snyk). The barrier to publishing a malicious skill? A markdown file and a one-week-old GitHub account.

Critically, Repello AI noted that these attack techniques work against any AI agent platform that treats third-party skill files as trusted instructions — including Claude Code and Cursor (Repello AI).

How the Attacks Work

Malware via fake setup steps. The most common ClawHavoc technique embedded malicious install instructions inside SKILL.md files. According to Trend Micro, the skills manipulated AI agents into presenting fake setup requirements that ultimately delivered Atomic macOS Stealer (AMOS), harvesting browser credentials, keychains, SSH keys, and crypto wallets (Trend Micro).

Remote payload fetching. Nearly 3% of ClawHub skills dynamically fetched executable content from external servers at runtime. The published skill looks clean during review, but the attacker controls the payload it pulls down (Snyk).

CLAUDE.md poisoning. In April 2026, Adversa AI found that Claude Code’s permission system skips all deny-rule enforcement when a command exceeds 50 subcommands. A malicious CLAUDE.md in a cloned repo could instruct Claude to build a 50+ command pipeline, hiding a payload — like credential exfiltration — past the security cutoff (SecurityWeek). Anthropic patched this in v2.1.90 (The Register).

Context poisoning. Straiker’s analysis explained that the AI model itself isn’t malicious — it’s the context that gets weaponized. A poisoned instruction can survive context compaction, and because the agent has access to the entire file system and shell, the blast radius is the whole workstation (Straiker).

Protect Yourself

Read every third-party SKILL.md before installing — treat it like a shell script from a stranger.
Scan skills with tools like Snyk’s mcp-scan (Snyk) or SkillRisk (skillrisk.org).
Audit CLAUDE.md files in any cloned repository before letting Claude Code read them.
Keep tools updated — known vulnerabilities have patches available.
Run agents in sandboxed environments with restricted network access and rotate any credentials that may have been exposed.

Skills are a smart abstraction. But any system that executes third-party instructions with high trust is, by definition, a supply-chain attack surface. The ClawHavoc campaign proved this isn’t theoretical. Treat third-party skills accordingly.

This post was written with the help of an AI agent.

What Are Skills?

This Already Happened: The ClawHavoc Campaign

How the Attacks Work

Protect Yourself

Leave a Comment Cancel reply