26.1% of AI agent skills contain at least one vulnerability. 5.2% show likely malicious intent. Those numbers come from a large-scale study of 42,447 skills from major marketplaces — and they're why NVIDIA built and open-sourced SkillSpector.
Agent skills — the extensions that Claude Code, Codex CLI, Gemini CLI, and similar tools execute with implicit trust — are a new and largely unvetted attack surface. SkillSpector is a security scanner that fills the gap between "install from the registry" and "run with full system access."
The Problem: Skills Execute With Implicit Trust
When you install an agent skill, it typically runs with the same permissions as the agent itself — which often means file system access, network calls, shell execution, and access to your environment variables. The security model is closer to a browser extension than a sandboxed web app.
Research published in "Agent Skills in the Wild" (Liu et al., 2026) quantified what that means in practice across 42,447 skills:
| Finding | Stat |
|---|---|
| Skills with at least one vulnerability | 26.1% |
| Skills with likely malicious intent | 5.2% |
| Multiplier for skills with executable scripts | 2.12× |
One in four skills has a real vulnerability. One in twenty appears intentionally malicious. And the problem compounds as agent skill marketplaces scale.
For the ExplainX perspective on this risk surface, see why agent skills are a security risk and how to vet them.
What SkillSpector Does
SkillSpector runs a two-stage detection pipeline on any skill — git repo, zip, directory, URL, or single file:
Stage 1 — Static Analysis (fast, no API key required)
- 64 regex-based vulnerability patterns across 16 categories
- AST behavioral analysis (detects
exec(),eval(),subprocess, dynamic imports) - Taint tracking (follows data from sources like env vars to sinks like network calls)
- YARA signatures (malware, webshells, cryptominers, exploit tools)
- Live CVE lookups via OSV.dev (no key required, auto-fallback when offline)
Stage 2 — LLM Semantic Analysis (optional, ~87% precision)
- Evaluates context and intent that static analysis misses
- Filters false positives
- Produces human-readable explanations for each finding
- Anti-jailbreak protections prevent malicious skills from manipulating the analysis
The 16 Vulnerability Categories
| Category | Patterns | Key Risk |
|---|---|---|
| Prompt Injection | 5 | Instruction overrides, hidden directives, harmful content |
| Data Exfiltration | 4 | Sending env vars, files, or context to external servers |
| Privilege Escalation | 3 | sudo/root execution, credential access |
| Supply Chain | 6 | Unpinned deps, curl | bash, obfuscated code, known CVEs |
| Excessive Agency | 4 | Unrestricted tool access, autonomous high-impact decisions |
| Output Handling | 3 | Unvalidated output injection, cross-context flows |
| System Prompt Leakage | 3 | Direct/indirect extraction, tool-based exfiltration |
| Memory Poisoning | 3 | Persistent context injection, context window stuffing |
| Tool Misuse | 3 | Parameter abuse, chain abuse, unsafe defaults |
| Rogue Agent | 2 | Self-modification, unauthorized persistence (cron) |
| Trigger Abuse | 3 | Overly broad triggers, shadow commands, keyword baiting |
| Behavioral AST | 8 | exec, eval, __import__, subprocess, dynamic getattr |
| Taint Tracking | 5 | Credential exfiltration chains, file-to-network flows |
| YARA Signatures | 4 | Malware, webshells, cryptominers, hack tools |
| MCP Least Privilege | 4 | Underdeclared capabilities, wildcard permissions |
| MCP Tool Poisoning | 4 | Hidden instructions, Unicode deception, parameter injection |
The MCP categories are particularly notable — tool poisoning via Unicode homoglyphs or hidden HTML comments in tool metadata is a real attack vector that most manual reviewers would miss.
Quick Start
# Install
pip install skillspector # or clone + make install
# Scan a local skill (static analysis, no LLM)
skillspector scan ./my-skill/ --no-llm
# Scan with LLM analysis (Anthropic)
export SKILLSPECTOR_PROVIDER=anthropic
export ANTHROPIC_API_KEY=sk-ant-...
skillspector scan ./my-skill/
# Scan a GitHub repo
skillspector scan https://github.com/user/my-skill
# Output as JSON for CI/CD
skillspector scan ./my-skill/ --no-llm --format json --output report.json
# SARIF output for IDE integration
skillspector scan ./my-skill/ --no-llm --format sarif --output report.sarif
Docker (no Python required):
docker build -t skillspector .
docker run --rm -v "$PWD:/scan" skillspector scan ./my-skill/ --no-llm
Risk Scoring
Score Severity Action
0–20 LOW SAFE to install
21–50 MEDIUM CAUTION — review findings
51–80 HIGH DO NOT INSTALL
81–100 CRITICAL DO NOT INSTALL
Points per finding: CRITICAL +50, HIGH +25, MEDIUM +10, LOW +5. Executable scripts multiply the total by 1.3×.
LLM Provider Support
SkillSpector works with any OpenAI-compatible endpoint — which means you can run semantic analysis entirely locally:
| Provider | Env Var | Default Model |
|---|---|---|
openai | OPENAI_API_KEY | gpt-5.4 |
anthropic | ANTHROPIC_API_KEY | claude-opus-4-6 |
nv_build | NVIDIA_INFERENCE_KEY | deepseek-ai/deepseek-v4-flash |
| Local (Ollama, vLLM) | OPENAI_API_KEY=ollama + OPENAI_BASE_URL | any local model |
The default provider is nv_build (NVIDIA's build.nvidia.com inference service). This matters: NVIDIA is both releasing the scanner and providing inference infrastructure for it, which tells you something about how seriously they're treating the agent security problem.
Python API
from skillspector import graph
result = graph.invoke({
"input_path": "/path/to/skill",
"output_format": "json",
"use_llm": True,
})
print(f"Score: {result['risk_score']}/100")
print(f"Severity: {result['risk_severity']}")
for finding in result["filtered_findings"]:
print(f"[{finding['severity']}] {finding['rule_id']}: {finding['message']}")
The Python API makes it straightforward to integrate SkillSpector into a CI pipeline — scan on PR, fail if score exceeds a threshold, surface findings as annotations.
How It Fits Into the Broader Agent Security Picture
SkillSpector addresses the install-time vetting problem. It doesn't replace runtime sandboxing or permission models — it answers the question "should I install this at all?" before you give it access to anything.
This matters more as agent skill ecosystems scale. The agent-skills-secure-ai-agent-registry model (curated, verified registries) is one approach. SkillSpector is the complementary tool-level approach: scan anything, from any source, before trusting it.
For supply chain security specifically, see our coverage of Bumblebee — Perplexity's open-source supply chain security scanner, which tackles a related problem at the dependency layer rather than the skill layer.
And for teams already using Claude Code, the Claude Code Security-Guidance Plugin handles a different but complementary surface: catching vulnerabilities in code the AI generates, not in the skills it installs.
What to Watch
SkillSpector is at v2.0.0 with 5.5k GitHub stars and active development. Key gaps the project is still working on:
- Non-English skill content (may miss patterns in other languages)
- Image-based attacks (text embedded in images is not scanned)
- Dynamic/runtime behavior (static analysis only — what the code does when it actually runs is a separate problem)
The research foundation is strong. The tool is production-ready for pre-install vetting. The remaining gaps are genuine hard problems, not oversights.
Source: github.com/NVIDIA/SkillSpector — Apache 2.0.