What is SkillSpector?

SkillSpector is an open-source security scanner from NVIDIA that analyzes AI agent skills (used by Claude Code, Codex CLI, Gemini CLI, and similar tools) for vulnerabilities before installation. It detects 64 vulnerability patterns across 16 categories using two-stage analysis: static pattern matching plus optional LLM semantic evaluation.

How many vulnerability patterns does SkillSpector detect?

SkillSpector detects 64 vulnerability patterns across 16 categories: prompt injection, data exfiltration, privilege escalation, supply chain, excessive agency, output handling, system prompt leakage, memory poisoning, tool misuse, rogue agent, trigger abuse, dangerous code (AST analysis), taint tracking, YARA signatures, MCP least privilege, and MCP tool poisoning.

What percentage of AI agent skills contain vulnerabilities?

According to research from "Agent Skills in the Wild: An Empirical Study of Security Vulnerabilities at Scale" (Liu et al., 2026), analyzing 42,447 skills: 26.1% contain at least one vulnerability, and 5.2% show likely malicious intent. Skills with executable scripts are 2.12x more likely to be vulnerable.

Does SkillSpector require an LLM API key to work?

No. Static analysis (fast regex, AST, YARA, taint tracking) works without any API key using --no-llm. LLM semantic analysis is optional and improves precision to ~87% by filtering false positives and providing explanations. It supports OpenAI, Anthropic (Claude), NVIDIA build.nvidia.com, and any OpenAI-compatible local server like Ollama.

How does SkillSpector score risk?

Risk score is 0-100: CRITICAL issues add 50 points, HIGH adds 25, MEDIUM adds 10, LOW adds 5. Executable scripts apply a 1.3x multiplier. Scores 0-20 are LOW/SAFE; 21-50 MEDIUM/CAUTION; 51-80 HIGH/DO NOT INSTALL; 81-100 CRITICAL/DO NOT INSTALL.

NVIDIA SkillSpector: AI Agent Skill Security Scanner — 2026 Guide | explainx.ai Blog

26.1% of AI agent skills contain at least one vulnerability. 5.2% show likely malicious intent. Those numbers come from a large-scale study of 42,447 skills from major marketplaces — and they're why NVIDIA built and open-sourced SkillSpector.

Agent skills — the extensions that Claude Code, Codex CLI, Gemini CLI, and similar tools execute with implicit trust — are a new and largely unvetted attack surface. SkillSpector is a security scanner that fills the gap between "install from the registry" and "run with full system access."

The Problem: Skills Execute With Implicit Trust

When you install an agent skill, it typically runs with the same permissions as the agent itself — which often means file system access, network calls, shell execution, and access to your environment variables. The security model is closer to a browser extension than a sandboxed web app.

Research published in "Agent Skills in the Wild" (Liu et al., 2026) quantified what that means in practice across 42,447 skills:

Finding	Stat
Skills with at least one vulnerability	26.1%
Skills with likely malicious intent	5.2%
Multiplier for skills with executable scripts	2.12×

One in four skills has a real vulnerability. One in twenty appears intentionally malicious. And the problem compounds as agent skill marketplaces scale.

For the explainx.ai perspective on this risk surface, see why agent skills are a security risk and how to vet them.

What SkillSpector Does

SkillSpector runs a two-stage detection pipeline on any skill — git repo, zip, directory, URL, or single file:

Stage 1 — Static Analysis (fast, no API key required)

64 regex-based vulnerability patterns across 16 categories
AST behavioral analysis (detects exec(), eval(), subprocess, dynamic imports)
Taint tracking (follows data from sources like env vars to sinks like network calls)
YARA signatures (malware, webshells, cryptominers, exploit tools)
Live CVE lookups via OSV.dev (no key required, auto-fallback when offline)

Stage 2 — LLM Semantic Analysis (optional, ~87% precision)

Evaluates context and intent that static analysis misses
Filters false positives
Produces human-readable explanations for each finding
Anti-jailbreak protections prevent malicious skills from manipulating the analysis

The 16 Vulnerability Categories

Category	Patterns	Key Risk
Prompt Injection	5	Instruction overrides, hidden directives, harmful content
Data Exfiltration	4	Sending env vars, files, or context to external servers
Privilege Escalation	3	sudo/root execution, credential access
Supply Chain	6	Unpinned deps, `curl \| bash`, obfuscated code, known CVEs
Excessive Agency	4	Unrestricted tool access, autonomous high-impact decisions
Output Handling	3	Unvalidated output injection, cross-context flows
System Prompt Leakage	3	Direct/indirect extraction, tool-based exfiltration
Memory Poisoning	3	Persistent context injection, context window stuffing
Tool Misuse	3	Parameter abuse, chain abuse, unsafe defaults
Rogue Agent	2	Self-modification, unauthorized persistence (cron)
Trigger Abuse	3	Overly broad triggers, shadow commands, keyword baiting
Behavioral AST	8	`exec`, `eval`, `__import__`, subprocess, dynamic `getattr`
Taint Tracking	5	Credential exfiltration chains, file-to-network flows
YARA Signatures	4	Malware, webshells, cryptominers, hack tools
MCP Least Privilege	4	Underdeclared capabilities, wildcard permissions
MCP Tool Poisoning	4	Hidden instructions, Unicode deception, parameter injection

The MCP categories are particularly notable — tool poisoning via Unicode homoglyphs or hidden HTML comments in tool metadata is a real attack vector that most manual reviewers would miss.

Quick Start

bash

# Install
pip install skillspector  # or clone + make install

# Scan a local skill (static analysis, no LLM)
skillspector scan ./my-skill/ --no-llm

# Scan with LLM analysis (Anthropic)
export SKILLSPECTOR_PROVIDER=anthropic
export ANTHROPIC_API_KEY=sk-ant-...
skillspector scan ./my-skill/

# Scan a GitHub repo
skillspector scan https://github.com/user/my-skill

# Output as JSON for CI/CD
skillspector scan ./my-skill/ --no-llm --format json --output report.json

# SARIF output for IDE integration
skillspector scan ./my-skill/ --no-llm --format sarif --output report.sarif

Docker (no Python required):

bash

docker build -t skillspector .
docker run --rm -v "$PWD:/scan" skillspector scan ./my-skill/ --no-llm

Risk Scoring

snippet

Score  Severity  Action
0–20   LOW       SAFE to install
21–50  MEDIUM    CAUTION — review findings
51–80  HIGH      DO NOT INSTALL
81–100 CRITICAL  DO NOT INSTALL

Points per finding: CRITICAL +50, HIGH +25, MEDIUM +10, LOW +5. Executable scripts multiply the total by 1.3×.

LLM Provider Support

SkillSpector works with any OpenAI-compatible endpoint — which means you can run semantic analysis entirely locally:

Provider	Env Var	Default Model
`openai`	`OPENAI_API_KEY`	gpt-5.4
`anthropic`	`ANTHROPIC_API_KEY`	claude-opus-4-6
`nv_build`	`NVIDIA_INFERENCE_KEY`	deepseek-ai/deepseek-v4-flash
Local (Ollama, vLLM)	`OPENAI_API_KEY=ollama` + `OPENAI_BASE_URL`	any local model

The default provider is nv_build (NVIDIA's build.nvidia.com inference service). This matters: NVIDIA is both releasing the scanner and providing inference infrastructure for it, which tells you something about how seriously they're treating the agent security problem.

Python API

python

from skillspector import graph

result = graph.invoke({
    "input_path": "/path/to/skill",
    "output_format": "json",
    "use_llm": True,
})

print(f"Score: {result['risk_score']}/100")
print(f"Severity: {result['risk_severity']}")

for finding in result["filtered_findings"]:
    print(f"[{finding['severity']}] {finding['rule_id']}: {finding['message']}")

The Python API makes it straightforward to integrate SkillSpector into a CI pipeline — scan on PR, fail if score exceeds a threshold, surface findings as annotations.

How It Fits Into the Broader Agent Security Picture

SkillSpector addresses the install-time vetting problem. It doesn't replace runtime sandboxing or permission models — it answers the question "should I install this at all?" before you give it access to anything.

This matters more as agent skill ecosystems scale. The agent-skills-secure-ai-agent-registry model (curated, verified registries) is one approach. SkillSpector is the complementary tool-level approach: scan anything, from any source, before trusting it.

For supply chain security specifically, see our coverage of Bumblebee — Perplexity's open-source supply chain security scanner, which tackles a related problem at the dependency layer rather than the skill layer.

And for teams already using Claude Code, the Claude Code Security-Guidance Plugin handles a different but complementary surface: catching vulnerabilities in code the AI generates, not in the skills it installs.

What to Watch

SkillSpector is at v2.0.0 with 5.5k GitHub stars and active development. Key gaps the project is still working on:

Non-English skill content (may miss patterns in other languages)
Image-based attacks (text embedded in images is not scanned)
Dynamic/runtime behavior (static analysis only — what the code does when it actually runs is a separate problem)

The research foundation is strong. The tool is production-ready for pre-install vetting. The remaining gaps are genuine hard problems, not oversights.

Source: github.com/NVIDIA/SkillSpector — Apache 2.0.

The Problem: Skills Execute With Implicit Trust

Research published in "Agent Skills in the Wild" (Liu et al., 2026) quantified what that means in practice across 42,447 skills:

Finding	Stat
Skills with at least one vulnerability	26.1%
Skills with likely malicious intent	5.2%
Multiplier for skills with executable scripts	2.12×

One in four skills has a real vulnerability. One in twenty appears intentionally malicious. And the problem compounds as agent skill marketplaces scale.

For the explainx.ai perspective on this risk surface, see why agent skills are a security risk and how to vet them.

What SkillSpector Does

SkillSpector runs a two-stage detection pipeline on any skill — git repo, zip, directory, URL, or single file:

Stage 1 — Static Analysis (fast, no API key required)

64 regex-based vulnerability patterns across 16 categories
AST behavioral analysis (detects exec(), eval(), subprocess, dynamic imports)
Taint tracking (follows data from sources like env vars to sinks like network calls)
YARA signatures (malware, webshells, cryptominers, exploit tools)
Live CVE lookups via OSV.dev (no key required, auto-fallback when offline)

Stage 2 — LLM Semantic Analysis (optional, ~87% precision)

Evaluates context and intent that static analysis misses
Filters false positives
Produces human-readable explanations for each finding
Anti-jailbreak protections prevent malicious skills from manipulating the analysis

The 16 Vulnerability Categories

Category	Patterns	Key Risk
Prompt Injection	5	Instruction overrides, hidden directives, harmful content
Data Exfiltration	4	Sending env vars, files, or context to external servers
Privilege Escalation	3	sudo/root execution, credential access
Supply Chain	6	Unpinned deps, `curl \| bash`, obfuscated code, known CVEs
Excessive Agency	4	Unrestricted tool access, autonomous high-impact decisions
Output Handling	3	Unvalidated output injection, cross-context flows
System Prompt Leakage	3	Direct/indirect extraction, tool-based exfiltration
Memory Poisoning	3	Persistent context injection, context window stuffing
Tool Misuse	3	Parameter abuse, chain abuse, unsafe defaults
Rogue Agent	2	Self-modification, unauthorized persistence (cron)
Trigger Abuse	3	Overly broad triggers, shadow commands, keyword baiting
Behavioral AST	8	`exec`, `eval`, `__import__`, subprocess, dynamic `getattr`
Taint Tracking	5	Credential exfiltration chains, file-to-network flows
YARA Signatures	4	Malware, webshells, cryptominers, hack tools
MCP Least Privilege	4	Underdeclared capabilities, wildcard permissions
MCP Tool Poisoning	4	Hidden instructions, Unicode deception, parameter injection

The MCP categories are particularly notable — tool poisoning via Unicode homoglyphs or hidden HTML comments in tool metadata is a real attack vector that most manual reviewers would miss.

Quick Start

bash

# Install
pip install skillspector  # or clone + make install

# Scan a local skill (static analysis, no LLM)
skillspector scan ./my-skill/ --no-llm

# Scan with LLM analysis (Anthropic)
export SKILLSPECTOR_PROVIDER=anthropic
export ANTHROPIC_API_KEY=sk-ant-...
skillspector scan ./my-skill/

# Scan a GitHub repo
skillspector scan https://github.com/user/my-skill

# Output as JSON for CI/CD
skillspector scan ./my-skill/ --no-llm --format json --output report.json

# SARIF output for IDE integration
skillspector scan ./my-skill/ --no-llm --format sarif --output report.sarif

Docker (no Python required):

bash

docker build -t skillspector .
docker run --rm -v "$PWD:/scan" skillspector scan ./my-skill/ --no-llm

Risk Scoring

snippet

Score  Severity  Action
0–20   LOW       SAFE to install
21–50  MEDIUM    CAUTION — review findings
51–80  HIGH      DO NOT INSTALL
81–100 CRITICAL  DO NOT INSTALL

Points per finding: CRITICAL +50, HIGH +25, MEDIUM +10, LOW +5. Executable scripts multiply the total by 1.3×.

LLM Provider Support

SkillSpector works with any OpenAI-compatible endpoint — which means you can run semantic analysis entirely locally:

Provider	Env Var	Default Model
`openai`	`OPENAI_API_KEY`	gpt-5.4
`anthropic`	`ANTHROPIC_API_KEY`	claude-opus-4-6
`nv_build`	`NVIDIA_INFERENCE_KEY`	deepseek-ai/deepseek-v4-flash
Local (Ollama, vLLM)	`OPENAI_API_KEY=ollama` + `OPENAI_BASE_URL`	any local model

Python API

python

from skillspector import graph

result = graph.invoke({
    "input_path": "/path/to/skill",
    "output_format": "json",
    "use_llm": True,
})

print(f"Score: {result['risk_score']}/100")
print(f"Severity: {result['risk_severity']}")

for finding in result["filtered_findings"]:
    print(f"[{finding['severity']}] {finding['rule_id']}: {finding['message']}")

The Python API makes it straightforward to integrate SkillSpector into a CI pipeline — scan on PR, fail if score exceeds a threshold, surface findings as annotations.

How It Fits Into the Broader Agent Security Picture

What to Watch

SkillSpector is at v2.0.0 with 5.5k GitHub stars and active development. Key gaps the project is still working on:

Non-English skill content (may miss patterns in other languages)
Image-based attacks (text embedded in images is not scanned)
Dynamic/runtime behavior (static analysis only — what the code does when it actually runs is a separate problem)

The research foundation is strong. The tool is production-ready for pre-install vetting. The remaining gaps are genuine hard problems, not oversights.

Source: github.com/NVIDIA/SkillSpector — Apache 2.0.

NVIDIA SkillSpector: Security Scanner for AI Agent Skills (2026)

The Problem: Skills Execute With Implicit Trust

What SkillSpector Does

The 16 Vulnerability Categories

Quick Start

Risk Scoring

LLM Provider Support

Python API

How It Fits Into the Broader Agent Security Picture

What to Watch

NVIDIA SkillSpector: Security Scanner for AI Agent Skills (2026)

The Problem: Skills Execute With Implicit Trust

What SkillSpector Does

The 16 Vulnerability Categories

Quick Start

Risk Scoring

LLM Provider Support

Python API

How It Fits Into the Broader Agent Security Picture

What to Watch

Related posts

Peter Yang Open-Sources /no-ai-slop, a Claude Skill for De-Sloppifying Writing

code-review-graph: Stop AI Coding Agents From Re-Reading Your Whole Repo

AI Found 7 Bugs in Cloudflare CIRCL: What zkSecurity's zkao Audit Reveals

Related posts

Peter Yang Open-Sources /no-ai-slop, a Claude Skill for De-Sloppifying Writing

code-review-graph: Stop AI Coding Agents From Re-Reading Your Whole Repo

AI Found 7 Bugs in Cloudflare CIRCL: What zkSecurity's zkao Audit Reveals