← Blog
explainx / blog

NVIDIA SkillSpector: Security Scanner for AI Agent Skills (2026)

NVIDIA open-sourced SkillSpector, a security scanner for AI agent skills that detects 64 vulnerability patterns across 16 categories — from prompt injection to supply chain attacks. Research found 26.1% of skills contain vulnerabilities.

5 min readYash Thakker
AI SecurityNVIDIAAgent SkillsOpen SourceSupply Chain Security

MDX restores the committed source plus an HTML comment attribution; plain text bundles the rendered markdown body with the explainx.ai attribution footer.

NVIDIA SkillSpector: Security Scanner for AI Agent Skills (2026)

26.1% of AI agent skills contain at least one vulnerability. 5.2% show likely malicious intent. Those numbers come from a large-scale study of 42,447 skills from major marketplaces — and they're why NVIDIA built and open-sourced SkillSpector.

Agent skills — the extensions that Claude Code, Codex CLI, Gemini CLI, and similar tools execute with implicit trust — are a new and largely unvetted attack surface. SkillSpector is a security scanner that fills the gap between "install from the registry" and "run with full system access."


The Problem: Skills Execute With Implicit Trust

When you install an agent skill, it typically runs with the same permissions as the agent itself — which often means file system access, network calls, shell execution, and access to your environment variables. The security model is closer to a browser extension than a sandboxed web app.

Research published in "Agent Skills in the Wild" (Liu et al., 2026) quantified what that means in practice across 42,447 skills:

FindingStat
Skills with at least one vulnerability26.1%
Skills with likely malicious intent5.2%
Multiplier for skills with executable scripts2.12×

One in four skills has a real vulnerability. One in twenty appears intentionally malicious. And the problem compounds as agent skill marketplaces scale.

For the ExplainX perspective on this risk surface, see why agent skills are a security risk and how to vet them.


What SkillSpector Does

SkillSpector runs a two-stage detection pipeline on any skill — git repo, zip, directory, URL, or single file:

Stage 1 — Static Analysis (fast, no API key required)

  • 64 regex-based vulnerability patterns across 16 categories
  • AST behavioral analysis (detects exec(), eval(), subprocess, dynamic imports)
  • Taint tracking (follows data from sources like env vars to sinks like network calls)
  • YARA signatures (malware, webshells, cryptominers, exploit tools)
  • Live CVE lookups via OSV.dev (no key required, auto-fallback when offline)

Stage 2 — LLM Semantic Analysis (optional, ~87% precision)

  • Evaluates context and intent that static analysis misses
  • Filters false positives
  • Produces human-readable explanations for each finding
  • Anti-jailbreak protections prevent malicious skills from manipulating the analysis

The 16 Vulnerability Categories

CategoryPatternsKey Risk
Prompt Injection5Instruction overrides, hidden directives, harmful content
Data Exfiltration4Sending env vars, files, or context to external servers
Privilege Escalation3sudo/root execution, credential access
Supply Chain6Unpinned deps, curl | bash, obfuscated code, known CVEs
Excessive Agency4Unrestricted tool access, autonomous high-impact decisions
Output Handling3Unvalidated output injection, cross-context flows
System Prompt Leakage3Direct/indirect extraction, tool-based exfiltration
Memory Poisoning3Persistent context injection, context window stuffing
Tool Misuse3Parameter abuse, chain abuse, unsafe defaults
Rogue Agent2Self-modification, unauthorized persistence (cron)
Trigger Abuse3Overly broad triggers, shadow commands, keyword baiting
Behavioral AST8exec, eval, __import__, subprocess, dynamic getattr
Taint Tracking5Credential exfiltration chains, file-to-network flows
YARA Signatures4Malware, webshells, cryptominers, hack tools
MCP Least Privilege4Underdeclared capabilities, wildcard permissions
MCP Tool Poisoning4Hidden instructions, Unicode deception, parameter injection

The MCP categories are particularly notable — tool poisoning via Unicode homoglyphs or hidden HTML comments in tool metadata is a real attack vector that most manual reviewers would miss.


Quick Start

# Install
pip install skillspector  # or clone + make install

# Scan a local skill (static analysis, no LLM)
skillspector scan ./my-skill/ --no-llm

# Scan with LLM analysis (Anthropic)
export SKILLSPECTOR_PROVIDER=anthropic
export ANTHROPIC_API_KEY=sk-ant-...
skillspector scan ./my-skill/

# Scan a GitHub repo
skillspector scan https://github.com/user/my-skill

# Output as JSON for CI/CD
skillspector scan ./my-skill/ --no-llm --format json --output report.json

# SARIF output for IDE integration
skillspector scan ./my-skill/ --no-llm --format sarif --output report.sarif

Docker (no Python required):

docker build -t skillspector .
docker run --rm -v "$PWD:/scan" skillspector scan ./my-skill/ --no-llm

Risk Scoring

Score  Severity  Action
0–20   LOW       SAFE to install
21–50  MEDIUM    CAUTION — review findings
51–80  HIGH      DO NOT INSTALL
81–100 CRITICAL  DO NOT INSTALL

Points per finding: CRITICAL +50, HIGH +25, MEDIUM +10, LOW +5. Executable scripts multiply the total by 1.3×.


LLM Provider Support

SkillSpector works with any OpenAI-compatible endpoint — which means you can run semantic analysis entirely locally:

ProviderEnv VarDefault Model
openaiOPENAI_API_KEYgpt-5.4
anthropicANTHROPIC_API_KEYclaude-opus-4-6
nv_buildNVIDIA_INFERENCE_KEYdeepseek-ai/deepseek-v4-flash
Local (Ollama, vLLM)OPENAI_API_KEY=ollama + OPENAI_BASE_URLany local model

The default provider is nv_build (NVIDIA's build.nvidia.com inference service). This matters: NVIDIA is both releasing the scanner and providing inference infrastructure for it, which tells you something about how seriously they're treating the agent security problem.


Python API

from skillspector import graph

result = graph.invoke({
    "input_path": "/path/to/skill",
    "output_format": "json",
    "use_llm": True,
})

print(f"Score: {result['risk_score']}/100")
print(f"Severity: {result['risk_severity']}")

for finding in result["filtered_findings"]:
    print(f"[{finding['severity']}] {finding['rule_id']}: {finding['message']}")

The Python API makes it straightforward to integrate SkillSpector into a CI pipeline — scan on PR, fail if score exceeds a threshold, surface findings as annotations.


How It Fits Into the Broader Agent Security Picture

SkillSpector addresses the install-time vetting problem. It doesn't replace runtime sandboxing or permission models — it answers the question "should I install this at all?" before you give it access to anything.

This matters more as agent skill ecosystems scale. The agent-skills-secure-ai-agent-registry model (curated, verified registries) is one approach. SkillSpector is the complementary tool-level approach: scan anything, from any source, before trusting it.

For supply chain security specifically, see our coverage of Bumblebee — Perplexity's open-source supply chain security scanner, which tackles a related problem at the dependency layer rather than the skill layer.

And for teams already using Claude Code, the Claude Code Security-Guidance Plugin handles a different but complementary surface: catching vulnerabilities in code the AI generates, not in the skills it installs.


What to Watch

SkillSpector is at v2.0.0 with 5.5k GitHub stars and active development. Key gaps the project is still working on:

  • Non-English skill content (may miss patterns in other languages)
  • Image-based attacks (text embedded in images is not scanned)
  • Dynamic/runtime behavior (static analysis only — what the code does when it actually runs is a separate problem)

The research foundation is strong. The tool is production-ready for pre-install vetting. The remaining gaps are genuine hard problems, not oversights.

Source: github.com/NVIDIA/SkillSpector — Apache 2.0.

Related posts