146 indexed skills · max 10 per page

skills (146)

agent-evaluation

davila7/claude-code-templates · Productivity

Behavioral testing and reliability metrics for LLM agents, catching production failures benchmarks miss.
Covers five core evaluation areas: agent testing, benchmark design, capability assessment, reliability metrics, and regression testing
Emphasizes statistical test evaluation (multiple runs, result distribution analysis) and behavioral contract testing over single-run or string-matching approaches
Includes adversarial testing patterns to actively probe agent failure modes and ident
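
As a rough illustration of the "multiple runs, result distribution analysis" idea in this description, the sketch below repeats one task many times and reports a pass rate with its standard error instead of a single pass/fail. It is not taken from the skill itself: `run_agent_once`, its 80% success rate, and the task string are hypothetical stand-ins for a real agent call.

```python
import random
import statistics


def run_agent_once(task: str, rng: random.Random) -> bool:
    """Hypothetical stand-in for one agent run; succeeds roughly 80% of the time."""
    return rng.random() < 0.8


def evaluate(task: str, runs: int = 50, seed: int = 0) -> dict:
    """Run the same task repeatedly and summarize the outcome distribution."""
    rng = random.Random(seed)  # seeded so the report is reproducible
    outcomes = [1.0 if run_agent_once(task, rng) else 0.0 for _ in range(runs)]
    pass_rate = statistics.mean(outcomes)
    # Standard error of a proportion: sqrt(p * (1 - p) / n)
    std_err = (pass_rate * (1 - pass_rate) / runs) ** 0.5
    return {"runs": runs, "pass_rate": pass_rate, "std_err": std_err}


report = evaluate("summarize the changelog")
```

A regression check could then compare `pass_rate` against a threshold with some margin for `std_err`, rather than failing on one unlucky run.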

agent-tools

inference-sh/skills · Productivity

Access 150+ cloud-based AI apps via CLI: image generation, video creation, LLMs, search, 3D modeling, and Twitter automation.
Supports major models including FLUX, Veo, Claude, Gemini, Grok, Seedance, OmniHuman, Tavily, and Exa with no local GPU required
Automatic local file upload for images, audio, and video inputs; run apps synchronously or asynchronously with task status tracking
Covers six capability categories: image generation, video generation, LLM inference, web search, 3D mo

agent-teams-simplify-and-harden

pskoett/pskoett-ai-skills · Productivity

A two-phase team loop that produces production-quality code: implement, then audit using simplify + harden passes, then fix audit findings, then re-audit, repeating until the codebase is solid or the loop cap is reached.
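
The control flow described here (implement, audit, fix, re-audit, stop when clean or at the loop cap) can be sketched roughly as follows. This is an assumption-laden outline, not the skill's actual implementation: the function name, the three callables, and `max_rounds` are hypothetical.

```python
from typing import Callable, List, Tuple


def simplify_and_harden_loop(
    implement: Callable[[], str],
    audit: Callable[[str], List[str]],
    fix: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> Tuple[str, List[str], int]:
    """Implement once, then alternate fix and re-audit until clean or capped."""
    code = implement()
    findings = audit(code)
    rounds = 0
    while findings and rounds < max_rounds:
        code = fix(code, findings)
        findings = audit(code)  # re-audit after every fix pass
        rounds += 1
    return code, findings, rounds


# Toy stand-ins: the "audit" flags marker words, the "fix" clears one per round.
def toy_audit(code: str) -> List[str]:
    return [marker for marker in ("TODO", "UNSAFE") if marker in code]


def toy_fix(code: str, findings: List[str]) -> str:
    return code.replace(findings[0], "done", 1)


result = simplify_and_harden_loop(lambda: "TODO UNSAFE", toy_audit, toy_fix)
```

Returning the remaining findings alongside the code makes it visible when the loop stopped at the cap rather than on a clean audit.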

agent-browser

everyinc/compound-engineering-plugin · Productivity

The CLI uses Chrome/Chromium via CDP directly. Install via npm i -g agent-browser, brew install agent-browser, or cargo install agent-browser. Run agent-browser install to download Chrome. Run agent-browser upgrade to update to the latest version.

claude-agent-sdk

jezweb/claude-skills · AI/ML


agent-browser

jezweb/claude-skills · Productivity

Every browser automation follows this pattern:

sub-agent-patterns

jezweb/claude-skills · Productivity

Delegate specialized tasks to isolated AI assistants with custom tools, models, and system prompts.
Sub-agents preserve main context by isolating verbose tool outputs and intermediate reasoning, enabling longer sessions and cleaner conversations
Three built-in agents available: Explore (Haiku, read-only codebase search), Plan (Sonnet, plan-mode research), and General-Purpose (Sonnet, full read/write access)
Create custom agents in .claude/agents/ with YAML frontmatter and markdown pr

cloudbase-agent-ts

tencentcloudbase/skills · Cloud

TypeScript SDK for deploying AI agents as HTTP services with AG-UI protocol support.
Supports three adapter patterns: LangGraph for stateful graph-based workflows, LangChain for chain-based agents, and custom adapters via AbstractAgent interface
Includes @cloudbase/agent-server for HTTP service deployment with built-in CORS, logging, and observability configuration
Provides UI client libraries for web applications (@ag-ui/client) and WeChat Mini Programs ( @cloudbase/agent-ui-minip

agent-evaluation

sickn33/antigravity-awesome-skills · Productivity

Framework for testing LLM agents across behavioral, capability, and reliability dimensions with production-focused evaluation patterns.
Covers five core evaluation areas: agent testing, benchmark design, capability assessment, reliability metrics, and regression testing
Emphasizes statistical test evaluation (multiple runs with distribution analysis) and behavioral contract testing over single-run or string-matching approaches
Includes adversarial testing patterns and guards against

agent-tool-builder

sickn33/antigravity-awesome-skills · Frontend

Design LLM-facing tool schemas that prevent hallucination, silent failures, and token waste.
Focuses on JSON Schema design, input examples, and error handling patterns that help LLMs use tools correctly
Emphasizes explicit documentation and clear descriptions over implementation details, since LLMs only see the schema
Identifies anti-patterns like vague descriptions, silent failures, and tool overload that cause agent failures
Covers function-calling, MCP tools, and tool validatio
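
To make the points about explicit descriptions and avoiding silent failures concrete, here is a minimal sketch of a tool definition with a JSON Schema style input and a validator that returns readable errors. Everything in it is illustrative: the `get_weather` tool, the `input_schema` key layout, and `validate_call` are hypothetical and not drawn from the skill, and the validator covers only required fields, unknown fields, and enums.

```python
GET_WEATHER_TOOL = {
    "name": "get_weather",
    "description": (
        "Return the current temperature for one city. "
        "Use only when the user names a specific city."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name, e.g. 'Lisbon'."},
            "unit": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
                "description": "Temperature unit for the result.",
            },
        },
        "required": ["city"],
        "additionalProperties": False,
    },
}


def validate_call(tool: dict, args: dict) -> list:
    """Return explicit error messages instead of failing silently."""
    schema = tool["input_schema"]
    errors = []
    for key in schema["required"]:
        if key not in args:
            errors.append(f"missing required field '{key}'")
    for key, value in args.items():
        spec = schema["properties"].get(key)
        if spec is None:
            errors.append(f"unknown field '{key}'")
        elif "enum" in spec and value not in spec["enum"]:
            errors.append(f"'{key}' must be one of {spec['enum']}")
    return errors


ok_errors = validate_call(GET_WEATHER_TOOL, {"city": "Lisbon"})
bad_errors = validate_call(GET_WEATHER_TOOL, {"unit": "kelvin"})
```

Feeding such error strings back to the model gives it something specific to correct on the next call.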
