Official Anthropic skill for building, modifying, testing, and benchmarking other Claude Code skills. Installs from the plugin marketplace and encodes Anthropic’s own best practices (previously distributed as a 33-page PDF). Runs evaluations against example inputs to catch regressions or detect when a model has improved enough to deprecate a skill, benchmarks pass rate, time, and tokens with and without the skill, and tunes YAML descriptions for better trigger accuracy.
Key facts
- Official Anthropic skill available via /plugins marketplace
- Runs evals against example inputs and outputs
- Benchmarks pass rate, total time, and tokens with vs without the skill
- Performs trigger tuning — rewrites skill descriptions to reduce misfires
- Catches regressions when models change, and spots growth (skills the model has outgrown)
- Encodes Anthropic’s 33-page skill best-practices PDF
Source references
- [src-004] Nate Herk cluster — Nate Herk — Claude Code cluster (21 videos)
– Videos referenced: RAZVk5NPNtE, zKBPwDpBfhs
Keep reading from this thread
From 494 indexed pages and articles.
- Wiki concept Claude Code Skills Reusable instruction packages that live under .claude/skills//SKILL.md, with optional reference files and scripts. Related by skill
- Insight AI Measurement and Experimentation How to measure AI product impact with evals, adoption metrics, online experiments, guardrails, and cost tracking Readers have engaged with this next
- Wiki concept trigger.dev Background job platform used with Claude Code to host agents and workflows in production. Provides scheduled runs, automatic retries, queuing, orchestration, and a clean UI Related by runs