The agent-cost-optimization-skill is a five-phase SKILL.md behavioral protocol for reducing LLM costs in production AI agents. It teaches agents how to audit token consumption by task type, implement model routing (send simple tasks to smaller, cheaper models), compress context windows, add semantic caching, and enforce per-task budget caps with runaway detection. It works with any LLM provider, including Anthropic Claude, OpenAI, Google Gemini, Mistral, and open-source models.
The protocol covers five phases of LLM cost management:
| Phase | What It Covers |
|---|---|
| Cost Baseline Audit | Per-task token breakdown (system prompt, context/history, tool outputs, user input, completion), cost-per-outcome measurement, waste identification |
| Model Routing Strategy | Task complexity taxonomy, model tier assignment (premium / standard / fast), routing logic implementation, quality gate validation |
| Context Window Optimization | Prompt compression techniques, retrieval-augmented generation to replace static context injection, conversation history pruning strategies |
| Caching and Batching | Semantic cache design (embedding similarity matching), prompt cache configuration, batch API patterns for non-real-time tasks |
| Budget Enforcement | Per-task cost caps, per-session budget tracking, runaway detection alerts, graceful degradation (downgrade model on budget breach), cost attribution dashboards |
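
The routing and budget-enforcement phases in the table can be illustrated together. The following is a minimal sketch, not the protocol's own implementation: the tier names come from the table, while the prices, the complexity-to-tier mapping, and the names `TaskBudget`, `route`, and `BudgetExceeded` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical prices in USD per 1K tokens; real prices vary by provider.
PRICE_PER_1K = {"premium": 0.015, "standard": 0.003, "fast": 0.0005}
# Tiers ordered from most to least expensive, so a downgrade moves right.
TIERS = ["premium", "standard", "fast"]
# Hypothetical task complexity taxonomy mapped to a starting tier.
COMPLEXITY_TO_TIER = {"high": "premium", "medium": "standard", "low": "fast"}


class BudgetExceeded(Exception):
    """Raised when even the cheapest eligible tier exceeds the remaining budget."""


def estimate_cost(tier: str, tokens: int) -> float:
    """Estimated cost in USD of sending `tokens` tokens to `tier`."""
    return PRICE_PER_1K[tier] * tokens / 1000


@dataclass
class TaskBudget:
    """Per-task cost cap with running spend."""
    cap_usd: float
    spent_usd: float = 0.0

    @property
    def remaining_usd(self) -> float:
        return self.cap_usd - self.spent_usd

    def record(self, tier: str, tokens: int) -> float:
        """Add the cost of a completed call to the running spend."""
        cost = estimate_cost(tier, tokens)
        self.spent_usd += cost
        return cost


def route(complexity: str, est_tokens: int, budget: TaskBudget) -> str:
    """Pick the tier for a task, downgrading if the budget would be breached."""
    start = TIERS.index(COMPLEXITY_TO_TIER[complexity])
    for tier in TIERS[start:]:
        if estimate_cost(tier, est_tokens) <= budget.remaining_usd:
            return tier
    raise BudgetExceeded(
        f"no tier fits {est_tokens} tokens in ${budget.remaining_usd:.4f}"
    )


if __name__ == "__main__":
    budget = TaskBudget(cap_usd=0.02)
    for _ in range(3):
        tier = route("high", 1000, budget)
        budget.record(tier, 1000)
        print(tier, round(budget.spent_usd, 4))
```

With these assumed prices, a high-complexity task starts on the premium tier and steps down to standard and then fast as the cap is approached. Raising `BudgetExceeded` once no tier fits is one possible way to surface a runaway task to the caller.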

- GET https://clawmerchants.com/v1/preview/agent-cost-optimization-skill: returns a protocol excerpt, no payment required
- GET https://clawmerchants.com/v1/data/agent-cost-optimization-skill → HTTP 402 with the USDC price
- X-PAYMENT: <base64 proof> → HTTP 200 with the full SKILL.md protocol

In short: GET https://clawmerchants.com/v1/data/agent-cost-optimization-skill (HTTP 402 → pay 0.03 USDC → receive SKILL.md). Payment uses x402 with USDC on Base L2.
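
The 402-then-retry flow above could be driven from an agent roughly as follows. This is a sketch under stated assumptions: the URL, the 402 and 200 status codes, and the `X-PAYMENT` header come from the listing, while the format of the 402 response body and the way a payment proof is produced are not described here. The `make_proof` callback is therefore a hypothetical placeholder that the caller must supply, and `fetch_skill` and `http_get` are illustrative names.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Tuple

DATA_URL = "https://clawmerchants.com/v1/data/agent-cost-optimization-skill"

# A transport takes (url, headers) and returns (status_code, body_bytes).
Transport = Callable[[str, Dict[str, str]], Tuple[int, bytes]]


def http_get(url: str, headers: Dict[str, str]) -> Tuple[int, bytes]:
    """Plain GET that returns the status code instead of raising on 4xx/5xx."""
    req = urllib.request.Request(url, headers=headers, method="GET")
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # urllib raises on 402, but 402 is an expected step in this flow.
        return err.code, err.read()


def fetch_skill(
    make_proof: Callable[[bytes], str],
    get: Transport = http_get,
    url: str = DATA_URL,
) -> bytes:
    """Request the skill, pay if asked to, and return the SKILL.md bytes.

    `make_proof` is hypothetical: it receives the raw 402 response body,
    settles the payment by whatever means the caller has, and returns the
    base64 proof string to send in the X-PAYMENT header.
    """
    status, body = get(url, {})
    if status == 200:
        return body
    if status != 402:
        raise RuntimeError(f"unexpected status {status} on first request")

    proof = make_proof(body)
    status, body = get(url, {"X-PAYMENT": proof})
    if status != 200:
        raise RuntimeError(f"payment not accepted, status {status}")
    return body
```

Passing the transport as a parameter keeps the flow testable without network access or real funds; a caller can substitute a stub for `get` while developing the payment step.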