← ClawMerchants | Agent Skills | Observability | Full Catalog

Agent Resilience Skill — Circuit Breaker & Fault Tolerance Protocol for AI Agents

Name: Agent Resilience Skill — Circuit Breaker & Fault Tolerance Protocol for AI Agents — SKILL.md
Brand: ClawMerchants
Price: 0.05 USDC
Availability: InStock

$0.05 / access SKILL.md protocol

The agent-resilience-skill is a 7-phase SKILL.md behavioral protocol for building fault-tolerant agentic workflows. It teaches agents how to implement circuit breakers that stop cascading failures, configure exponential backoff retry logic with jitter to prevent retry storms, select the right fallback strategy when dependencies fail, track error budgets against SLOs, propagate deadlines across tool call chains, apply graceful degradation patterns, and validate resilience with chaos testing. Compatible with Claude, OpenAI Codex CLI, Cursor, AWS Kiro, AutoGen, CrewAI, and LangGraph.

Why agent resilience is non-negotiable in production: An agent that retries a failing tool call without circuit protection can burn an entire token budget in seconds. A single unhandled timeout in a 10-tool chain can leave the agent in a partial state with irreversible side effects. This protocol gives agents the same resilience primitives that production SRE teams use — applied to LLM tool call graphs.

What's Covered (7 Phases)

Phase	Topic	Key Output
1	Circuit Breaker Implementation	CLOSED/OPEN/HALF_OPEN state machine with failure threshold and recovery probe logic
2	Exponential Backoff Retry Logic	Retry decision tree, idempotency check, jitter formula, total budget cap
3	Fallback Strategy Selection	Hierarchy: Primary → Stale Cache → Degraded Response → Static Default → Fail Gracefully
4	Error Budget Tracking	SLO definition, rolling window error budget tracker, alert thresholds (warn/critical/exhausted)
5	Timeout and Deadline Management	Timeout hierarchy with deadline propagation across orchestrator and sub-tool calls
6	Graceful Degradation Patterns	Degradation level matrix (NORMAL/DEGRADED/CRITICAL/MAINTENANCE) with feature permission map
7	Chaos Testing for Tool Call Chains	Chaos test harness with failure injection scenarios and acceptance criteria checklist

Target Keywords Covered

Circuit breaker for AI agents — CLOSED/OPEN/HALF_OPEN state machine with failure threshold, success threshold, and half-open probe logic
Fault tolerance agent workflow SKILL.md — structured behavioral protocol agents load at runtime via npx skills install
Agent retry protocol — exponential backoff with jitter, idempotency-safe retry classification, total budget cap to prevent infinite retry loops
Resilient agent design — fallback hierarchy, graceful degradation state machine, chaos acceptance criteria checklist
Agent resilience skill — 7-phase protocol covering the full resilience stack from detection to recovery to validation

Code Sample — Circuit Breaker Pattern

// From agent-resilience-skill Phase 1
// Circuit breaker prevents retry storms on failing tool calls

type CircuitState = 'CLOSED' | 'OPEN' | 'HALF_OPEN';

class CircuitBreaker {
  private state: CircuitState = 'CLOSED';
  private failureCount = 0;

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === 'OPEN') {
      const elapsed = Date.now() - this.lastFailureTime;
      if (elapsed < this.config.timeout) {
        throw new Error(`Circuit OPEN — retry after ${this.config.timeout - elapsed}ms`);
      }
      this.state = 'HALF_OPEN'; // Probe to test recovery
    }
    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure(); // Opens circuit after threshold
      throw error;
    }
  }
}

// One circuit per tool call boundary
const priceFeedBreaker = new CircuitBreaker('price-feed', {
  failureThreshold: 3,   // Open after 3 failures
  successThreshold: 1,   // Close after 1 successful probe
  timeout: 15_000,       // Probe after 15 seconds
});

Use Cases

Multi-tool orchestration agents — wrap every external tool call in a circuit breaker; one flapping API cannot cascade failure to the entire agent
DeFi and payment agents — apply idempotency-safe retry logic before payment confirmation; never double-send due to network jitter
Data ingestion pipelines — fallback from live API to stale cache to static default; agents continue partial operation when upstream is degraded
Long-running autonomous agents — propagate deadlines across tool call chains; inner tool timeouts enforce the outer request SLA
Production agent deployments — track error budgets per tool boundary; trigger load-shedding or degradation mode before SLO breach
Pre-deployment validation — run chaos test harness against all tool call paths; verify graceful failure before agent goes live

How to Install

One-line install via npx skills:

npx skills install agent-resilience-skill

This fetches the SKILL.md protocol and makes it available to your agent runtime. Compatible with Claude Code, OpenAI Codex CLI, Cursor, and any SKILL.md-aware agent framework.

How to Access via x402

Free preview: GET https://clawmerchants.com/v1/preview/agent-resilience-skill — inspect protocol structure before paying
Probe: GET https://clawmerchants.com/v1/data/agent-resilience-skill → HTTP 402 with USDC payment details
Pay: Send $0.05 USDC on Base L2 (chain ID 8453) to the provider address in the 402 response
Receive: Resend with X-PAYMENT: <base64 proof> → HTTP 200 with full 7-phase resilience protocol

# Step 1: Probe — discover payment requirements
curl -i https://clawmerchants.com/v1/data/agent-resilience-skill
# HTTP/1.1 402 Payment Required
# X-Payment-Required: {"amount":"50000","currency":"USDC","chain":8453,...}

# Step 2: Pay + resend with proof
curl -H "X-PAYMENT: <base64-payment-proof>"      https://clawmerchants.com/v1/data/agent-resilience-skill
# HTTP/1.1 200 OK — returns full agent-resilience-skill SKILL.md protocol

Pricing

$0.05 USDC per access — no subscription, no API key, no account required. Purchase once, apply the protocol to every agent you build. The circuit breaker pattern alone prevents retry storms that can burn $5–50 in API costs in a single runaway agent session — a 100-1000× return on the protocol cost.

Install via npx: npx skills install agent-resilience-skill
Free preview: GET /v1/preview/agent-resilience-skill
Probe the endpoint: GET https://clawmerchants.com/v1/data/agent-resilience-skill
Browse all agent skills: Agent Skills Marketplace →