
State Inspection Pattern (2026)

Last updated: April 22, 2026

The Pattern

The state inspection pattern consolidates system state into a single queryable endpoint or command that returns everything an agent needs to understand the current system status – database counts, queue depths, error rates, deployment version, and service health – in one call instead of 5-10 separate queries.

Why It Matters for Token Cost

Every tool call in Claude Code carries overhead: ~245 tokens for a Bash call, ~150 tokens for a Read call, plus the response content. When Claude Code needs to understand system state, the typical discovery pattern runs 5-10 commands:

Check API health: curl localhost:3000/health (~245 + ~200 = ~445 tokens)
Check database: psql -c "SELECT count(*) FROM users" (~245 + ~100 = ~345 tokens)
Check queue: curl localhost:3000/api/queue/stats (~245 + ~300 = ~545 tokens)
Check cache: redis-cli info memory (~245 + ~400 = ~645 tokens)
Check errors: tail -20 /var/log/app/error.log (~245 + ~600 = ~845 tokens)

Total: ~2,825 tokens in overhead and responses for 5 calls. With 10 calls, this reaches 4,000-8,000 tokens.

The state inspection pattern collapses this into one call: ~245 tokens overhead + ~800 tokens response = ~1,045 tokens. Savings: 1,780-6,955 tokens per state inspection sequence.

At Sonnet 4.6 rates ($3/$15 per MTok), saving 5,000 tokens per inspection across 100 monthly sessions removes 500,000 input tokens, or about $1.50/month. The real figure is higher, because tokens left in context are re-sent as input on every later turn.
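The arithmetic above can be reproduced with a short sketch. The per-call figures are the estimates quoted in this section, not measured values, and the function names are illustrative rather than part of any Claude Code API.

```typescript
// Estimated overhead of one Bash tool call, as quoted in this section.
const BASH_OVERHEAD = 245;

// Estimated response sizes for the five discovery commands listed above.
const responseTokens = [200, 100, 300, 400, 600];

// Cost of running each command as its own Bash call.
function separateCallsCost(responses: number[]): number {
  return responses.reduce((sum, r) => sum + BASH_OVERHEAD + r, 0);
}

// Cost of one consolidated call with a single, larger response.
function singleCallCost(response: number): number {
  return BASH_OVERHEAD + response;
}

// Dollar cost of a token count at a given price per million tokens.
function dollars(tokens: number, pricePerMTok: number): number {
  return (tokens / 1e6) * pricePerMTok;
}

const separate = separateCallsCost(responseTokens); // 2825
const single = singleCallCost(800); // 1045
const saved = separate - single; // 1780
const monthly = dollars(5000 * 100, 3); // 1.5

console.log({ separate, single, saved, monthly });
```

Changing `responseTokens` to match your own commands gives a quick estimate of whether a consolidated endpoint is worth building.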

The Anti-Pattern (What NOT to Do)

# Anti-pattern: Claude Code runs 7 separate commands to understand state
curl -s localhost:3000/health
psql -h localhost -d myapp -tc "SELECT count(*) FROM users"
psql -h localhost -d myapp -tc "SELECT count(*) FROM orders WHERE status='pending'"
psql -h localhost -d myapp -tc "SELECT count(*) FROM subscriptions WHERE status='active'"
redis-cli info memory | grep used_memory_human
curl -s localhost:3000/api/queue/stats
tail -5 /var/log/app/error.log

Token cost: 7 x ~245 overhead + variable responses = ~3,500-5,000 tokens.

Each command is also a separate conversational turn, meaning the accumulated context grows with each call. By the 7th call, Claude Code is re-sending all previous tool calls and responses as input.
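The compounding effect can be estimated with a deliberately simple model: each turn re-sends every earlier call and response as input. The sketch below assumes ~500 tokens per call (overhead plus response) and ignores prompt caching, so treat the result as an order-of-magnitude illustration only.

```typescript
// Simplified model: on every turn, all earlier tool calls and their
// responses are sent again as input. Prompt caching is ignored here.
function resentInputTokens(callSizes: number[]): number {
  let history = 0; // tokens accumulated from earlier calls
  let resent = 0; // tokens re-sent as input across all turns
  for (const size of callSizes) {
    resent += history; // this turn re-sends everything before it
    history += size;
  }
  return resent;
}

// Seven calls of ~500 tokens each, a round figure in line with the
// ~3,500-token estimate above.
const sevenCalls = resentInputTokens(new Array<number>(7).fill(500));
// A single consolidated call has no earlier calls to re-send.
const oneCall = resentInputTokens([645]);

console.log({ sevenCalls, oneCall });
```

Under these assumptions the seven-call sequence re-sends 10,500 tokens on top of the tokens in the calls themselves, while the single call re-sends none within the sequence.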

The Pattern in Action

Step 1: Build the Inspection Endpoint

// src/api/routes/inspect.ts
import { Router } from "express";
import { prisma } from "../db";
import { redis } from "../cache";
import { queue } from "../queue";
const router = Router();
router.get("/api/inspect", async (req, res) => {
  const TIMEOUT_MS = 5000; // Bounded: the whole inspection must finish within 5 s
  // Keep the timer handle so it can be cleared once the race settles
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeoutPromise = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("Inspection timeout")), TIMEOUT_MS);
  });
  try {
    // Parallel data fetch, raced against the timeout
    const [userCount, pendingOrders, activeSubscriptions, queueStats, cacheInfo] =
      await Promise.race([
        Promise.all([
          prisma.user.count(),
          prisma.order.count({ where: { status: "PENDING" } }),
          prisma.subscription.count({ where: { status: "ACTIVE" } }),
          queue.getJobCounts(),
          redis.info("memory"),
        ]),
        timeoutPromise,
      ]) as [number, number, number, object, string];
    // Extract the human-readable memory figure from the Redis INFO text
    const memoryMatch = cacheInfo.match(/used_memory_human:(\S+)/);
    res.json({
      status: "healthy",
      ts: new Date().toISOString(),
      db: { users: userCount, pending_orders: pendingOrders, active_subs: activeSubscriptions },
      queue: queueStats,
      cache: { memory: memoryMatch ? memoryMatch[1] : "unknown" },
      deploy: { version: process.env.APP_VERSION || "unknown", node: process.version },
    });
  } catch (error) {
    res.status(500).json({
      status: "degraded",
      error: error instanceof Error ? error.message : "Unknown error",
      ts: new Date().toISOString(),
    });
  } finally {
    clearTimeout(timer); // Do not leave the timeout pending after a fast response
  }
});
export default router;

Step 2: Create a CLI Wrapper

#!/bin/bash
# scripts/inspect.sh -- CLI wrapper for state inspection
set -uo pipefail
TIMEOUT=5
RESPONSE=$(curl -sf --max-time "$TIMEOUT" localhost:3000/api/inspect 2>&1)
EXIT_CODE=$?
if [ $EXIT_CODE -ne 0 ]; then
  echo "INSPECT_FAILED: API unreachable or returned an HTTP error (curl exit code: $EXIT_CODE)"
  exit 1
fi
echo "$RESPONSE"
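Any tooling that consumes the wrapper's output has to tell the two cases apart: a JSON document on success, or a line beginning with INSPECT_FAILED on failure. A minimal sketch of that check follows; the `parseInspectOutput` name and the result type are hypothetical helpers, not part of the script.

```typescript
// Result of interpreting the wrapper's stdout.
type InspectResult =
  | { ok: true; state: Record<string, unknown> }
  | { ok: false; reason: string };

// The wrapper prints either the endpoint's JSON or a line starting
// with "INSPECT_FAILED:". Anything else is treated as a failure too.
function parseInspectOutput(stdout: string): InspectResult {
  const text = stdout.trim();
  if (text.startsWith("INSPECT_FAILED:")) {
    return { ok: false, reason: text };
  }
  try {
    const parsed: unknown = JSON.parse(text);
    if (typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)) {
      return { ok: true, state: parsed as Record<string, unknown> };
    }
    return { ok: false, reason: "Output was not a JSON object" };
  } catch {
    return { ok: false, reason: "Output was not valid JSON" };
  }
}

// Only the prefix is inspected; the text after it is passed through as the reason.
const good = parseInspectOutput('{"status":"healthy","db":{"users":3}}');
const bad = parseInspectOutput("INSPECT_FAILED: ... (exit code: 7)");
console.log(good.ok, bad.ok);
```

Returning a tagged result instead of throwing keeps the failure path explicit, which matters when the caller decides whether to fall back to individual diagnostic commands.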

Step 3: Reference in CLAUDE.md

# CLAUDE.md
## System State
- Quick state check: `./scripts/inspect.sh`
- Returns: DB counts, queue stats, cache info, deploy version
- Use this FIRST before running individual diagnostic commands

Before and After

Metric	Anti-Pattern (7 calls)	Inspection Pattern (1 call)	Improvement
Tool calls	7	1	86% fewer
Token overhead	~1,715	~245	86% reduction
Response tokens	~1,800	~400	78% reduction
Total tokens	~3,515	~645	82% reduction
Conversational turns	7	1	86% fewer
Context accumulation	High (each call compounds)	Minimal	Significant

Step 4: Advanced – Filtered Inspection

For targeted inspections that return only relevant sections:

#!/bin/bash
# scripts/inspect.sh -- supports section filtering
set -uo pipefail
TIMEOUT=5
SECTION="${1:-all}"
RESPONSE=$(curl -sf --max-time "$TIMEOUT" "localhost:3000/api/inspect?section=$SECTION" 2>&1)
EXIT_CODE=$?
if [ $EXIT_CODE -ne 0 ]; then
  echo "INSPECT_FAILED: API unreachable or returned an HTTP error (curl exit code: $EXIT_CODE)"
  exit 1
fi
echo "$RESPONSE"

# Full inspection
./scripts/inspect.sh all
# Database only (smaller response, fewer tokens)
./scripts/inspect.sh db
# Queue only
./scripts/inspect.sh queue

Filtered inspection further reduces token cost: a database-only inspection returns ~100 tokens of JSON instead of ~400 tokens for the full response. When Claude Code is investigating a database-specific issue, the filtered version saves about 300 tokens per call.
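The wrapper passes a `section` query parameter, so the endpoint needs matching logic on the server side. One way to do it is a small pure function applied to the payload before it is sent. This is a sketch under assumptions: `filterSections`, the always-included keys, and the fallback to the full payload for unknown sections are illustrative choices, not the article's implementation.

```typescript
// Shape loosely mirrors the Step 1 response; extra keys are allowed.
type InspectPayload = {
  status: string;
  ts: string;
  [section: string]: unknown;
};

// Keys that are always returned so the caller can judge health and freshness.
const ALWAYS_INCLUDED = ["status", "ts"];

// Return the full payload for "all" or an unknown section, otherwise
// only the requested section plus the always-included keys.
function filterSections(payload: InspectPayload, section: string): Record<string, unknown> {
  const known = Object.prototype.hasOwnProperty.call(payload, section);
  if (section === "all" || !known || ALWAYS_INCLUDED.includes(section)) {
    return payload;
  }
  const filtered: Record<string, unknown> = {};
  for (const key of ALWAYS_INCLUDED) {
    filtered[key] = payload[key];
  }
  filtered[section] = payload[section];
  return filtered;
}

const full: InspectPayload = {
  status: "healthy",
  ts: "2026-04-22T00:00:00.000Z",
  db: { users: 14523 },
  queue: { waiting: 2 },
};
const dbOnly = filterSections(full, "db");
console.log(JSON.stringify(dbOnly));
```

In an Express handler this would be called as `res.json(filterSections(payload, String(req.query.section ?? "all")))`. Filtering after the fetch trims the response but not the backend work; skipping the unneeded queries entirely is a further optimization.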

When to Use This Pattern

System debugging: When Claude Code needs to understand overall system health before diagnosing a specific issue.
Deployment verification: After deploying, check that all services are healthy in one call.
Routine monitoring tasks: Daily health checks or automated reporting scripts.

Database-Specific State Inspection

For database-heavy applications, create a database-specific inspection endpoint:

// src/api/routes/inspect-db.ts
import { Router } from "express";
import { prisma } from "../db";
const router = Router();
// Row shapes for the raw queries below
type TableInfo = { table_name: string; column_count: number };
type CountRow = { count: bigint };
type MigrationRow = { migration_name: string; finished_at: Date | null };
router.get("/api/inspect/db", async (req, res) => {
  const MAX_TABLES = 30;
  try {
    // count(*) is cast to int: Postgres returns bigint, which res.json cannot serialize
    const tables = await prisma.$queryRaw<TableInfo[]>`
      SELECT table_name,
             (SELECT count(*)::int FROM information_schema.columns c
              WHERE c.table_name = t.table_name AND c.table_schema = 'public') as column_count
      FROM information_schema.tables t
      WHERE table_schema = 'public'
      ORDER BY table_name
      LIMIT ${MAX_TABLES}`;
    // Get row counts for a fixed list of key tables (bounded: 5 entries)
    const rowCounts: Record<string, number> = {};
    const keyTables = ["users", "orders", "subscriptions", "invoices", "sessions"];
    for (const table of keyTables) {
      try {
        // Interpolation is safe only because the names come from the hardcoded list above
        const result = await prisma.$queryRawUnsafe<CountRow[]>(
          `SELECT count(*) as count FROM "${table}"`
        );
        rowCounts[table] = Number(result[0].count);
      } catch {
        rowCounts[table] = -1; // Table may not exist
      }
    }
    // Get recent migration status
    const migrations = await prisma.$queryRaw<MigrationRow[]>`
      SELECT migration_name, finished_at
      FROM _prisma_migrations
      ORDER BY finished_at DESC
      LIMIT 5`;
    res.json({
      tables: tables.length,
      tableList: tables,
      rowCounts,
      recentMigrations: migrations,
      schemaVersion: migrations[0]?.migration_name || "unknown",
    });
  } catch (error) {
    res.status(500).json({
      error: error instanceof Error ? error.message : "DB inspection failed",
    });
  }
});
export default router;

This single endpoint replaces: \dt (list tables), \d <table> (describe tables x5), SELECT count(*) (row counts x5), and migration status queries. Total replacement: 10-12 individual queries, saving 3,000-6,000 tokens.

Caching Inspection Results

For expensive inspection queries, cache the results with a short TTL:

#!/bin/bash
# scripts/inspect-cached.sh
# Cache inspection results for 60 seconds
set -uo pipefail
CACHE_FILE="/tmp/claude-inspect-cache.json"
CACHE_TTL=60  # seconds
# Check cache freshness
if [ -f "$CACHE_FILE" ]; then
  CACHE_AGE=$(( $(date +%s) - $(stat -f%m "$CACHE_FILE" 2>/dev/null || stat -c%Y "$CACHE_FILE") ))
  if [ "$CACHE_AGE" -lt "$CACHE_TTL" ]; then
    cat "$CACHE_FILE"
    exit 0
  fi
fi
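# Cache miss: fetch into a temp file so a failed request never overwrites the cache
TMP_FILE=$(mktemp)
if curl -sf --max-time 5 localhost:3000/api/inspect > "$TMP_FILE" 2>/dev/null; then
  mv "$TMP_FILE" "$CACHE_FILE"
  cat "$CACHE_FILE"
else
  rm -f "$TMP_FILE"
  echo "INSPECT_FAILED: request failed and no fresh cache is available"
  exit 1
fi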

If Claude Code calls the inspection script multiple times in a session (common during debugging), the cache prevents redundant API calls. A cached call still costs the ~245 tokens of Bash overhead plus the response, so the saving here is backend load and round-trip latency rather than tokens.
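The same short-TTL idea can also live inside the endpoint, so that repeated inspections do not re-run the underlying queries. The sketch below is a generic in-process wrapper, not code from the article; `cacheWithTtl` and its injectable clock are assumptions made so the behaviour can be checked without waiting for real time to pass.

```typescript
// Wrap an async loader so repeated calls within ttlMs reuse the last result.
// `now` is injectable so the behaviour can be tested without real delays.
function cacheWithTtl<T>(
  load: () => Promise<T>,
  ttlMs: number,
  now: () => number = Date.now
): () => Promise<T> {
  let cached: { value: T; fetchedAt: number } | undefined;
  return async () => {
    const t = now();
    if (cached !== undefined && t - cached.fetchedAt < ttlMs) {
      return cached.value; // still fresh
    }
    const value = await load(); // a failed load throws and leaves the old entry untouched
    cached = { value, fetchedAt: t };
    return value;
  };
}

// Demonstration with a fake clock and a loader that counts its invocations.
async function demo(): Promise<number[]> {
  let clock = 0;
  let loads = 0;
  const inspect = cacheWithTtl(async () => ++loads, 60000, () => clock);
  const first = await inspect(); // cache miss: loads
  clock = 30000;
  const second = await inspect(); // within TTL: cached
  clock = 61000;
  const third = await inspect(); // TTL expired: loads again
  return [first, second, third];
}

demo().then((result) => console.log(result));
```

Two concurrent callers on a cold cache will both trigger a load in this version; sharing the in-flight promise would avoid that at the cost of a little more code.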

When NOT to Use This Pattern

Single-metric checks: If Claude Code only needs one piece of information (e.g., user count), a direct query is simpler and avoids loading unnecessary data.
High-security environments: The inspection endpoint exposes internal state. Restrict access with authentication and do not deploy it to public-facing routes.

Implementation in CLAUDE.md

# CLAUDE.md -- State Inspection Rule
## Diagnostics Protocol
1. ALWAYS run `./scripts/inspect.sh` FIRST when investigating system issues
2. Parse the JSON response before running additional queries
3. Only run individual diagnostic commands if the inspection output is insufficient
4. Never run more than 3 individual diagnostic commands without compacting

This rule guides Claude Code toward the efficient pattern and prevents the 7-command anti-pattern from recurring.

Inspection Pattern for Microservices

In microservice architectures, each service has its own state. A gateway-level inspection endpoint aggregates state from all services:

// src/api/routes/inspect-all.ts
import { Router } from "express";
const router = Router();
router.get("/api/inspect/all", async (req, res) => {
  const TIMEOUT_MS = 3000;
  const services = [
    { name: "auth", url: "http://auth-service:3001/inspect" },
    { name: "billing", url: "http://billing-service:3002/inspect" },
    { name: "notifications", url: "http://notify-service:3003/inspect" },
  ];
  const results: Record<string, unknown> = {};
  await Promise.all(
    services.map(async (svc) => {
      // Abort the request, including the body read, if it exceeds the timeout
      const controller = new AbortController();
      const timeout = setTimeout(() => controller.abort(), TIMEOUT_MS);
      try {
        // Named `response` to avoid shadowing the Express `res` above
        const response = await fetch(svc.url, { signal: controller.signal });
        results[svc.name] = await response.json();
      } catch {
        results[svc.name] = { status: "unreachable" };
      } finally {
        clearTimeout(timeout); // Clear the timer on success and on failure
      }
    })
  );
  res.json({ services: results, ts: new Date().toISOString() });
});
export default router;

Without the aggregated endpoint, Claude Code would run 3 separate curl commands (3 x ~445 = ~1,335 tokens of overhead and responses). With the gateway endpoint, one call pays the ~245-token Bash overhead once and returns all three services’ state in a single response. For architectures with 5+ services, the savings scale linearly.

Savings: at least (N-1) x ~245 tokens of call overhead per multi-service inspection, where N is the number of services, and more if the gateway trims each service’s payload.

Versioning Inspection Responses

As the backend evolves, the inspection endpoint should return version information so Claude Code can detect schema changes:

{
  "inspectVersion": "3",
  "status": "healthy",
  "db": { "users": 14523, "pending_orders": 47 },
  "changedSince": {
    "v2": "Added queue.failed_24h field",
    "v3": "Added deploy.git_sha field"
  }
}

The version field allows CLAUDE.md rules to specify expectations: “Inspection endpoint v3 returns deploy.git_sha. If missing, the backend may be outdated.” This prevents Claude Code from running additional diagnostic commands when the inspection response format changes.
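A version expectation like the one above can also be checked mechanically. The sketch below derives the required fields from the `changedSince` notes in the example and assumes versions are cumulative (v3 includes everything added in v2). `REQUIRED_BY_VERSION`, `hasPath`, and `missingFields` are hypothetical helpers.

```typescript
// Look up a dotted path such as "deploy.git_sha" in a parsed JSON object.
function hasPath(obj: unknown, path: string): boolean {
  let current: unknown = obj;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || current === null || !(key in current)) {
      return false;
    }
    current = (current as Record<string, unknown>)[key];
  }
  return true;
}

// Fields each inspection version is expected to carry (hypothetical map,
// built from the changedSince notes in the example response).
const REQUIRED_BY_VERSION: Record<string, string[]> = {
  "2": ["queue.failed_24h"],
  "3": ["queue.failed_24h", "deploy.git_sha"],
};

type VersionedResponse = { inspectVersion?: string; [key: string]: unknown };

// Return the expected fields that are missing from a response.
function missingFields(response: VersionedResponse): string[] {
  const required = REQUIRED_BY_VERSION[response.inspectVersion ?? ""] ?? [];
  return required.filter((path) => !hasPath(response, path));
}

// A response that claims v3 but lacks the deploy section.
const outdated: VersionedResponse = {
  inspectVersion: "3",
  status: "healthy",
  queue: { failed_24h: 0 },
};
console.log(missingFields(outdated)); // ["deploy.git_sha"]
```

An empty result means the response matches what its declared version promises; a non-empty one is a signal that the backend may be outdated.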

