Claude API Cost Dashboard Setup Guide (2026)

A team running Claude Opus 4.7 for all requests discovered at month’s end they’d spent $5,000 – but 60% of their requests were simple classifications that Haiku 4.5 could handle at $1.00/MTok instead of $5.00/MTok. After building a cost dashboard and rerouting those requests, they cut spending to $2,600/month. The dashboard paid for itself in the first week. Without visibility into per-model, per-request costs, you’re flying blind. For a deeper dive, see Claude Batch Processing 100K Requests Guide.

The Setup

Every Claude API response includes a usage object containing input_tokens, output_tokens, cache_read_input_tokens, cache_creation_input_tokens, and server_tool_use metrics. These fields provide everything needed to calculate exact per-request costs. The problem is that most teams never aggregate this data. They check the Claude Console monthly billing page and react after the fact. A real-time dashboard transforms reactive cost management into proactive optimization by showing spend patterns as they develop. For a deeper dive, see Claude Tool Use Hidden Token Costs Explained.

The Math

A team processing 50,000 requests/day across three use cases:

Before dashboard (all Opus 4.7):

  • Classification: 30,000 req x 2K input x $5/MTok + 30K x 200 output x $25/MTok = $300 + $150 = $450/day
  • Summarization: 15,000 req x 10K input x $5/MTok + 15K x 2K output x $25/MTok = $750 + $750 = $1,500/day
  • Code generation: 5,000 req x 20K input x $5/MTok + 5K x 5K output x $25/MTok = $500 + $625 = $1,125/day
  • Daily total: $3,075 ($92,250/month)

After dashboard reveals usage patterns:

  • Classification on Haiku 4.5 ($1/$5): $60 + $30 = $90/day
  • Summarization on Sonnet 4.6 ($3/$15): $450 + $450 = $900/day
  • Code generation stays on Opus 4.7: $1,125/day
  • Daily total: $2,115 ($63,450/month)

Monthly savings: $28,800 (31%) – driven entirely by visibility

The Technique

Build a cost tracking middleware that logs every API response to a database.

import anthropic
import sqlite3
import time
from datetime import datetime
from functools import wraps
# Verified pricing (2026-04-19)
PRICING = {
    "claude-opus-4-7": {"input": 5.00, "output": 25.00,
                         "cache_read": 0.50, "cache_write": 6.25},
    "claude-sonnet-4-6": {"input": 3.00, "output": 15.00,
                           "cache_read": 0.30, "cache_write": 3.75},
    "claude-haiku-4-5": {"input": 1.00, "output": 5.00,
                          "cache_read": 0.10, "cache_write": 1.25},
}
def init_db(db_path: str = "claude_costs.db") -> sqlite3.Connection:
    conn = sqlite3.connect(db_path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS api_costs (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            timestamp TEXT NOT NULL,
            model TEXT NOT NULL,
            project TEXT DEFAULT 'default',
            input_tokens INTEGER,
            output_tokens INTEGER,
            cache_read_tokens INTEGER DEFAULT 0,
            cache_write_tokens INTEGER DEFAULT 0,
            web_searches INTEGER DEFAULT 0,
            input_cost REAL,
            output_cost REAL,
            cache_cost REAL,
            search_cost REAL,
            total_cost REAL,
            latency_ms INTEGER
        )
    """)
    conn.execute("""
        CREATE INDEX IF NOT EXISTS idx_timestamp
        ON api_costs(timestamp)
    """)
    conn.execute("""
        CREATE INDEX IF NOT EXISTS idx_model
        ON api_costs(model)
    """)
    conn.commit()
    return conn
def calculate_cost(model: str, usage) -> dict:
    """Calculate cost breakdown from usage object."""
    prices = PRICING.get(model, PRICING["claude-sonnet-4-6"])
    input_cost = usage.input_tokens * prices["input"] / 1_000_000
    output_cost = usage.output_tokens * prices["output"] / 1_000_000
    cache_read = getattr(usage, "cache_read_input_tokens", 0) or 0
    cache_write = getattr(usage, "cache_creation_input_tokens", 0) or 0
    cache_cost = (
        cache_read * prices["cache_read"] / 1_000_000
        + cache_write * prices["cache_write"] / 1_000_000
    )
    server_tools = getattr(usage, "server_tool_use", None)
    web_searches = 0
    if server_tools and hasattr(server_tools, "web_search_requests"):
        web_searches = server_tools.web_search_requests or 0
    search_cost = web_searches * 0.01  # $10 per 1,000 searches

    return {
        "input_cost": round(input_cost, 6),
        "output_cost": round(output_cost, 6),
        "cache_cost": round(cache_cost, 6),
        "search_cost": round(search_cost, 4),
        "total_cost": round(input_cost + output_cost + cache_cost + search_cost, 6),
        "web_searches": web_searches,
        "cache_read_tokens": cache_read,
        "cache_write_tokens": cache_write,
    }
class TrackedClient:
    """Wrapper that logs every API call's cost."""
    def __init__(self, project: str = "default",
                 db_path: str = "claude_costs.db"):
        self.client = anthropic.Anthropic()
        self.project = project
        self.conn = init_db(db_path)
    def create(self, **kwargs) -> anthropic.types.Message:
        start = time.time()
        response = self.client.messages.create(**kwargs)
        latency = int((time.time() - start) * 1000)
        model = kwargs.get("model", "unknown")
        costs = calculate_cost(model, response.usage)
        self.conn.execute("""
            INSERT INTO api_costs
            (timestamp, model, project, input_tokens, output_tokens,
             cache_read_tokens, cache_write_tokens, web_searches,
             input_cost, output_cost, cache_cost, search_cost,
             total_cost, latency_ms)
            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
        """, (
            datetime.utcnow().isoformat(),
            model, self.project,
            response.usage.input_tokens,
            response.usage.output_tokens,
            costs["cache_read_tokens"],
            costs["cache_write_tokens"],
            costs["web_searches"],
            costs["input_cost"], costs["output_cost"],
            costs["cache_cost"], costs["search_cost"],
            costs["total_cost"], latency
        ))
        self.conn.commit()
        return response
# Usage
tracker = TrackedClient(project="customer-support")
response = tracker.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Classify this ticket"}]
)

Query the dashboard for daily summaries:

-- Daily cost by model
SELECT
    DATE(timestamp) as day,
    model,
    COUNT(*) as requests,
    SUM(total_cost) as total_spend,
    AVG(total_cost) as avg_cost_per_request
FROM api_costs
GROUP BY DATE(timestamp), model
ORDER BY day DESC, total_spend DESC;

The Tradeoffs

Logging every API call adds a database write per request, which introduces latency (typically 1-5ms for SQLite, more for remote databases). For high-throughput systems, buffer logs in memory and flush periodically rather than writing synchronously. The cost calculation relies on hardcoded pricing – you’ll need to update the PRICING dictionary when Anthropic adjusts rates. SQLite works for single-process applications but needs replacement with PostgreSQL or similar for multi-process production deployments.

Implementation Checklist

  • Create the cost tracking database with the schema above
  • Wrap your Anthropic client with the TrackedClient class
  • Tag each request with a project identifier for attribution
  • Build summary queries for daily, weekly, and monthly views
  • Set up a simple web interface (Flask/FastAPI) or connect to Grafana
  • Configure alerts for daily spend thresholds (e.g., alert at $100/day)

Measuring Impact

The dashboard itself doesn’t save money – the insights it surfaces do. Track two meta-metrics: time-to-insight (how quickly you spot an anomaly) and cost-per-insight-action (how much each dashboard-driven optimization saves). Most teams find their first actionable insight within 48 hours of deploying a dashboard, typically a model routing optimization worth $500-$2,000/month. We cover this further in 5-Minute vs 1-Hour Cache: Which Saves More.

Which model? → Take the 5-question quiz in our Model Selector.

Try it: Estimate your monthly spend with our Cost Calculator.

See Also