Batch API Cost Calculator for Claude (2026)

Batch pricing on Claude is straightforward: divide the standard price by two. Opus 4.7 input goes from $5.00 to $2.50 per million tokens. Output goes from $25.00 to $12.50. But calculating the actual cost for a specific workload requires knowing your token distribution, and most teams overestimate input tokens while underestimating output tokens.

The Setup

You are planning to migrate three workloads to the Batch API and need precise cost projections before committing. Workload A is a summarization task (high input, low output). Workload B is content generation (moderate input, high output). Workload C is classification (low input, minimal output).

Each workload has different cost characteristics, and the model choice matters. Running workload B on Opus 4.7 batch costs $15.00 per 1,000 requests. Running it on Haiku 4.5 batch costs $1.50 per 1,000 requests. The calculator below handles all permutations.

The Math

Batch pricing per million tokens (all verified):

Model Batch Input Batch Output Standard Input Standard Output
Opus 4.7 $2.50 $12.50 $5.00 $25.00
Sonnet 4.6 $1.50 $7.50 $3.00 $15.00
Haiku 4.5 $0.50 $2.50 $1.00 $5.00
Opus 4.1 $7.50 $37.50 $15.00 $75.00

Three workload examples (1,000 requests each):

Workload A (summarization): 10K input + 500 output per request

  • Opus 4.7 batch: $25.00 + $6.25 = $31.25
  • Haiku 4.5 batch: $5.00 + $1.25 = $6.25

Workload B (content gen): 5K input + 3K output per request

  • Opus 4.7 batch: $12.50 + $37.50 = $50.00
  • Haiku 4.5 batch: $2.50 + $7.50 = $10.00

Workload C (classification): 2K input + 200 output per request

  • Opus 4.7 batch: $5.00 + $2.50 = $7.50
  • Haiku 4.5 batch: $1.00 + $0.50 = $1.50

The Technique

Here is a comprehensive cost calculator that handles any model, token count, and batch size:

from dataclasses import dataclass
@dataclass
class BatchCostResult:
    model: str
    requests: int
    total_input_tokens: int
    total_output_tokens: int
    batch_input_cost: float
    batch_output_cost: float
    batch_total: float
    standard_total: float
    savings: float
    savings_pct: float
    cost_per_request: float
    def __str__(self) -> str:
        return (
            f"Model: {self.model}\n"
            f"Requests: {self.requests:,}\n"
            f"Batch cost: ${self.batch_total:.2f} "
            f"(in: ${self.batch_input_cost:.2f}, "
            f"out: ${self.batch_output_cost:.2f})\n"
            f"Standard cost: ${self.standard_total:.2f}\n"
            f"Savings: ${self.savings:.2f} ({self.savings_pct:.1f}%)\n"
            f"Per request: ${self.cost_per_request:.4f}"
        )
BATCH_PRICES = {
    "opus-4.7": {"in": 2.50, "out": 12.50, "std_in": 5.00, "std_out": 25.00},
    "sonnet-4.6": {"in": 1.50, "out": 7.50, "std_in": 3.00, "std_out": 15.00},
    "haiku-4.5": {"in": 0.50, "out": 2.50, "std_in": 1.00, "std_out": 5.00},
    "opus-4.1": {"in": 7.50, "out": 37.50, "std_in": 15.00, "std_out": 75.00},
}
def calculate_batch_cost(
    model: str,
    num_requests: int,
    avg_input_tokens: int,
    avg_output_tokens: int
) -> BatchCostResult:
    """Calculate batch vs standard API cost."""
    if model not in BATCH_PRICES:
        raise ValueError(f"Unknown model: {model}. "
                        f"Options: {list(BATCH_PRICES.keys())}")
    p = BATCH_PRICES[model]
    total_in = num_requests * avg_input_tokens
    total_out = num_requests * avg_output_tokens
    batch_in_cost = total_in * p["in"] / 1e6
    batch_out_cost = total_out * p["out"] / 1e6
    batch_total = batch_in_cost + batch_out_cost
    std_in_cost = total_in * p["std_in"] / 1e6
    std_out_cost = total_out * p["std_out"] / 1e6
    std_total = std_in_cost + std_out_cost
    savings = std_total - batch_total
    return BatchCostResult(
        model=model,
        requests=num_requests,
        total_input_tokens=total_in,
        total_output_tokens=total_out,
        batch_input_cost=batch_in_cost,
        batch_output_cost=batch_out_cost,
        batch_total=batch_total,
        standard_total=std_total,
        savings=savings,
        savings_pct=(savings / std_total * 100) if std_total > 0 else 0,
        cost_per_request=batch_total / num_requests if num_requests > 0 else 0
    )
# Compare all models for the same workload
for model in BATCH_PRICES:
    result = calculate_batch_cost(model, 10000, 5000, 3000)
    print(result)
    print()

For a quick command-line calculation:

# Quick batch cost estimator
python3 -c "
import sys
models = {
    'opus-4.7': (2.50, 12.50),
    'sonnet-4.6': (1.50, 7.50),
    'haiku-4.5': (0.50, 2.50),
}
reqs = 10000
inp = 5000
out = 3000
print(f'Batch cost for {reqs:,} requests ({inp:,} in + {out:,} out tokens each):')
print(f'{\"\":-^60}')
for name, (pin, pout) in models.items():
    cost_in = reqs * inp * pin / 1e6
    cost_out = reqs * out * pout / 1e6
    total = cost_in + cost_out
    per_req = total / reqs
    print(f'{name:>12}: \${total:>10,.2f} total  (\${per_req:.4f}/request)')
"

Output:

Batch cost for 10,000 requests (5,000 in + 3,000 out tokens each):
------------------------------------------------------------
   opus-4.7: $    500.00 total  ($0.0500/request)
 sonnet-4.6: $    300.00 total  ($0.0300/request)
  haiku-4.5: $    100.00 total  ($0.0100/request)

To estimate tokens from your actual data before batch submission:

def estimate_tokens_from_text(text: str) -> int:
    """Rough token estimate: ~4 chars per token for English."""
    return len(text) // 4
# Sample your dataset to estimate average tokens
import json
import random
data = json.load(open("dataset.json"))
sample = random.sample(data, min(100, len(data)))
input_tokens = [estimate_tokens_from_text(item["prompt"]) for item in sample]
avg_input = sum(input_tokens) // len(input_tokens)
print(f"Estimated avg input tokens: {avg_input}")
print(f"Run calculator with avg_input_tokens={avg_input}")

The Tradeoffs

Cost calculators have inherent limitations:

  • Output token estimation is hard. You know your input tokens precisely, but output tokens depend on the model’s response length. Use max_tokens as an upper bound and actual historical data for better estimates.
  • Cache stacking changes the math. If you combine batch with prompt caching, the effective input rate drops to 5% of standard ($0.25/MTok on Opus 4.7 for cached batch reads). The calculator above does not account for caching – add a separate calculation layer.
  • Token counting varies by model. Opus 4.7 uses a new tokenizer that may produce up to 35% more tokens for the same text compared to older models. A prompt that costs X tokens on Sonnet may cost 1.35X on Opus.
  • No free tier. Unlike some competitors, Claude batch API requires API credits. There is no free batch processing tier.

Implementation Checklist

  1. Sample 100 representative requests from your workload
  2. Count actual input tokens using the Anthropic tokenizer or estimate at 4 chars/token
  3. Estimate output tokens from historical response lengths or max_tokens settings
  4. Run the calculator for all candidate models
  5. Compare batch cost against your current standard API spend
  6. Select the model that meets quality requirements at the lowest batch cost
  7. Start with a small test batch (100 requests) before committing to 100K

Measuring Impact

After running your first production batch:

  • Predicted vs actual cost: Compare calculator output against the batch invoice. Should match within 5% if token estimates are accurate.
  • Output token variance: Compare estimated output tokens against actual. Large variance means your estimator needs calibration.
  • Model quality check: If you downgraded from Opus to Sonnet for cost savings, sample 50 outputs and verify quality meets your threshold.
  • Adjust future calculator inputs based on actual token usage data from completed batches

Which model? → Take the 5-question quiz in our Model Selector.

Estimate tokens → Calculate your usage with our Token Estimator.

Try it: Estimate your monthly spend with our Cost Calculator.

See Also