Anthropic Message Batches API Guide (2026)

The Problem

You need to process hundreds or thousands of Claude API requests, but sending them one at a time is slow, costs full price, and runs into rate limits. Your use case does not need real-time responses.

Quick Fix

Use the Message Batches API to submit up to 100,000 requests in a single batch at 50% of standard API pricing:

import anthropic
from anthropic.types.message_create_params import MessageCreateParamsNonStreaming
from anthropic.types.messages.batch_create_params import Request
client = anthropic.Anthropic()
message_batch = client.messages.batches.create(
    requests=[
        Request(
            custom_id="request-1",
            params=MessageCreateParamsNonStreaming(
                model="claude-sonnet-4-6",
                max_tokens=1024,
                messages=[{"role": "user", "content": "Summarize this document..."}],
            ),
        ),
    ]
)
print(message_batch.id)

What’s Happening

The Message Batches API processes requests asynchronously. When you submit a batch, Anthropic queues all requests and processes them in parallel, and you collect the results once processing ends instead of waiting on each response. Most batches complete within 1 hour. Each request in the batch is handled independently, so one failure does not affect the others.

The key advantage is cost: all batch usage is charged at 50% of standard API prices. For Claude Sonnet 4.6, that means $1.50 per million input tokens and $7.50 per million output tokens instead of $3 and $15 respectively.
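
To make the arithmetic concrete, here is a minimal sketch that applies the 50% discount to the Claude Sonnet 4.6 prices quoted above. The helper name estimate_cost and the workload figures (10,000 requests at roughly 2,000 input and 500 output tokens each) are illustrative assumptions, not measurements.

```python
# Standard Claude Sonnet 4.6 prices in USD per million tokens, as quoted above.
STANDARD_INPUT_PER_MTOK = 3.00
STANDARD_OUTPUT_PER_MTOK = 15.00
BATCH_DISCOUNT = 0.5  # batch usage is billed at 50% of standard prices


def estimate_cost(input_tokens: int, output_tokens: int, batch: bool = True) -> float:
    """Return the estimated cost in USD for the given token counts."""
    cost = (
        input_tokens / 1_000_000 * STANDARD_INPUT_PER_MTOK
        + output_tokens / 1_000_000 * STANDARD_OUTPUT_PER_MTOK
    )
    return cost * BATCH_DISCOUNT if batch else cost


# Hypothetical workload: 10,000 requests, ~2,000 input and ~500 output tokens each.
n_requests = 10_000
standard = estimate_cost(n_requests * 2_000, n_requests * 500, batch=False)
batched = estimate_cost(n_requests * 2_000, n_requests * 500, batch=True)
print(f"Standard: ${standard:.2f}, batch: ${batched:.2f}")
# prints: Standard: $135.00, batch: $67.50
```

The estimate ignores prompt caching and any other pricing modifiers, so treat it as a rough upper bound for planning.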

Step-by-Step Fix

Step 1: Prepare your batch requests

Each request needs a unique custom_id (1-64 alphanumeric characters, hyphens, and underscores) and a params object with standard Messages API parameters:

import anthropic
from anthropic.types.message_create_params import MessageCreateParamsNonStreaming
from anthropic.types.messages.batch_create_params import Request
client = anthropic.Anthropic()
# documents is assumed to be a list of strings you have already loaded
requests = []
for i, doc in enumerate(documents):
    requests.append(
        Request(
            custom_id=f"doc-{i}",
            params=MessageCreateParamsNonStreaming(
                model="claude-sonnet-4-6",
                max_tokens=1024,
                messages=[
                    {"role": "user", "content": f"Summarize: {doc}"}
                ],
            ),
        )
    )
message_batch = client.messages.batches.create(requests=requests)
print(f"Batch ID: {message_batch.id}")
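
The custom_id rule from Step 1 can be checked locally before a batch is submitted. The sketch below is one way to do that, assuming the rule is exactly as stated (1-64 letters, digits, hyphens, and underscores, unique within the batch). The function name check_custom_ids is hypothetical and not part of the SDK.

```python
import re

# Pattern for the rule stated in Step 1: 1-64 characters drawn from
# letters, digits, hyphens, and underscores.
CUSTOM_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def check_custom_ids(custom_ids: list[str]) -> list[str]:
    """Return a list of problems found; an empty list means every ID passes."""
    problems = []
    seen = set()
    for cid in custom_ids:
        if not CUSTOM_ID_PATTERN.fullmatch(cid):
            problems.append(f"invalid custom_id: {cid!r}")
        if cid in seen:
            problems.append(f"duplicate custom_id: {cid!r}")
        seen.add(cid)
    return problems


print(check_custom_ids(["doc-0", "doc-1", "doc 2", "doc-1"]))
# prints: ["invalid custom_id: 'doc 2'", "duplicate custom_id: 'doc-1'"]
```

Running a check like this first means a bad ID is caught on your machine rather than by a rejected submission.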

Step 2: Poll for completion

Check the batch status until processing finishes:

import time
while True:
    batch = client.messages.batches.retrieve(message_batch.id)
    if batch.processing_status == "ended":
        break
    print(f"Status: {batch.processing_status} - "
          f"{batch.request_counts.succeeded} succeeded, "
          f"{batch.request_counts.processing} processing")
    time.sleep(30)
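
For long-running batches, a fixed 30-second loop with no exit condition can be wrapped in a helper that backs off and eventually gives up. The following is a sketch under stated assumptions: wait_for_batch and its parameters are hypothetical names, and the retrieve callable is expected to return an object with a processing_status attribute, as the loop above does.

```python
import time
from typing import Any, Callable


def wait_for_batch(
    retrieve: Callable[[], Any],
    initial_delay: float = 30.0,
    max_delay: float = 300.0,
    timeout: float = 24 * 60 * 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Poll retrieve() until processing_status is "ended" or the timeout passes.

    The delay between polls doubles each time, capped at max_delay.
    """
    waited = 0.0
    delay = initial_delay
    while True:
        batch = retrieve()
        if batch.processing_status == "ended":
            return batch
        if waited >= timeout:
            raise TimeoutError(
                f"batch still {batch.processing_status} after {waited:.0f}s"
            )
        sleep(delay)
        waited += delay
        delay = min(delay * 2, max_delay)
```

With the client from Step 1, a call might look like wait_for_batch(lambda: client.messages.batches.retrieve(message_batch.id)). Passing sleep as a parameter keeps the helper easy to test without real waiting.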

Step 3: Retrieve results

Stream results for the completed batch:

for result in client.messages.batches.results(message_batch.id):
    if result.result.type == "succeeded":
        print(f"{result.custom_id}: {result.result.message.content[0].text}")
    elif result.result.type == "errored":
        print(f"{result.custom_id}: Error - {result.result.error}")
    else:
        # "canceled" or "expired": the request was not run
        print(f"{result.custom_id}: {result.result.type}")

Step 4: Handle errors and expiration

Individual requests can fail without affecting the batch. Batches expire if processing does not complete within 24 hours. Results are available for 29 days after creation.

batch = client.messages.batches.retrieve(message_batch.id)
counts = batch.request_counts
print(f"Succeeded: {counts.succeeded}")
print(f"Errored: {counts.errored}")
print(f"Expired: {counts.expired}")

Step 5: Use with prompt caching for better performance

Since batches can take time to process, use the 1-hour cache duration for shared context:

requests = []
for i, doc in enumerate(documents):
    requests.append(
        Request(
            custom_id=f"doc-{i}",
            params=MessageCreateParamsNonStreaming(
                model="claude-sonnet-4-6",
                max_tokens=1024,
                system=[{
                    "type": "text",
                    "text": shared_system_prompt,
                    # "ttl": "1h" selects the 1-hour cache; without it the default shorter TTL applies
                    "cache_control": {"type": "ephemeral", "ttl": "1h"}
                }],
                messages=[
                    {"role": "user", "content": f"Analyze: {doc}"}
                ],
            ),
        )
    )

Batch limits

  • Maximum 100,000 requests or 256 MB per batch, whichever comes first
  • Batches expire after 24 hours if not complete
  • Results available for 29 days after creation
  • All active Claude models are supported

Pricing reference

Model               Batch Input    Batch Output
Claude Opus 4.6     $2.50/MTok     $12.50/MTok
Claude Sonnet 4.6   $1.50/MTok     $7.50/MTok
Claude Haiku 4.5    $0.50/MTok     $2.50/MTok

Prevention

Design your batch pipelines to handle partial failures. Always check request_counts after processing ends. Implement retry logic for expired or errored requests by resubmitting them in a new batch.
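
The retry step can be sketched as a filter over the original request list. This assumes the requests are dict-like with a "custom_id" key and that you have already collected a mapping from custom_id to result type. The name build_retry_requests is hypothetical.

```python
def build_retry_requests(
    original_requests: list[dict], outcomes: dict[str, str]
) -> list[dict]:
    """Return the original requests whose outcome was "errored" or "expired".

    outcomes maps custom_id -> result type. IDs missing from outcomes are
    left out, since their status is unknown.
    """
    retryable = {"errored", "expired"}
    return [
        req for req in original_requests
        if outcomes.get(req["custom_id"]) in retryable
    ]


# Illustrative data only.
original = [
    {"custom_id": "doc-0", "params": {"max_tokens": 1024}},
    {"custom_id": "doc-1", "params": {"max_tokens": 1024}},
    {"custom_id": "doc-2", "params": {"max_tokens": 1024}},
]
outcomes = {"doc-0": "succeeded", "doc-1": "errored", "doc-2": "expired"}
retry = build_retry_requests(original, outcomes)
print([req["custom_id"] for req in retry])
# prints: ['doc-1', 'doc-2']
```

An errored request that failed because it was malformed will fail again if resubmitted unchanged, so it is worth inspecting the error before retrying and capping the number of attempts.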

For large-scale evaluations, split work into multiple batches under the 100K request limit and process them concurrently for maximum throughput.
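
One way to split work is a generator that starts a new chunk whenever either limit would be exceeded. The sketch below assumes requests are JSON-serializable dicts and approximates each request's size by its JSON encoding, which may differ from how the API measures the 256 MB limit, so leave some headroom. The name split_into_batches is hypothetical.

```python
import json
from typing import Iterable, Iterator

MAX_REQUESTS = 100_000
MAX_BYTES = 256 * 1024 * 1024  # 256 MB, assuming 1 MB = 1024 * 1024 bytes


def split_into_batches(
    requests: Iterable[dict],
    max_requests: int = MAX_REQUESTS,
    max_bytes: int = MAX_BYTES,
) -> Iterator[list[dict]]:
    """Yield lists of requests that stay within the count and size limits."""
    chunk: list[dict] = []
    chunk_bytes = 0
    for req in requests:
        size = len(json.dumps(req).encode("utf-8"))  # approximate size
        # Start a new chunk if adding this request would break either limit.
        if chunk and (len(chunk) >= max_requests or chunk_bytes + size > max_bytes):
            yield chunk
            chunk, chunk_bytes = [], 0
        chunk.append(req)
        chunk_bytes += size
    if chunk:
        yield chunk


# Illustrative data with a small limit so the split is visible.
demo = [{"custom_id": f"doc-{i}", "params": {}} for i in range(5)]
print([len(batch) for batch in split_into_batches(demo, max_requests=2)])
# prints: [2, 2, 1]
```

Each yielded list can then be passed to client.messages.batches.create(requests=...) as in Step 1. A single request larger than max_bytes still ends up in its own chunk, so oversized inputs need separate handling.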



Common Questions

How do I get started with the Anthropic Message Batches API?

Begin with the Quick Fix in this guide. Install the anthropic Python SDK, make your API key available to the client, and test with a small batch before scaling to your full workload.

What are the prerequisites?

You need a working development environment with Python installed (the examples in this guide use the anthropic Python SDK) and an Anthropic API key. Familiarity with the command line is helpful. No advanced AI knowledge is required.

Can I use this with my existing development workflow?

Yes. These techniques integrate with standard development tools and CI/CD pipelines. Start by adding them to a single project and expand once you have verified the benefits.

Where can I find more advanced techniques?

The Anthropic API documentation and community forums provide more advanced patterns and real-world case studies.