Claude Sonnet 4 (20250514): Model Guide (2026)

claude-sonnet-4-20250514 is Anthropic’s current recommended Sonnet model. Released May 14, 2025, it delivers the best balance of capability and cost in the Claude model lineup. This is the model most developers should use for everyday coding, analysis, and generation tasks.

What the Model ID Means

claude-sonnet-4-20250514
│      │      │ │
│      │      │ └── Release date: May 14, 2025
│      │      └──── Version: 4
│      └───────── Tier: Sonnet (mid-range)
└──────────────── Family: Claude

The full model ID with date suffix ensures you always get this exact model version. Use it in production code to avoid unexpected behavior when Anthropic updates default aliases.

Why Sonnet 4 Is the Default Choice

Sonnet 4 is the model Anthropic actively recommends for most use cases. It replaced Sonnet 4.5 as the standard recommendation because:

  • Best-in-class instruction following: Sonnet 4 is measurably better at following complex, multi-part instructions
  • Strong coding performance: competitive with Opus on many coding benchmarks at one-fifth the cost
  • Reliable tool use: consistent and accurate function calling, critical for agent workflows
  • Good cost-to-quality ratio: $3/$15 per million tokens delivers the most value per dollar

Model Capabilities

Context Window

  • 200,000 tokens input context
  • Handles full codebases, long documents, and extended multi-turn conversations
  • Effective context utilization across the full window (no significant quality degradation at the edges)

Coding

  • Generates, reviews, and debugs code across all major languages
  • Handles multi-file edits and cross-reference analysis
  • Strong at test generation and refactoring
  • Understands framework conventions (React, Django, Rails, Spring, etc.)

Instruction Following

  • Sonnet 4’s defining strength — reliably follows complex instructions
  • Adheres to formatting requirements, output constraints, and multi-step processes
  • Particularly important for CLAUDE.md rules in Claude Code projects

Extended Thinking

  • Supports extended thinking for complex reasoning tasks
  • Allocate a thinking budget to improve accuracy on math, logic, and architecture decisions:
message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=16000,
    thinking={
        "type": "enabled",
        "budget_tokens": 10000,
    },
    messages=[{"role": "user", "content": "Design a distributed cache system"}]
)

Tool Use (Function Calling)

  • Full support for tool definitions and multi-tool sequences
  • Can call multiple tools per turn when appropriate
  • Compatible with the Claude Agent SDK

Vision

  • Accepts images alongside text
  • Analyzes screenshots, diagrams, charts, and UI mockups
  • Useful for front-end development and design review

How to Use Everywhere

Python API

import anthropic
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,
    messages=[
        {"role": "user", "content": "Write a Python function to parse ISO 8601 dates"}
    ]
)
print(message.content[0].text)

TypeScript API

import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic();
const message = await client.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 4096,
  messages: [
    { role: "user", content: "Write a Python function to parse ISO 8601 dates" }
  ],
});
console.log(message.content[0].text);

Claude Code CLI

# Launch Claude Code with Sonnet 4
claude --model claude-sonnet-4-20250514
# Or use the shorthand alias
claude --model sonnet

CLAUDE.md Configuration

## Model
Default: claude-sonnet-4-20250514
Use this model for all implementation tasks. Escalate to Opus only for
architecture decisions or security reviews.

API Mode (Programmatic)

echo "Refactor the auth module to use middleware" | claude --model claude-sonnet-4-20250514 -p

For details on API mode, see our API mode vs interactive guide.

Pricing

Token Type Cost per 1M Tokens
Input $3.00
Output $15.00
Prompt caching (write) $3.75
Prompt caching (read) $0.30

Prompt Caching

Prompt caching is where Sonnet 4 becomes particularly cost-effective. When the same system prompt or context is sent across multiple requests, cached input tokens cost just $0.30 per million — a 90% reduction.

This matters most for:

  • Agent loops (system prompt + conversation history cached across turns)
  • Batch processing (same instructions, different data)
  • Claude Code sessions (CLAUDE.md content cached)

Monthly Cost Estimates

Usage Level Estimated Monthly Cost
Light (1-2 sessions/day) $10-30
Moderate (5-10 sessions/day) $50-150
Heavy (20+ sessions/day) $200-600
Team (5 developers) $500-2,000

For precise cost tracking, see our ccusage guide.

Comparisons

vs Sonnet 4.5 (Predecessor)

Aspect Sonnet 4 Sonnet 4.5
Instruction following Better Good
Coding accuracy Better Strong
Creative tasks Good Slightly better
Pricing Same Same
Status Current recommended Available but not recommended

Sonnet 4 supersedes Sonnet 4.5 for nearly all use cases. Use Sonnet 4 for new projects. Only use Sonnet 4.5 if you have tested both and Sonnet 4.5 performs better on your specific workload. See our Sonnet 4.5 model guide for details.

vs Opus 4

Aspect Sonnet 4 Opus 4
Complex reasoning Good Significantly better
Instruction following Strong Strong
Multi-step planning Good Better
Cost (input/output) $3/$15 $15/$75
Speed Faster Slower
Best for 80% of tasks Top 20% complexity

Use Sonnet 4 for feature implementation, code generation, bug fixes, test writing, and standard development tasks. This covers the majority of daily work.

Use Opus 4 for system architecture decisions, complex debugging that resists simpler attempts, security audits, and tasks requiring deep multi-step reasoning.

For model routing strategies, see our router guide.

vs Haiku 4.5

Aspect Sonnet 4 Haiku 4.5
Capability Full Limited
Coding Strong Basic-adequate
Cost (input/output) $3/$15 $0.80/$4
Speed Fast Fastest
Best for General development Simple tasks, classification

Use Haiku for: typo fixes, formatting, boilerplate generation, tab completion, and any task where speed matters more than depth. At 75% less cost than Sonnet, Haiku is the right choice for simple operations.


Need the complete toolkit? The Claude Code Playbook includes 200 production-ready templates, decision frameworks, and team setup guides for every Claude Code workflow.

Best Practices

When to Use Sonnet 4

  • Feature implementation: writing new code, endpoints, components
  • Bug fixes: diagnosing and fixing reported issues
  • Code review: analyzing code for quality and correctness
  • Test generation: writing unit and integration tests
  • Refactoring: restructuring code while preserving behavior
  • Documentation: generating code comments, API docs, README content

When to Escalate to Opus

  • The task involves designing a new system from scratch
  • Previous Sonnet attempts produced incorrect solutions
  • The task requires reasoning about security implications across multiple systems
  • You need to analyze a complex bug involving race conditions or distributed systems

When to Downgrade to Haiku

  • Fixing a typo or renaming a variable
  • Generating boilerplate from a template
  • Running a simple search-and-replace
  • Quick classification or categorization tasks

Frequently Asked Questions

Is Sonnet 4 the same as Sonnet 4.5? No. Despite the version numbers, Sonnet 4 is the newer model (released May 2025 vs September 2025 for 4.5). Sonnet 4 is generally better at instruction following and coding tasks.

Should I always use the full model ID? In production code, yes. Use claude-sonnet-4-20250514 to ensure consistent behavior. In interactive Claude Code sessions, the shorthand sonnet is fine.

Does Sonnet 4 support streaming? Yes. Both the API and Claude Code support streaming responses from Sonnet 4.

What are the rate limits? Rate limits depend on your API tier (free, build, scale), not the model. Check console.anthropic.com for your current limits.

Can Sonnet 4 handle a 200K-token input? Yes. Sonnet 4 supports the full 200K context window. Performance remains strong across the full window, though costs scale linearly with input size.

How does Sonnet 4 compare to GPT-4o or Gemini? Sonnet 4 is competitive with GPT-4o on coding benchmarks and generally stronger at instruction following. Direct comparisons depend on the specific task. Test on your own workloads.

Can I use Sonnet 4 with extended thinking and tool use simultaneously?

Yes. Extended thinking and tool use work together. The model can think through a problem before deciding which tools to call.

Is Sonnet 4 available on Amazon Bedrock and Google Vertex AI?

Yes. Sonnet 4 is available through both cloud providers. Check their documentation for the exact model ID format used on each platform.

Does Sonnet 4 support image input in Claude Code?

Claude Code does not currently pass images to the model. Image input is available through the API and Claude.ai web interface.

How often does Anthropic update the Sonnet model?

Anthropic releases new model versions periodically. Always use the full model ID with date suffix in production to avoid unexpected changes when defaults are updated.

Claude Opus 4.1 (20250805) with Thinking-16K

The model ID claude-opus-4-1-20250805-thinking-16k refers to Claude Opus 4.1 with a 16,000-token extended thinking budget. This is a specialized configuration of the Opus model family designed for tasks requiring deep, multi-step reasoning within a constrained thinking window.

What the Model ID Means

claude-opus-4-1-20250805-thinking-16k
│      │     │ │         │
│      │     │ │         └── Extended thinking budget: 16K tokens
│      │     │ └──────────── Release date: August 5, 2025
│      │     └────────────── Version: 4.1
│      └──────────────────── Tier: Opus (highest capability)
└─────────────────────────── Family: Claude

The -thinking-16k suffix indicates this model variant has a 16,000-token budget for extended thinking (chain-of-thought reasoning). The model uses this budget to “think through” complex problems before producing its final answer.

When to Use Opus 4.1 Thinking-16K

This model variant is optimal for:

  • Complex architecture decisions requiring analysis of multiple interacting systems
  • Security audits where the model must trace data flow across multiple files and identify subtle vulnerabilities
  • Mathematical proofs and formal reasoning that benefit from step-by-step derivation
  • Multi-constraint optimization problems where the model must balance competing requirements
  • Debugging distributed systems where root cause analysis requires tracing events across services

Pricing and Cost Considerations

Opus 4.1 with thinking-16k costs more per request than standard Opus because the thinking tokens count toward output token billing:

Component Cost
Input tokens $15 per million
Output tokens (including thinking) $75 per million
16K thinking tokens per request ~$1.20 in thinking output alone

The thinking budget is a maximum, not a fixed cost. Simple questions may use only 2K-5K thinking tokens. The full 16K is consumed only on genuinely complex problems.

API Usage

import anthropic
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-opus-4-1-20250805",
    max_tokens=16000,
    thinking={
        "type": "enabled",
        "budget_tokens": 16000
    },
    messages=[{"role": "user", "content": "Analyze this distributed system for race conditions..."}]
)

Comparison with Sonnet 4 Extended Thinking

Aspect Opus 4.1 Thinking-16K Sonnet 4 Extended Thinking
Reasoning depth Deepest Strong
Cost per request ~$1.20+ thinking alone ~$0.25 thinking
Speed Slower Faster
Best for Research, architecture, proofs Daily coding, implementation

For most development tasks, Sonnet 4 with extended thinking provides sufficient reasoning depth at a fraction of the cost. Reserve Opus 4.1 thinking-16k for problems where Sonnet’s reasoning falls short. See our Claude Code router guide for routing strategies between these models.

Which model? → Take the 5-question quiz in our Model Selector.

Estimate tokens → Calculate your usage with our Token Estimator.

Try it: Estimate your monthly spend with our Cost Calculator.