Claude Code Context Window Management (2026)

Last updated: April 20, 2026

Every Claude Code session operates within a finite context window. Understanding how this window fills, what gets prioritized, and how to manage it determines whether your sessions are productive or plagued by forgotten instructions and repeated mistakes.

How the Context Window Works

The context window is the total amount of text Claude Code can process at once. It includes:

System prompt (~4K tokens) – Built-in instructions for Claude Code
CLAUDE.md content (variable) – Your project rules
Conversation history – All messages back and forth
Tool outputs – File contents, command results, search results
MCP server tool definitions – Each MCP server adds tool descriptions

As the window fills, older content is compressed or dropped. The system prompt and CLAUDE.md remain (they are re-sent with each turn), but earlier conversation messages may be summarized.

Token Budget Breakdown

For a typical session with a 200K token window:

Component	Tokens	% of Window
System prompt	~4,000	2%
CLAUDE.md (1,500 words)	~2,000	1%
MCP tool definitions (3 servers)	~1,500	0.75%
Available for conversation	~192,500	96.25%

This looks generous, but a single large file read can consume 5,000-20,000 tokens. Reading 10 files uses 50,000-200,000 tokens. Sessions fill faster than you expect.

When Context Loss Happens

Symptoms of context window pressure:

Claude Code “forgets” decisions from earlier in the session
It re-reads files it already read
It contradicts earlier recommendations
It suggests patterns you already rejected
Responses become more generic and less project-specific

Management Strategies

Strategy 1: Keep CLAUDE.md Lean

Every word in CLAUDE.md counts. Audit it quarterly:

## Before (verbose -- 500 words, ~650 tokens)
When working with the authentication module, please make sure to always
use the NextAuth.js v5 configuration that is located in the src/lib/auth.ts
file. This file contains all of our authentication configuration including
providers, callbacks, and session handling...
## After (concise -- 50 words, ~65 tokens)
## Auth
- NextAuth.js v5: src/lib/auth.ts
- Session: JWT in httpOnly cookies
- Providers: Google, GitHub, Email
- DO NOT add new providers without approval

90% token reduction, same information.

Strategy 2: Chunk Sessions

One task per session. When the task is done, start a new session:

Session 1: "Add the user preferences API endpoint" -> Done -> End
Session 2: "Write tests for the user preferences endpoint" -> Done -> End
Session 3: "Add documentation for the user preferences API" -> Done -> End

Each session starts with a fresh context window. Use PROGRESS.md for continuity (see context loss fix guide).

Strategy 3: Minimize File Reads

Tell Claude Code exactly which files to read instead of letting it search:

Token-expensive prompt: “Find the auth configuration and update it” Token-efficient prompt: “Update src/lib/auth.ts to add rate limiting to the login callback”

The second prompt reads one file. The first might read ten.

Strategy 4: Use .claudeignore

Prevent Claude Code from reading irrelevant files:

# .claudeignore
dist/
build/
node_modules/
coverage/
*.min.js
__snapshots__/
*.generated.ts

This does not save context directly but prevents token-wasting file discovery.

Strategy 5: Scope MCP Servers

Each MCP server adds tool definitions to the context. Only enable servers you need:

{
  "mcpServers": {
    "project-db": { "..." }
  }
}

Three unused MCP servers add ~1,500 tokens of overhead to every message.

Monitoring Context Usage

Use ccusage (13K+ stars) to track token consumption:

npx ccusage

Look for:

Sessions with disproportionately high input tokens (too many file reads)
Sessions with long conversation histories (should have been split)
Patterns where the same files are read multiple times (context loss occurring)

Context-Aware CLAUDE.md Design

Structure your CLAUDE.md so the most important rules appear first:

## Critical Rules (always visible)
- Framework: Next.js 14 App Router
- TypeScript strict mode
- No 'any' types
## Architecture (detailed reference)
[Detailed architecture -- Claude Code can re-read this section if needed]
## Conventions (detailed reference)
[Detailed conventions -- same as above]

The critical rules at the top are most likely to remain in active context throughout the session.

Advanced: Multi-Turn Context Patterns

The Checkpoint Pattern

Ask Claude Code to summarize its understanding every 10 messages:

You: "Before we continue, summarize what we have decided so far"
Claude Code: [summary]
You: "Correct. Continue with the next step."

This refreshes critical decisions in the active context.

The Context Refresh Pattern

When you notice context loss, re-state the key constraints:

You: "Reminder: we are using Fastify (not Express), Vitest (not Jest),
and all functions need explicit return types. Continue with the refactoring."

This costs ~50 tokens but prevents 500-token mistakes.

Session Splitting Script

For large tasks, split into context-efficient sessions:

# Step 1: Planning session (small context, produces a plan)
claude -p "Read requirements.md and create a numbered list of
independent subtasks. Each subtask should be completable without
knowledge of other subtasks." > plan.md
# Step 2: Execute each subtask in a fresh session
while IFS= read -r task; do
  claude -p "Complete this subtask: $task" --max-turns 15
done < plan.md

The claude-task-master (27K+ stars) automates this decomposition pattern with its PRD parser and structured task output.

FAQ

Does a longer CLAUDE.md always mean better results?

No. A 5,000-word CLAUDE.md consumes ~6,500 tokens per turn and may cause the model to weight less-important rules equally with critical ones. Shorter and more focused is better.

Can I increase the context window?

The context window is determined by the model. As Claude models evolve, context windows grow. Claude Code automatically uses the available window.

Do hooks affect the context window?

Hook output is returned as context and consumes tokens. Keep hook output brief (pipe through tail -5 or head -10).

How does multi-agent affect context?

Each sub-agent gets its own context window. The orchestrator’s window is shared with sub-agent results. See the multi-agent guide.

For session management strategies, see the context loss fix guide. For CLAUDE.md optimization, read the CLAUDE.md best practices guide. For overall cost management, see the pricing guide.

Configure it → Build your MCP config with our MCP Config Generator.

Claude Code Context Window Management (2026)

How the Context Window Works

Token Budget Breakdown

When Context Loss Happens

Management Strategies

Strategy 1: Keep CLAUDE.md Lean

Strategy 2: Chunk Sessions

Strategy 3: Minimize File Reads

Strategy 4: Use .claudeignore

Strategy 5: Scope MCP Servers

Monitoring Context Usage

Context-Aware CLAUDE.md Design

Advanced: Multi-Turn Context Patterns

The Checkpoint Pattern

The Context Refresh Pattern

Session Splitting Script

FAQ

Does a longer CLAUDE.md always mean better results?

Can I increase the context window?

Do hooks affect the context window?

How does multi-agent affect context?

See Also

About the Author

How the Context Window Works

Token Budget Breakdown

When Context Loss Happens

Management Strategies

Strategy 1: Keep CLAUDE.md Lean

Strategy 2: Chunk Sessions

Strategy 3: Minimize File Reads

Strategy 4: Use .claudeignore

Strategy 5: Scope MCP Servers

Monitoring Context Usage

Context-Aware CLAUDE.md Design

Advanced: Multi-Turn Context Patterns

The Checkpoint Pattern

The Context Refresh Pattern

Session Splitting Script

FAQ

Does a longer CLAUDE.md always mean better results?

Can I increase the context window?

Do hooks affect the context window?

How does multi-agent affect context?

See Also

About the Author

Related Guides