Karpathy Context Engineering + NASA P10 (2026)

What This Means for Claude Code Users

The CCG production framework merges two proven systems: Andrej Karpathy’s context engineering principles (controlling what enters an LLM’s context window) and NASA’s Power of 10 rules (10 coding constraints that prevent catastrophic failures). Together, they produce Claude Code workflows that are both cost-efficient and reliable. Teams using this framework report 40-60% lower token costs and near-zero runaway sessions – saving $20-$60 per developer per month on Sonnet 4.6 API usage.

The Concept

Context engineering, as defined by Karpathy, goes beyond prompt engineering. It is the art and science of filling the context window with the right information at the right time. In Claude Code, the context window is the single biggest cost driver: every token in context is re-sent as input on every subsequent turn, and input tokens cost $3 per million on Sonnet 4.6.

Karpathy identifies the fundamental challenge: “The hottest new programming language is English” – but the context window is finite and expensive. The solution is engineering discipline applied to context management, not just clever prompting.

NASA Power of 10 rules, published by Gerard Holzmann at NASA/JPL, are 10 coding rules designed for safety-critical systems. The relevant rules for Claude Code workflows are:

  1. Restrict to simple control flow – no recursion, no complex branching
  2. Give all loops a fixed upper bound – no unbounded iteration
  3. Do not use dynamic memory after initialization – plan context allocation upfront
  4. Keep functions short – under 60 lines (one screen)
  5. Use a minimum of two assertions per function – verify pre/post conditions
  6. Restrict scope of data – minimize what is visible
  7. Check return values – handle every error
  8. Limit preprocessor use – keep tool chains simple
  9. Restrict pointer use – avoid indirection (in Claude Code: direct references over indirect)
  10. Compile with all warnings – zero tolerance for warnings

The CCG framework adapts both systems into a unified set of practices for Claude Code production use.

How It Works in Practice

Example 1: Bounded Sessions (P10 Rule 2 + Context Engineering)

NASA P10 says: all loops must have a fixed upper bound. Applied to Claude Code sessions:

# UNBOUNDED (violates P10 Rule 2):
claude -p "Fix all the bugs in the project"
# This can run for hours, consuming 500K+ tokens ($1.50+)
# BOUNDED (P10 compliant + context-engineered):
claude --max-turns 15 \
  --allowedTools "Read,Glob,Grep,Edit,Bash" \
  -p "Fix the authentication timeout bug in src/auth/session.ts.
      The session expires after 1 hour instead of 24 hours.
      Read the file, find the timeout constant, update it, and run tests."

The bounded version:

  • --max-turns 15: hard loop bound (P10 Rule 2)
  • --allowedTools: restricted scope (P10 Rule 6)
  • Scoped prompt: specific file and task (context engineering – select lever)
  • Expected cost: ~30,000 tokens ($0.09) vs. unbounded: 200K-500K tokens ($0.60-$1.50)

Savings: $0.51-$1.41 per session (85-94% reduction)

Example 2: Pre-Computed Context with Assertions (P10 Rules 3, 5)

P10 Rule 3: allocate resources at initialization. P10 Rule 5: assert pre/post conditions.

# CLAUDE.md -- Pre-computed context (Rule 3: initialize upfront)
## Project: payment-service
## Architecture: Express + Prisma + Stripe + PostgreSQL
## Entry: src/index.ts
## Key paths:
##   Routes: src/routes/ (one file per resource)
##   Services: src/services/ (business logic)
##   Models: prisma/schema.prisma (database schema)
## Assertions (Rule 5: always verify)
- Before any deployment: `pnpm test` must pass with 0 failures
- Before any database change: `pnpm prisma validate` must pass
- After any edit: `npx tsc --noEmit` must show 0 errors
- After any API change: verify with `curl -sf localhost:3000/health`
## Error Budget
- Max retry cycles per task: 3 (Rule 2: bounded)
- If 3 retries fail, stop and report the error

The assertions serve double duty: they catch real errors (safety) and they prevent Claude Code from entering diagnostic loops (cost). Without the “max 3 retries” rule, a failing test can trigger 10+ retry cycles, each consuming 5,000-15,000 tokens.

Savings: 25,000-75,000 tokens per failed-retry scenario ($0.08-$0.23)

Example 3: Short Functions with Direct References (P10 Rules 4, 9)

P10 Rule 4: functions under 60 lines. P10 Rule 9: avoid indirection. Applied to code that Claude Code must read and modify:

// BAD: 200-line function with deep indirection
// Claude Code cost to understand: Read (~4,000 tokens) + trace references (~6,000 tokens)
// = ~10,000 tokens just to understand what the function does
// GOOD: P10-compliant -- short function, direct references
export async function processPayment(
  orderId: string,
  amount: number,
  currency: string
): Promise<Result<PaymentResult, PaymentError>> {
  // Assertion 1: pre-condition (P10 Rule 5)
  assert(amount > 0 && amount < 100000, `Invalid amount: ${amount}`);
  assert(currency.length === 3, `Invalid currency code: ${currency}`);
  const order = await prisma.order.findUnique({ where: { id: orderId } });
  if (!order) {
    return { ok: false, error: { code: "ORDER_NOT_FOUND", orderId } };
  }
  const charge = await stripe.charges.create({
    amount: Math.round(amount * 100),
    currency,
    source: order.paymentMethodId,
    metadata: { orderId },
  });
  // Assertion 2: post-condition (P10 Rule 5)
  assert(charge.id, "Stripe returned charge without ID");
  await prisma.order.update({
    where: { id: orderId },
    data: { status: "paid", stripeChargeId: charge.id, paidAt: new Date() },
  });
  return { ok: true, value: { chargeId: charge.id, amount, currency } };
}

Claude Code reads this 30-line function in ~600 tokens and understands it completely. No tracing through abstractions, no following class hierarchies, no reading base classes.

Savings: ~9,400 tokens per complex function (Read + trace vs. direct read)

Example 4: The Full Framework in a Skill File

# .claude/skills/ccg-framework.md
## CCG Production Framework
### Session Rules (Context Engineering + P10)
1. BOUNDED: Every session has --max-turns set. Default: 20.
2. SCOPED: Every prompt specifies the target file or directory.
3. PRE-COMPUTED: Read CLAUDE.md before starting. Do not re-discover known facts.
4. VERIFIED: Run assertions after every change (tsc, test, lint).
5. COMPRESSED: Run /compact after discovery phase, before implementation.
### Code Rules (NASA P10)
1. Functions: max 60 lines
2. Loops: always have upper bounds (for i < MAX, not while true)
3. Assertions: 2 per function minimum (pre-condition + post-condition)
4. Error handling: every function returns Result<T, Error> or throws typed errors
5. No warnings: `npx tsc --noEmit` and `npx eslint` must both pass clean
### Cost Rules
1. Discovery budget: max 30% of session tokens
2. Run /compact before implementation if discovery exceeded 40K tokens
3. Max 3 retry cycles for any single failure
4. If estimated session will exceed 150K tokens, split into subagent tasks
5. Track costs: check /cost at session end, review with ccusage weekly

Token Cost Impact

Framework Element Without Framework With Framework Savings
Unbounded session 200K-500K tokens 30K-80K tokens $0.51-$1.26
Discovery bloat 50K-120K tokens 15K-30K tokens $0.11-$0.27
Retry loops 25K-75K tokens 10K-20K tokens $0.05-$0.17
Code comprehension 10K tokens/function 600 tokens/function $0.03/function
Typical session 150K-300K tokens 40K-100K tokens $0.33-$0.60

Monthly impact for a developer at 100 sessions/month: $33-$60 savings on Sonnet 4.6.

Implementation Checklist

  • Add CCG framework rules to CLAUDE.md (or reference a skill file)
  • Set default --max-turns in shell aliases for all Claude Code invocations
  • Add pre-condition and post-condition assertions to all critical functions
  • Refactor any function over 60 lines into smaller units
  • Create an architecture map in CLAUDE.md (under 300 tokens)
  • Add assertion commands (tsc, test, lint) to CLAUDE.md as required post-edit checks
  • Set up /compact usage discipline: always compact after discovery, before implementation
  • Review ccusage weekly to track framework impact on costs

Example 5: The Complete Framework in a CI Pipeline

The framework extends beyond interactive sessions to automated workflows:

#!/bin/bash
# .github/scripts/claude-pr-review.sh
# Automated PR review using the CCG Production Framework
set -euo pipefail
# P10 Rule 2: Bounded execution
MAX_TURNS=12
# P10 Rule 6: Restricted scope
ALLOWED_TOOLS="Read,Glob,Grep"
# Context engineering: pre-compute relevant files
CHANGED_FILES=$(git diff --name-only HEAD~1 | head -30)
FILE_COUNT=$(echo "$CHANGED_FILES" | wc -l | tr -d ' ')
# P10 Rule 2: Adjust turns based on scope
if [ "$FILE_COUNT" -lt 5 ]; then
  MAX_TURNS=8
elif [ "$FILE_COUNT" -lt 15 ]; then
  MAX_TURNS=12
else
  MAX_TURNS=15
fi
# Context engineering: provide structured input (write lever)
REVIEW_PROMPT="Review these ${FILE_COUNT} changed files for bugs and security issues.
Files: ${CHANGED_FILES}
Output format (structured for machine parsing):
- file: <path>
  line: <number>
  severity: high|medium|low
  issue: <description>
If no issues found, output: NO_ISSUES_FOUND"
# Execute with all framework constraints applied
claude --max-turns "$MAX_TURNS" \
  --allowedTools "$ALLOWED_TOOLS" \
  --model sonnet \
  -p "$REVIEW_PROMPT"
# P10 Rule 7: Check return value
EXIT_CODE=$?
if [ $EXIT_CODE -ne 0 ]; then
  echo "Review failed with exit code $EXIT_CODE"
  exit 1
fi

This script applies every framework principle:

  • Bounded execution (–max-turns adapts to scope)
  • Restricted tools (read-only for reviews)
  • Pre-computed context (file list provided, not discovered)
  • Structured output (machine-parseable format requested)
  • Error handling (exit code checked)

Cost per PR review: ~20,000-40,000 tokens ($0.06-$0.12 on Sonnet 4.6). Without the framework: 60,000-120,000 tokens ($0.18-$0.36). Savings: 65-67% per automated review.

Measuring Framework Adoption

Track the framework’s impact on team costs over time:

#!/bin/bash
# scripts/framework-scorecard.sh
# Weekly scorecard for CCG framework compliance
set -uo pipefail
echo "=== CCG Framework Scorecard ==="
# Check 1: CLAUDE.md size
ROOT_CLAUDE_WORDS=$(wc -w < CLAUDE.md 2>/dev/null || echo "0")
ROOT_CLAUDE_TOKENS=$((ROOT_CLAUDE_WORDS * 100 / 75))
if [ "$ROOT_CLAUDE_TOKENS" -lt 400 ]; then
  echo "PASS: CLAUDE.md under 400 tokens (~$ROOT_CLAUDE_TOKENS)"
else
  echo "FAIL: CLAUDE.md exceeds 400 tokens (~$ROOT_CLAUDE_TOKENS)"
fi
# Check 2: Skills files exist
SKILL_COUNT=$(find .claude/skills -name "*.md" 2>/dev/null | wc -l | tr -d ' ')
if [ "$SKILL_COUNT" -ge 3 ]; then
  echo "PASS: $SKILL_COUNT skill files configured"
else
  echo "WARN: Only $SKILL_COUNT skill files (recommend 3+)"
fi
# Check 3: .claudeignore exists
if [ -f ".claudeignore" ]; then
  echo "PASS: .claudeignore configured"
else
  echo "FAIL: No .claudeignore file"
fi
# Check 4: Functions under 60 lines (P10 Rule 4)
LONG_FUNCTIONS=$(grep -rn "^function\|^export function\|^async function\|^const.*=.*=>" src/ 2>/dev/null | wc -l | tr -d ' ')
echo "INFO: $LONG_FUNCTIONS function definitions found (manual review for length)"
echo ""
echo "Run ccusage --sort cost --limit 10 to review session costs"

Example 6: Restricted Scope for Data Queries (P10 Rule 6)

P10 Rule 6: restrict the scope of data to the smallest possible. Applied to Claude Code database interactions:

# CLAUDE.md -- Data Scope Rules (P10 Rule 6)
## Database Query Rules
- All SELECT queries must include LIMIT (max: 50 rows)
- Never SELECT * from tables with >100 rows -- specify columns
- Use COUNT(*) before SELECT to check table size
- Large result sets: summarize rather than return raw data
- Export large datasets to files rather than including in conversation

Without these rules, a query returning 500 rows injects 10,000-50,000 tokens into the context window. With the rules, the same query returns 50 rows (~1,000 tokens). Over 5 database interactions per session, the scope restriction saves 45,000-245,000 tokens – $0.14-$0.74 per session at Sonnet 4.6 rates.

Framework Adoption for Teams

Rolling out the CCG framework to a team requires phased adoption. Teams that attempt to adopt all rules simultaneously often see inconsistent compliance. A proven rollout sequence:

Week 1: Bounded sessions. Add --max-turns aliases to all developer machines. This single change prevents the most expensive failures (runaway sessions) and is immediately measurable through ccusage.

Week 2: CLAUDE.md and architecture map. Create the project CLAUDE.md with the architecture map and assertion rules. This eliminates discovery bloat, the second-largest cost driver.

Week 3: Code compliance. Begin enforcing P10 rules in code reviews – functions under 60 lines, assertions on pre/post conditions. This reduces per-function comprehension cost from thousands of tokens to hundreds.

Week 4: Monitoring. Deploy the framework scorecard script and review weekly. Track average session cost, distribution of session lengths, and framework compliance percentage. The data drives the next round of optimization.

After four weeks, most teams see 30-50% reduction in average session cost, with the full 40-60% reduction achieved by week 8 as the framework becomes habitual.

The CCG Framework Connection

This is the framework. Every other article in the CCG cost optimization series implements one or more elements of the Karpathy + NASA P10 synthesis described here. Context engineering provides the “why” (tokens are money, context is finite), and NASA P10 provides the “how” (bounded loops, short functions, mandatory assertions, restricted scope). Together they produce Claude Code workflows that are safe, predictable, and cost-efficient.

Find the right skill → Browse 155+ skills in our Skill Finder.

Further Reading

Estimate tokens → Calculate your usage with our Token Estimator.

Try it: Paste your error into our Error Diagnostic for an instant fix.