
Prompt, Memory & Context

Everything the model sees when it processes a turn — the system prompt, your project context, persistent memory, and how it all fits within the context window.

System Prompt Assembly

Before every API call, the Agent Loop assembles a system prompt from 7 layers. This isn't a static string — it's rebuilt dynamically each turn:

system_prompt: assembled before every API call

| Group | # | Layer | Contents | Approx. tokens | Source |
| --- | --- | --- | --- | --- | --- |
| Always Included | 1 | Base Prompt | Claude's identity, role, tone, and behavior rules | ~3,000 | src/constants/prompts.ts |
| Always Included | 2 | Tool Definitions | Name, description, and JSON Schema for each of 45+ tools | ~4,000 | src/tools/ |
| Context-Dependent | 3 | Permission Context | Current mode (ask/auto), allowed/denied patterns, policy limits | ~500 | src/utils/permissions/ |
| Context-Dependent | 4 | CLAUDE.md Content | Merged layers: enterprise → global → project → local | varies | src/utils/claude-md/ |
| Context-Dependent | 5 | System Context | Git status, OS info, working directory, shell, IDE | ~800 | src/utils/environment/ |
| When Active | 6 | Plugin Prompts | Custom instructions injected by loaded plugins | varies | src/services/plugins/ |
| When Active | 7 | Active Skill Content | Prompt template from a /skill command, if invoked | varies | src/skills/ |
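
The layered assembly described above can be sketched roughly as follows. This is an illustrative sketch only: the `Layer` type and `assembleSystemPrompt` function are hypothetical names, not taken from src/query.ts, and the real assembly logic may differ.

```typescript
// Hypothetical sketch of layered prompt assembly; names are illustrative, not from the source.
type Layer = {
  name: string;
  // Returns the layer text, or null when the layer is inactive this turn.
  render: () => string | null;
};

function assembleSystemPrompt(layers: Layer[]): string {
  const parts: string[] = [];
  for (const layer of layers) {
    const text = layer.render();
    // Skip context-dependent or "when active" layers that contribute nothing this turn.
    if (text !== null && text.trim() !== "") {
      parts.push(text.trim());
    }
  }
  return parts.join("\n\n");
}

const examplePrompt = assembleSystemPrompt([
  { name: "base", render: () => "You are an interactive agent." },
  { name: "tools", render: () => "Tools: Bash, Read, Edit" },
  { name: "plugins", render: () => null }, // no plugins loaded this turn
  { name: "skill", render: () => "" }, // no skill invoked this turn
]);
```

Because each layer is a function rather than a fixed string, calling `assembleSystemPrompt` again on the next turn picks up any change in git status, permissions, or active skills.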

1. Base Prompt

The foundation — Claude's identity, tone, behavior rules, and general instructions. Same for every user and project.

View the actual base prompt sections

Intro:

You are an interactive agent that helps users with software engineering tasks. Use the instructions below and the tools available to you to assist the user.

# System — Rules about tool execution, permissions, tags, prompt injection detection, and context compression.

# Doing tasks:

  • Do not propose changes to code you haven't read.
  • Do not create files unless absolutely necessary.
  • Don't add features, refactor code, or make "improvements" beyond what was asked.
  • Don't add error handling for scenarios that can't happen.
  • Don't create helpers or abstractions for one-time operations.

# Executing actions with care:

Carefully consider the reversibility and blast radius of actions. For actions that are hard to reverse or affect shared systems, check with the user before proceeding.

# Using your tools — Prefer dedicated tools over Bash (Read instead of cat, Edit instead of sed).

# Output efficiency:

Go straight to the point. Try the simplest approach first. Be extra concise.

Source: src/constants/prompts.ts

2. Tool Definitions

For each enabled tool, the model receives a name, description, and JSON Schema.

Example: Bash tool schema
{
  "name": "Bash",
  "description": "Executes a given bash command and returns its output.",
  "input_schema": {
    "type": "object",
    "properties": {
      "command": { "type": "string", "description": "The command to execute" },
      "description": { "type": "string", "description": "What this command does" },
      "timeout": { "type": "number", "description": "Timeout in ms (max 600000)" },
      "run_in_background": { "type": "boolean", "description": "Run in background" }
    },
    "required": ["command"]
  }
}

See Tool Definitions in the Appendix for the source-backed tool reference.
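
To make the shape of a tool definition concrete, here is a minimal sketch that types the Bash schema shown above and checks a tool call's input against it. `ToolDefinition`, `PropertySpec`, and `checkToolInput` are hypothetical names, and the check covers only required fields and primitive types; it is not a full JSON Schema validator and is not how the source validates input.

```typescript
// Hypothetical types mirroring the Bash tool schema from the example above.
type PropertySpec = { type: "string" | "number" | "boolean"; description: string };

type ToolDefinition = {
  name: string;
  description: string;
  input_schema: {
    type: "object";
    properties: Record<string, PropertySpec>;
    required: string[];
  };
};

const bashTool: ToolDefinition = {
  name: "Bash",
  description: "Executes a given bash command and returns its output.",
  input_schema: {
    type: "object",
    properties: {
      command: { type: "string", description: "The command to execute" },
      description: { type: "string", description: "What this command does" },
      timeout: { type: "number", description: "Timeout in ms (max 600000)" },
      run_in_background: { type: "boolean", description: "Run in background" },
    },
    required: ["command"],
  },
};

// Returns a list of problems; an empty list means the input passed this simplified check.
function checkToolInput(tool: ToolDefinition, input: Record<string, unknown>): string[] {
  const errors: string[] = [];
  for (const key of tool.input_schema.required) {
    if (!(key in input)) errors.push(`missing required field: ${key}`);
  }
  for (const [key, value] of Object.entries(input)) {
    const spec = tool.input_schema.properties[key];
    if (spec === undefined) {
      errors.push(`unknown field: ${key}`);
    } else if (typeof value !== spec.type) {
      errors.push(`field ${key} should be ${spec.type}`);
    }
  }
  return errors;
}
```

For example, `checkToolInput(bashTool, { command: "ls" })` returns an empty list, while omitting `command` reports it as missing.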

3. Permission Context

Current permission mode, allow/deny rules, and enterprise policy limits. This shapes the model's behavior — it won't suggest tools it knows will be denied.

4. CLAUDE.md Content

Your project-specific instructions, merged from up to 4 file layers.

Example: A typical project CLAUDE.md
# Project: my-saas-app

## Architecture
- Monorepo with Next.js frontend and Express API
- PostgreSQL database with Prisma ORM
- Tests use Vitest, not Jest

## Conventions
- Use `pnpm`, not `npm` or `yarn`
- All API routes go in `src/api/routes/`
- Use snake_case for database columns, camelCase for TypeScript

## Key Files
- `src/api/routes/index.ts` — API route registry
- `prisma/schema.prisma` — Database schema

See CLAUDE.md Layers below for the full merge system.

5. System Context

Runtime environment info: git status, OS, shell, working directory, IDE.

Example: What gets injected
# Environment
- Primary working directory: /Users/you/my-project
- Is a git repository: true
- Platform: darwin
- Shell: zsh
- OS Version: Darwin 24.1.0

gitStatus: Current branch: feature/auth-fix
Status: M src/auth.ts

6. Plugin Prompts

Loaded plugins can inject custom instructions.

Example: Plugin prompt injection
# Plugin: company-standards

When writing code for this organization:
- Always add copyright headers to new files
- Use the internal logging library instead of console.log
- All API endpoints must have OpenAPI annotations

7. Skill Content

When you invoke a skill such as /commit or /review-pr, that skill's prompt template is injected.

Example: The /commit skill prompt
Only create commits when requested by the user. Follow these steps:
1. Run git status and git diff to see all changes
2. Run git log to see recent commit message style
3. Draft a concise commit message focusing on "why" not "what"
4. Stage relevant files and create the commit
5. Run git status after to verify success

Important:
- NEVER push to remote unless explicitly asked
- NEVER use git commands with -i flag (interactive)

CLAUDE.md Layer System

Claude Code assembles persistent context from multiple CLAUDE.md files on disk:

Enterprise / Managed CLAUDE.md → ~/.claude/CLAUDE.md (Global) → project-root/CLAUDE.md (Project) → .claude/CLAUDE.md (Local) → Merged CLAUDE.md Content → Injected into System Prompt (layer 4)

| Layer | Location | Scope | Typical Content |
| --- | --- | --- | --- |
| Enterprise | Managed by org admin | Organization-wide | Policies, compliance rules |
| Global | ~/.claude/CLAUDE.md | All projects | Personal preferences, coding style |
| Project | CLAUDE.md in repo root | This project | Architecture, conventions, key files |
| Local | .claude/CLAUDE.md | This machine + project | Local setup notes |

Layers are concatenated in order (enterprise first, local last). Later layers can override or supplement earlier ones.
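
The ordering rule can be sketched as below. This is a hedged illustration, not the code in src/utils/claude-md/: `LayerName`, `LAYER_ORDER`, and `mergeClaudeMd` are hypothetical names, and the blank-line separator is an assumption.

```typescript
// Hypothetical sketch: concatenate CLAUDE.md layers in a fixed order, enterprise first.
type LayerName = "enterprise" | "global" | "project" | "local";

const LAYER_ORDER: LayerName[] = ["enterprise", "global", "project", "local"];

function mergeClaudeMd(files: Partial<Record<LayerName, string>>): string {
  return LAYER_ORDER
    .map((layer) => files[layer])
    // Missing or empty layers are skipped rather than leaving gaps.
    .filter((text): text is string => text !== undefined && text.trim() !== "")
    .map((text) => text.trim())
    .join("\n\n");
}

// The order of keys in the input does not matter; output always follows LAYER_ORDER.
const mergedExample = mergeClaudeMd({
  local: "Use port 3001",
  global: "Prefer pnpm",
});
```

Since later layers appear last in the merged text, an instruction in the local file reads as the most recent word on a topic, which is one plausible way "override" can work with plain concatenation.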

Auto Memory

Claude automatically stores useful information learned during conversations in ~/.claude/projects/<project-hash>/memory/:

| Type | What's Stored | Example |
| --- | --- | --- |
| user | Role, preferences, expertise | "Senior Go engineer, new to React" |
| feedback | Corrections and confirmed approaches | "Don't mock the database in tests" |
| project | Ongoing work, goals, deadlines | "Merge freeze begins 2026-03-05" |
| reference | Pointers to external resources | "Pipeline bugs in Linear project INGEST" |

Memory file format
---
name: user_role
description: User is a data scientist focused on logging
type: user
---

User is a data scientist investigating observability and logging infrastructure.

An index file (MEMORY.md) maintains pointers to all memory files.
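
A memory file in the format shown above could be read with a small parser like this one. It is a sketch under assumptions: `MemoryFile` and `parseMemoryFile` are hypothetical names, and it handles only simple `key: value` frontmatter lines rather than full YAML.

```typescript
// Hypothetical parser for the memory file format: frontmatter between --- lines, then a body.
type MemoryFile = { name: string; description: string; type: string; body: string };

function parseMemoryFile(text: string): MemoryFile | null {
  const match = /^---\n([\s\S]*?)\n---\n?([\s\S]*)$/.exec(text);
  if (match === null) return null; // no frontmatter block

  const fields: Record<string, string> = {};
  for (const line of match[1].split("\n")) {
    const idx = line.indexOf(":");
    if (idx === -1) continue; // ignore lines that are not key: value
    fields[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }

  const { name, description, type } = fields;
  if (!name || !description || !type) return null; // all three fields are expected
  return { name, description, type, body: match[2].trim() };
}

const sampleMemory = parseMemoryFile(
  "---\n" +
    "name: user_role\n" +
    "description: User is a data scientist focused on logging\n" +
    "type: user\n" +
    "---\n\n" +
    "User is a data scientist investigating observability and logging infrastructure.\n",
);
```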

What's NOT stored: Code patterns (read the codebase), git history (use git log), debugging solutions (the fix is in the code), anything already in CLAUDE.md.

Session Memory

During a conversation, transient state is held in process memory:

  • Conversation history — all messages exchanged
  • Tool results — outputs from tool executions
  • Task state — background tasks and their status
  • Compaction summaries — condensed history from compaction passes

Session memory is lost when the conversation ends, unless explicitly saved to auto-memory.

Session Memory Sidecar vs Durable Memory

The source splits "memory" into at least three different mechanisms:

| Mechanism | Scope | How It Works |
| --- | --- | --- |
| Session transcript state | Current conversation only | In-memory messages, tool results, tasks, and compaction summaries |
| Session memory file | Current conversation only | A background forked agent periodically refreshes a markdown sidecar file after token and tool-call thresholds are met |
| Durable auto-memory | Across future sessions | extractMemories runs at the end of a completed loop and writes reusable memory files under the auto-memory directory |

The important distinction is that session memory is a rolling sidecar for the current chat, while extracted memories are durable files that can be injected into later prompt assembly.

Cross-Environment and Shared Memory Services

Several services change which settings and memory files exist on disk before prompt assembly runs:

| Service | Scope | Source-Backed Behavior |
| --- | --- | --- |
| Settings Sync | Per user across environments | Interactive CLI uploads changed settings and memory entries; CCR downloads them before plugin setup |
| Remote Managed Settings | Enterprise / managed accounts | Eligible users get remote settings loading, checksum validation, and background polling |
| Team Memory Sync | Per repo across org members | Pull overwrites local files, push uploads deltas, and deletions do not propagate |

These are not separate prompt layers. They are upstream file and settings services that change what the later prompt-assembly layers read.
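
The Team Memory Sync semantics (pull overwrites, push uploads deltas, deletions do not propagate) can be illustrated with a small model. This is a sketch, not the code in src/services/teamMemorySync/: `MemoryFiles`, `pullMemory`, and `pushDelta` are hypothetical names, and keeping local-only files on pull is an assumption.

```typescript
// Hypothetical model: a memory directory as a map from file path to file content.
type MemoryFiles = Record<string, string>;

// Pull: remote content overwrites local files with the same path.
// Assumption: files that exist only locally are left in place.
function pullMemory(local: MemoryFiles, remote: MemoryFiles): MemoryFiles {
  return { ...local, ...remote };
}

// Push: upload only entries that are new or changed relative to the remote.
// A file that is missing locally produces no entry, so deletions never propagate.
function pushDelta(local: MemoryFiles, remote: MemoryFiles): MemoryFiles {
  const delta: MemoryFiles = {};
  for (const path of Object.keys(local)) {
    if (remote[path] !== local[path]) {
      delta[path] = local[path];
    }
  }
  return delta;
}
```

In this model, a teammate who deletes a shared memory file locally will simply get it back on the next pull, which matches the "deletions do not propagate" behavior in the table.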

Context Window & Compaction

Every model has a fixed context window (e.g., 200K tokens). The system prompt alone consumes ~5%. As conversation grows, the engine must keep everything within budget:

context_window: 200,000 tokens

| Segment | Contents |
| --- | --- |
| ~10K | System Prompt (~5%) |
| Conversation History | Messages + Tool Results |
| Reserve | Reserved for Response |
| Available | Available Space |
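
The budget arithmetic can be worked through in a few lines. The 200,000-token window and ~10,000-token system prompt come from the text above; the 16,000-token response reserve and the 8,000-token history are illustrative assumptions, and `ContextBudget` and `computeBudget` are hypothetical names.

```typescript
// Hypothetical budget calculation for a fixed context window.
type ContextBudget = {
  systemPrompt: number;
  history: number;
  responseReserve: number;
  available: number;
};

function computeBudget(
  windowTokens: number,
  systemPromptTokens: number,
  historyTokens: number,
  responseReserve: number,
): ContextBudget {
  const available = windowTokens - systemPromptTokens - historyTokens - responseReserve;
  return {
    systemPrompt: systemPromptTokens,
    history: historyTokens,
    responseReserve,
    // Clamp at zero: a negative value would mean the window is already over budget.
    available: Math.max(0, available),
  };
}

// 200000 - 10000 - 8000 - 16000 = 166000 tokens still available.
const budget = computeBudget(200000, 10000, 8000, 16000);

// The system prompt share of the window: 10000 / 200000 = 0.05, i.e. ~5%.
const systemPromptShare = budget.systemPrompt / 200000;
```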

When conversation history exceeds the available space, compaction kicks in: older messages are summarized to free up room. The transcript below shows a conversation before compaction:

Before compaction:

You: Set up the auth module with OAuth2
Claude: I'll create the auth module. Let me read the config first.
Read: src/config.ts → 85 lines
Write: src/auth/oauth.ts → created (142 lines)
Write: src/auth/middleware.ts → created (67 lines)
Edit: src/app.ts → added auth middleware
Bash: npm test → 12 passed, 0 failed
Claude: Auth module created with OAuth2 support. Tests pass.
You: Now add Google and GitHub as providers
Claude: I'll add both providers. Reading the current oauth module...
Read: src/auth/oauth.ts → 142 lines
Edit: src/auth/oauth.ts → added Google provider
Edit: src/auth/oauth.ts → added GitHub provider
Bash: npm test → 18 passed, 0 failed
Claude: Added Google and GitHub OAuth providers. All tests pass.
↑ 15 messages consuming ~8,000 tokens ↑

| Compaction tier | When | What it does |
| --- | --- | --- |
| Micro | Every turn | Strip empties |
| Auto | Approaching limit | Summarize old |
| Full | At the limit | Aggressive summary |
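
One way to picture the three compaction tiers is as a choice driven by how full the budget is. The sketch below is an assumption-heavy illustration: the 0.8 "approaching limit" threshold, the reading of "strip empties" as dropping messages with empty content, and the names `chooseCompaction`, `microCompact`, and `ChatMessage` are all hypothetical, not taken from src/services/compact/.

```typescript
type CompactionTier = "micro" | "auto" | "full";

// Pick a tier from the ratio of used tokens to the budget.
// autoThreshold is a made-up default; the source's real threshold is not stated here.
function chooseCompaction(
  usedTokens: number,
  budgetTokens: number,
  autoThreshold: number = 0.8,
): CompactionTier {
  const ratio = usedTokens / budgetTokens;
  if (ratio >= 1) return "full"; // at the limit: aggressive summary
  if (ratio >= autoThreshold) return "auto"; // approaching the limit: summarize old messages
  return "micro"; // otherwise only the cheap per-turn pass runs
}

type ChatMessage = { role: "user" | "assistant" | "tool"; content: string };

// Micro pass, under the assumption that "strip empties" means dropping empty messages.
function microCompact(messages: ChatMessage[]): ChatMessage[] {
  return messages.filter((m) => m.content.trim() !== "");
}
```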

What's Preserved

Even during aggressive compaction, the engine always keeps:

  • Your most recent messages (verbatim)
  • Tool results that are still being referenced
  • Key decisions and explicit user instructions

How Other Frameworks Compare

Claude Code's 7-layer dynamic assembly is unusually comprehensive:

| Framework | Assembly | Tool Injection | Custom Context | Extensibility |
| --- | --- | --- | --- | --- |
| Claude Code | 7-layer dynamic | Auto | File-based (CLAUDE.md) | Plugins + Skills |
| Google ADK | 4-layer | Manual | Template variables | GlobalInstructionPlugin |
| OpenAI Agents | String, callable, or prompt object | Manual | Prompt/context objects | Prompt objects + dynamic prompt functions |
| LangChain | Middleware composition | Manual | Via middleware | Middleware stacking |
| LangGraph | String or callable | Manual | None built-in | Pre/post hooks |

What each framework does differently

Google ADK is the closest: it combines global instructions with static and dynamic agent instructions, and applies template/state substitution during assembly. It has no automatic tool injection or file discovery.

OpenAI Agents SDK: instructions can be a string or callable, and agents can also use prompt objects or dynamic prompt functions. This is still much flatter than Claude Code's layered assembly, and there is no auto-injection.

LangChain: a middleware approach via wrap_model_call. It is flexible, but nothing is built in.

LangGraph: a single string or callable prompt. The focus is on graph structure, not prompt composition.

SDK Mode

The prompt assembly also powers headless mode (QueryEngine.ts) for the Claude Agent SDK and sub-agents:

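// TypeScript (headless engine)
const engine = new QueryEngine({ model, tools, systemPrompt })
const result = await engine.query(messages)

# Python equivalent
import asyncio
from claude_agent_sdk import query

async def main():
    # query() yields messages as the agent works
    async for message in query(prompt="Fix the bug"):
        print(message)

asyncio.run(main())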

Magic Docs

Magic Docs is a narrow background maintenance system for markdown files that opt in with a # MAGIC DOC: header.

1. User or agent reads a markdown file
2. Detect # MAGIC DOC: header + optional italic instructions
3. Register file as a tracked Magic Doc
4. Post-sampling hook runs after later turns
5. Forked update agent edits only that doc

The implementation in src/services/MagicDocs/magicDocs.ts is intentionally narrow:

| Behavior | Source-Backed Detail |
| --- | --- |
| Opt-in format | The first line must match # MAGIC DOC: ...; the next line may contain italicized instructions |
| Registration | The file is tracked only after it has been read during the session |
| Update agent | Uses a dedicated magic-docs agent definition with model sonnet |
| Tool permissions | The forked agent is restricted to Edit, and only for that exact markdown path |
| Untracking | If the doc is deleted, unreadable, or the header is removed, it is dropped from tracking |
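
The opt-in check might look something like the sketch below. The regular expressions and the names `MagicDocHeader` and `detectMagicDoc` are assumptions for illustration; the actual patterns in src/services/MagicDocs/magicDocs.ts are not shown here and may be stricter or looser.

```typescript
// Hypothetical header detection for the "# MAGIC DOC:" opt-in format.
type MagicDocHeader = { title: string; instructions: string | null };

function detectMagicDoc(content: string): MagicDocHeader | null {
  // Only the first two lines matter for detection.
  const [first = "", second = ""] = content.split("\n");

  const header = /^# MAGIC DOC:\s*(.+)$/.exec(first);
  if (header === null) return null; // not opted in, or the header was removed

  // Optional instructions: a line wrapped in matching * or _ markers.
  const italic = /^([*_])(.+)\1$/.exec(second.trim());
  return {
    title: header[1].trim(),
    instructions: italic ? italic[2].trim() : null,
  };
}
```

Running the same check each time the file is re-read would also cover the untracking case: once the header is gone, the function returns null and the doc can be dropped.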

Magic Docs sits closer to memory maintenance than to normal editing. It is a post-sampling background hook that tries to keep long-lived project notes current as the conversation uncovers new information.

Dream (Memory Consolidation)

Dream is a background process that periodically reviews recent session transcripts and consolidates learnings into the memory directory — like the system "sleeping" to organize what it's learned.

Gate Sequence

Dream runs as a forked sub-agent, but only after passing a series of gates (cheapest checks first):

1. Pre-checks: not Kairos, not remote, auto-memory enabled
2. isAutoDreamEnabled(): settings.json or GrowthBook
3. Time gate: 24+ hours since last consolidation
4. Scan throttle: 10+ minutes since last session scan
5. Session gate: 5+ sessions since last (excluding current)
6. Lock: no other process consolidating (PID-verified)
7. Fire: spawn forked agent with consolidation prompt

src/services/autoDream/autoDream.ts (simplified)
const DEFAULTS = {
  minHours: 24, // Hours between consolidation runs
  minSessions: 5, // Minimum sessions to trigger
}
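
The gate sequence can be sketched as a chain of early returns, cheapest check first. This is an illustrative model, not the code in autoDream.ts: `DreamState`, `firstFailedGate`, and the gate labels are hypothetical, and the inputs are assumed to have been computed elsewhere. The thresholds (24 hours, 10 minutes, 5 sessions) are the ones listed above.

```typescript
// Thresholds taken from the gate list and DEFAULTS snippet above.
const GATE_DEFAULTS = { minHours: 24, minSessions: 5, scanThrottleMinutes: 10 };

// Hypothetical snapshot of everything the gates need to look at.
type DreamState = {
  preChecksPass: boolean; // not Kairos, not remote, auto-memory enabled
  enabled: boolean; // result of isAutoDreamEnabled()
  hoursSinceLastRun: number;
  minutesSinceLastScan: number;
  sessionsSinceLastRun: number; // excluding the current session
  lockHeldByOther: boolean;
};

// Returns the name of the first gate that blocks the run, or null if Dream may fire.
function firstFailedGate(s: DreamState): string | null {
  if (!s.preChecksPass) return "pre-checks";
  if (!s.enabled) return "disabled";
  if (s.hoursSinceLastRun < GATE_DEFAULTS.minHours) return "time";
  if (s.minutesSinceLastScan < GATE_DEFAULTS.scanThrottleMinutes) return "scan-throttle";
  if (s.sessionsSinceLastRun < GATE_DEFAULTS.minSessions) return "sessions";
  if (s.lockHeldByOther) return "lock";
  return null;
}
```

Ordering the checks this way means the common case (a run happened recently) exits after a couple of comparisons, before any session scanning or lock inspection would be needed.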

Consolidation Phases

The forked agent receives a structured prompt with 4 phases:

| Phase | Action | Guidance |
| --- | --- | --- |
| Phase 1: Orient | Read existing memories and index | Understand current state before changing anything |
| Phase 2: Gather | Check daily logs → drifted memories → narrow transcript grep | Not exhaustive; look only for what matters |
| Phase 3: Consolidate | Write or update memory files | Merge into existing topics, fix contradictions, convert relative dates |
| Phase 4: Prune | Update index, remove stale entries, enforce size limits | Keep index under ~25KB, one line per entry |

The agent's Bash access is restricted to read-only commands (ls, find, grep, cat, stat, wc, head, tail). It can read and write memory files but cannot execute arbitrary commands.

Lock Coordination

A file-based lock (.consolidate-lock in the memory directory) prevents multiple processes from consolidating simultaneously:

| Property | Value |
| --- | --- |
| Lock file mtime | Timestamp of last consolidation |
| Lock file body | PID of the holder process |
| Stale detection | Dead PID or lock >1 hour old → reclaim |
| Failure rollback | Rewind mtime so time gate passes again |
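
The stale-detection rule from the table can be expressed as a small predicate. This is a sketch under assumptions: `ConsolidateLock` and `canReclaimLock` are hypothetical names, and the PID liveness check is passed in as a function so the example stays independent of any particular OS API.

```typescript
// One hour, per the "lock >1 hour old" rule in the table above.
const STALE_AFTER_MS = 60 * 60 * 1000;

// Hypothetical view of the lock file: body holds the PID, mtime holds the timestamp.
type ConsolidateLock = { holderPid: number; mtimeMs: number };

// A lock can be reclaimed when its holder is dead or the lock is older than one hour.
function canReclaimLock(
  lock: ConsolidateLock,
  nowMs: number,
  isPidAlive: (pid: number) => boolean,
): boolean {
  const tooOld = nowMs - lock.mtimeMs > STALE_AFTER_MS;
  return tooOld || !isPidAlive(lock.holderPid);
}
```

Checking age as well as liveness guards against PID reuse: a long-dead holder whose PID was recycled by an unrelated process would otherwise look alive forever.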

Configuration

| Setting | Where | Default |
| --- | --- | --- |
| autoDreamEnabled | settings.json | Follows GrowthBook |
| tengu_onyx_plover.enabled | GrowthBook | false |
| tengu_onyx_plover.minHours | GrowthBook | 24 |
| tengu_onyx_plover.minSessions | GrowthBook | 5 |
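
One plausible reading of "Follows GrowthBook" is that an explicit value in settings.json wins and the feature flag is the fallback. The sketch below encodes that reading as an assumption; the function name `isAutoDreamEnabled` appears in the gate list above, but its signature and the `DreamSettings` and `DreamFlag` types here are hypothetical.

```typescript
// Hypothetical shapes for the two configuration sources.
type DreamSettings = { autoDreamEnabled?: boolean }; // from settings.json
type DreamFlag = { enabled?: boolean; minHours?: number; minSessions?: number }; // from GrowthBook

// Assumption: an explicit settings.json value overrides the GrowthBook flag.
function isAutoDreamEnabled(settings: DreamSettings, flag: DreamFlag): boolean {
  if (settings.autoDreamEnabled !== undefined) {
    return settings.autoDreamEnabled;
  }
  // No local setting: follow GrowthBook, which defaults to false.
  return flag.enabled ?? false;
}

// Thresholds fall back to the documented defaults when the flag omits them.
function resolveThresholds(flag: DreamFlag): { minHours: number; minSessions: number } {
  return {
    minHours: flag.minHours ?? 24,
    minSessions: flag.minSessions ?? 5,
  };
}
```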

Dream creates a DreamTask (task type dream) that appears in the background tasks list, showing progress as the agent works through the phases. Users can also trigger consolidation manually via the /dream slash command, which stamps the lock file but runs with normal (non-restricted) permissions.

Key Source Files

| File | Purpose |
| --- | --- |
| src/query.ts | System prompt assembly + query loop (~1,900 lines) |
| src/QueryEngine.ts | Headless query execution for SDK/sub-agents |
| src/constants/prompts.ts | Base system prompt templates |
| src/utils/claude-md/ | CLAUDE.md loading and merging |
| src/memdir/ | Memory directory management |
| src/services/SessionMemory/ | Session-memory sidecar file generation |
| src/services/extractMemories/ | Durable memory extraction at end of completed turns |
| src/services/teamMemorySync/ | Shared repo-level memory synchronization |
| src/services/settingsSync/ | Cross-environment settings and memory sync |
| src/services/MagicDocs/ | Background maintenance for # MAGIC DOC: markdown files |
| src/services/compact/ | Compaction algorithms |
| src/services/autoDream/ | Dream/auto-consolidation system |
| src/tasks/DreamTask/ | Dream task tracking |