The highest-leverage AI skill. Not prompt engineering. Context engineering. Learn to decide what to include, what to cut, and where to place it so the model actually produces what you need.
What you'll be able to do
Engineer context for any AI task: decide what to include, what to cut, and where to place it
Diagnose hallucination as a context problem and fix it with specific techniques
Build XML-structured context files that make every AI interaction in your work domain sharper, faster, and more accurate
The concept
Every time you open a blank chat, the AI knows nothing about you.
Not your industry, not your clients, not what "good" looks like for the work you do. You're a stranger talking to someone with amnesia. So you re-explain everything. Every single time.
That's not an AI problem. That's a context problem. And it's the single biggest reason most people get mediocre output from capable models.
Andrej Karpathy put it this way: context engineering is "the delicate art and science of filling the context window with just the right information for the next step." The industry adopted the term. It describes the most valuable skill in this entire course.
The Window Is a Budget, Not a Warehouse
The context window is everything the model can see at once. System prompt, conversation history, attached files, your current message. All of it competes for the same fixed pool of tokens.
Here's what those pools look like right now:
Claude gives you about 200,000 tokens. GPT sits around 400,000. Gemini offers roughly 1 million. Other models claim even more, but the numbers are mostly theoretical at the high end.
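Token budgets are easy to misjudge from word counts alone. Here's a minimal sketch of estimating how much of a window your context consumes, assuming the rough 4-characters-per-token heuristic for English text (real tokenizers vary by model, so treat the numbers as estimates, not specs):

```python
def estimate_tokens(text: str) -> int:
    # ~4 characters per token is a rough English-text heuristic;
    # actual tokenizers differ per model.
    return max(1, len(text) // 4)

def budget_report(chunks: dict[str, str], window: int = 200_000) -> dict[str, float]:
    """Percent of the context window each chunk consumes."""
    return {name: 100 * estimate_tokens(text) / window
            for name, text in chunks.items()}

# Everything below competes for the same fixed pool of tokens.
context = {
    "system_prompt": "You are a senior analyst. " * 40,
    "history": "User: ...\nAssistant: ...\n" * 500,
    "attached_file": "Q3 revenue by region: ... " * 2_000,
    "current_message": "Summarize the attached report for the board.",
}
print(budget_report(context))
```

Run it on a real draft and you'll usually find one chunk (almost always history or attachments) dwarfing the rest.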
Numbers keep climbing. And none of it matters as much as you think.
Here's why. Researchers coined a term for what happens when you fill a large context window: context rot. Accuracy degrades as token count grows, and the degradation hits specific zones of the prompt harder than others. You already know the mechanism from Module 3: the attention experiments you ran showed the "lost in the middle" problem firsthand. Content at the beginning and end of your prompt gets 85-95% recall. Content buried in the middle? 76-82%.
So the race for bigger windows is mostly marketing. The real question has never been "how much fits?" It's: what's the minimum set of high-signal context that gets you the best output?
That's context engineering.
Not All Context Is Equal
The model prioritizes what it sees in a specific order:
System prompt carries the most weight. This is the persistent instruction layer, the reason Claude Projects and Custom GPTs work. Then conversation history, with recent turns weighted more heavily than older ones. Then attached files and retrieved documents. Then your current message.
You're using a Project for this course. So this isn't abstract.
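In API terms, those layers map onto the message array most chat APIs accept. A minimal sketch, assuming the common OpenAI-style role convention (your provider's exact schema may differ):

```python
# The context window, assembled in priority order: system prompt first,
# then conversation history (oldest to newest), then the current message.
messages = [
    {"role": "system",
     "content": "You are a B2B proposal writer. Follow the <constraints> below."},
    {"role": "user",
     "content": "Here's last quarter's proposal for reference..."},
    {"role": "assistant",
     "content": "Got it. I'll match that structure and tone."},
    {"role": "user",
     "content": "Draft the executive summary for the renewal."},
]

# Project files and persistent instructions live in the system layer,
# so they survive across turns instead of scrolling away with history.
```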
XML, JSON, or Markdown: Pick for the Reader, Not for Yourself
You'll build a context file in this module. Before you do, you need to know which format to use. Most people pick whichever they're comfortable writing. That's backwards. Pick based on who's reading the file.
For the model
Tags like <constraints> and <voice> create hard parsing boundaries that the model treats as structural signals, not just visual decoration. You proved this in Module 3: XML tags act as attention boundaries. When the model sees content inside a named tag, it processes that content as a distinct category.
For context files, system prompts, and skill definitions (anything the model is the primary consumer of), XML is the strongest choice. That's why it's the default in this course.
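A context file built this way might look like the sketch below. The `<constraints>` and `<voice>` tags come from the discussion above; every other tag name and value is an illustrative placeholder, not a required schema:

```xml
<context>
  <role>Senior proposal writer for a B2B logistics consultancy</role>
  <audience>Operations VPs at mid-market manufacturers</audience>
  <voice>Direct and concrete. Short sentences. No buzzwords.</voice>
  <constraints>
    <item>Never promise delivery dates without a caveat</item>
    <item>Cite the client's own numbers, not industry averages</item>
  </constraints>
</context>
```

Each named tag gives the model a hard boundary: everything inside `<constraints>` is processed as a constraint, not as background flavor.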
For structured data
Key-value pairs, strict syntax, zero ambiguity about where a field starts and ends. The model parses JSON with near-perfect accuracy because its training data is saturated with it. Use JSON when the content is structured data that rarely gets hand-edited: style profiles, configuration objects, parameter sets, or structured outputs you feed back into other prompts.
But don't try to write a context file in JSON. You'll spend more time fighting curly braces and escaped quotes than thinking about content.
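For the cases where JSON does fit, a style profile might look like this sketch (field names and values are illustrative, not a standard):

```json
{
  "style_profile": {
    "sentence_length": "short",
    "jargon": "avoid",
    "formality": 0.4,
    "banned_phrases": ["synergy", "circle back"]
  }
}
```

Notice you'd almost never hand-edit this mid-conversation; you'd generate it once and feed it back into other prompts.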
For you
Headers and bullets are visual hierarchy for human eyes. The model reads them, but they don't create the same hard attention boundaries that XML does. Use Markdown when humans are the primary audience and the model just happens to read it too.
That's why my-knowledge.md is Markdown: you read that file more than the model does.
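By way of contrast, a human-first knowledge file stays readable as plain Markdown (headings and entries here are illustrative):

```markdown
# My Knowledge — Module 4

## What clicked
- Context rot: more tokens can mean worse recall, especially mid-prompt

## Still shaky
- When to reach for JSON instead of XML for reusable profiles
```

You can skim that in five seconds. The model reads it fine too; it just doesn't get the hard attention boundaries XML would give it.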
Here's the decision in one line: if the model is the main reader, XML. If you are, Markdown. If neither of you needs to "read" it but both need to reference structured data, JSON.
Hallucination Isn't Random
Most people think hallucination is the AI being stupid or making things up. Neither is accurate. The model predicts the most probable next token given its context. When context is missing, "most probable" and "actually true" stop meaning the same thing.
Two pieces of research back this up. Anthropic's 2025 interpretability work found circuits inside Claude that suppress responses when information is insufficient. Hallucinations happen when suppression misfires: the model recognizes a pattern but lacks the real data, so it generates something plausible and serves it with full confidence. OpenAI's 2025 research showed the other side: training benchmarks reward confident guesses over "I don't know," so models learned to always guess.
The fix isn't hoping for smarter models. When you give the model the right context for the task, hallucination rates collapse. RAG-based systems cut hallucinations by roughly 70%. The best models now hit sub-1% hallucination rates on factual tasks when grounded in retrieved documents. The fix is engineering better context.
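One concrete way to apply this: give the model the source material and an explicit out when the source is silent. A hedged sketch of a grounded prompt (the wording and data are illustrative):

```xml
<source>
  Q3 churn: 4.1%. Q2 churn: 5.3%. (From the attached retention report.)
</source>
<task>
  Answer only from <source>. If the answer is not in <source>,
  reply "not in the provided data" instead of guessing.
</task>
```

The `<source>` block supplies the real data, and the explicit escape hatch gives "I don't know" a higher probability than a confident fabrication.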
Three Decisions, Every Time
Context engineering comes down to three choices you make before you hit enter:
What to include. What to exclude. Where to place it.
More isn't better. Context rot proved that. The goal is the smallest set of high-signal tokens that produces the output you need. And the experiments in this module's prompt will make that real in a way that reading about it can't.
The prompt
For shortcut seekers: paste the prompt and go. Phase 1 re-establishes everything you need. For deep investors: reading the concept section first means you'll predict what happens in each experiment before it happens, which hits different when your prediction lands (or doesn't).
Before you paste
Make sure you're in your course Project
Your my-context.md, my-learning-style.md, and my-knowledge.md files should be attached to the project
Have a real work task ready: something you've used AI for and gotten mediocre results on. A client deliverable, a report format, a recurring task. Not a hypothetical. The thing on your desk right now.
After the exercise, run the Learning Extraction Prompt to update your knowledge file and save your session notes
What happens in each phase
PHASE 1 Context Bridge: map the gap between what you know about your task and what the AI actually sees
PHASE 2 Context Window Experiment: same task, three runs (bare, overloaded, curated). You'll watch context rot happen live
PHASE 3 Hallucination Diagnosis: trigger a hallucination on purpose, then fix it with context, then test the boundary
PHASE 4 Hierarchy & Placement: move a critical instruction to 3 positions, test RAG retrieval in your Project files
PHASE 5 Build Your Context File: 8-section XML document for your real work domain, tested against your task from Phase 1
After this module
Run the Learning Extraction Prompt in the same conversation to update my-knowledge.md with what you learned.
Save to module-04-context-engineering/
context-file.xml - your XML-structured context file for your work domain (all 8 sections)
hallucination-experiment.md - the fabricated output, the context you added, the corrected output
session-notes.md - from the Learning Extraction Prompt
Learning Extraction Prompt
After completing the module prompt above, paste this into the same conversation. The AI reviews everything that just happened and extracts what you actually learned: not what was presented, but what you demonstrated.