The highest-leverage AI skill. Not prompt engineering. Context engineering. Learn to decide what to include, what to cut, and where to place it so the model actually produces what you need.
What you'll be able to do
Engineer context for any AI task: decide what to include, what to cut, and where to place it
Diagnose hallucination as a context problem and fix it with specific techniques
Build XML-structured context files that make every AI interaction in your work domain sharper, faster, and more accurate
The concept
Every time you open a blank chat, the AI knows nothing about you.
Not your industry, not your clients, not what "good" looks like for the work you do. You're a stranger talking to someone with amnesia. So you re-explain everything. Every single time.
That's not an AI problem. That's a context problem. And it's the single biggest reason most people get mediocre output from capable models.
Andrej Karpathy put it this way: context engineering is "the delicate art and science of filling the context window with just the right information for the next step." The industry adopted the term. It describes the most valuable skill in this entire course.
The Window Is a Budget, Not a Warehouse
The context window is everything the model can see at once. System prompt, conversation history, attached files, your current message. All of it competes for the same fixed pool of tokens.
Here's what those pools look like right now:
Claude gives you about 200,000 tokens. GPT sits around 400,000. Gemini offers roughly 1 million. Other models claim even more, but the numbers are mostly theoretical at the high end.
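Token budgets are easy to misjudge from word counts alone. Here's a minimal sketch of estimating how much of a window your context consumes, assuming the rough 4-characters-per-token heuristic for English text (real tokenizers vary by model, so treat the numbers as estimates, not specs):

```python
def estimate_tokens(text: str) -> int:
    # ~4 characters per token is a rough English-text heuristic;
    # actual tokenizers differ per model.
    return max(1, len(text) // 4)

def budget_report(chunks: dict[str, str], window: int = 200_000) -> dict[str, float]:
    """Percent of the context window each chunk consumes."""
    return {name: 100 * estimate_tokens(text) / window
            for name, text in chunks.items()}

# Everything below competes for the same fixed pool of tokens.
context = {
    "system_prompt": "You are a senior analyst. " * 40,
    "history": "User: ...\nAssistant: ...\n" * 500,
    "attached_file": "Q3 revenue by region: ... " * 2_000,
    "current_message": "Summarize the attached report for the board.",
}
print(budget_report(context))
```

Run it on a real draft and you'll usually find one chunk (almost always history or attachments) dwarfing the rest.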
Numbers keep climbing. And none of it matters as much as you think.
Here's why. Researchers coined a term for what happens when you fill a large context window: context rot. Accuracy degrades as token count grows, and the degradation hits specific zones of the prompt harder than others. You already know the mechanism from Module 3: the attention experiments you ran showed the "lost in the middle" problem firsthand. Content at the beginning and end of your prompt gets 85-95% recall. Content buried in the middle? 76-82%.
So the race for bigger windows is mostly marketing. The real question has never been "how much fits?" It's: what's the minimum set of high-signal context that gets you the best output?
That's context engineering.
Not All Context Is Equal
The model prioritizes what it sees in a specific order:
System prompt carries the most weight. This is the persistent instruction layer, the reason Claude Projects and Custom GPTs work. Then conversation history, with recent turns weighted more heavily than older ones. Then attached files and retrieved documents. Then your current message.
You're using a Project for this course. So this isn't abstract.
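In API terms, those layers map onto the message array most chat APIs accept. A minimal sketch, assuming the common OpenAI-style role convention (your provider's exact schema may differ):

```python
# The context window, assembled in priority order: system prompt first,
# then conversation history (oldest to newest), then the current message.
messages = [
    {"role": "system",
     "content": "You are a B2B proposal writer. Follow the <constraints> below."},
    {"role": "user",
     "content": "Here's last quarter's proposal for reference..."},
    {"role": "assistant",
     "content": "Got it. I'll match that structure and tone."},
    {"role": "user",
     "content": "Draft the executive summary for the renewal."},
]

# Project files and persistent instructions live in the system layer,
# so they survive across turns instead of scrolling away with history.
```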
XML, JSON, or Markdown: Pick for the Reader, Not for Yourself
You'll build a context file in this module. Before you do, you need to know which format to use. Most people pick whichever they're comfortable writing. That's backwards. Pick based on who's reading the file.
For the model
Tags like <constraints> and <voice> create hard parsing boundaries that the model treats as structural signals, not just visual decoration. You proved this in Module 3: XML tags act as attention boundaries. When the model sees content inside a named tag, it processes that content as a distinct category.
For context files, system prompts, and skill definitions (anything the model is the primary consumer of), XML is the strongest choice. That's why it's the default in this course.
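A context file built this way might look like the sketch below. The `<constraints>` and `<voice>` tags come from the discussion above; every other tag name and value is an illustrative placeholder, not a required schema:

```xml
<context>
  <role>Senior proposal writer for a B2B logistics consultancy</role>
  <audience>Operations VPs at mid-market manufacturers</audience>
  <voice>Direct and concrete. Short sentences. No buzzwords.</voice>
  <constraints>
    <item>Never promise delivery dates without a caveat</item>
    <item>Cite the client's own numbers, not industry averages</item>
  </constraints>
</context>
```

Each named tag gives the model a hard boundary: everything inside `<constraints>` is processed as a constraint, not as background flavor.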
For structured data
Key-value pairs, strict syntax, zero ambiguity about where a field starts and ends. The model parses JSON with near-perfect accuracy because its training data is saturated with it. Use JSON when the content is structured data that rarely gets hand-edited: style profiles, configuration objects, parameter sets, or structured outputs you feed back into other prompts.
But don't try to write a context file in JSON. You'll spend more time fighting curly braces and escaped quotes than thinking about content.
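For the cases where JSON does fit, a style profile might look like this sketch (field names and values are illustrative, not a standard):

```json
{
  "style_profile": {
    "sentence_length": "short",
    "jargon": "avoid",
    "formality": 0.4,
    "banned_phrases": ["synergy", "circle back"]
  }
}
```

Notice you'd almost never hand-edit this mid-conversation; you'd generate it once and feed it back into other prompts.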
For you
Headers and bullets are visual hierarchy for human eyes. The model reads them, but they don't create the same hard attention boundaries that XML does. Use Markdown when humans are the primary audience and the model just happens to read it too.
That's why my-knowledge.md is Markdown: you read that file more than the model does.
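By way of contrast, a human-first knowledge file stays readable as plain Markdown (headings and entries here are illustrative):

```markdown
# My Knowledge — Module 4

## What clicked
- Context rot: more tokens can mean worse recall, especially mid-prompt

## Still shaky
- When to reach for JSON instead of XML for reusable profiles
```

You can skim that in five seconds. The model reads it fine too; it just doesn't get the hard attention boundaries XML would give it.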
Here's the decision in one line: if the model is the main reader, XML. If you are, Markdown. If neither of you needs to "read" it but both need to reference structured data, JSON.
Hallucination Isn't Random
Most people think hallucination is the AI being stupid or making things up. Neither is accurate. The model predicts the most probable next token given its context. When context is missing, "most probable" and "actually true" stop meaning the same thing.
Two pieces of research back this up. Anthropic's 2025 interpretability work found circuits inside Claude that suppress responses when information is insufficient. Hallucinations happen when suppression misfires: the model recognizes a pattern but lacks the real data, so it generates something plausible and serves it with full confidence. OpenAI's 2025 research showed the other side: training benchmarks reward confident guesses over "I don't know," so models learned to always guess.
The fix isn't hoping for smarter models. When you give the model the right context for the task, hallucination rates collapse. RAG-based systems cut hallucinations by roughly 70%. The best models now hit sub-1% hallucination rates on factual tasks when grounded in retrieved documents. The fix is engineering better context.
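One concrete way to apply this: give the model the source material and an explicit out when the source is silent. A hedged sketch of a grounded prompt (the wording and data are illustrative):

```xml
<source>
  Q3 churn: 4.1%. Q2 churn: 5.3%. (From the attached retention report.)
</source>
<task>
  Answer only from <source>. If the answer is not in <source>,
  reply "not in the provided data" instead of guessing.
</task>
```

The `<source>` block supplies the real data, and the explicit escape hatch gives "I don't know" a higher probability than a confident fabrication.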
Three Decisions, Every Time
Context engineering comes down to three choices you make before you hit enter:
What to include. What to exclude. Where to place it.
More isn't better. Context rot proved that. The goal is the smallest set of high-signal tokens that produces the output you need. And the experiments in this module's prompt will make that real in a way that reading about it can't.
The prompt
For shortcut seekers: paste the prompt and go. Phase 1 re-establishes everything you need. For deep investors: reading the concept section first means you'll predict what happens in each experiment before it happens, which hits different when your prediction lands (or doesn't).
Before you paste
Make sure you're in your course Project
Your my-context.md, my-learning-style.md, and my-knowledge.md files should be attached to the project
Have a real work task ready: something you've used AI for and gotten mediocre results on. A client deliverable, a report format, a recurring task. Not a hypothetical. The thing on your desk right now.
After the exercise, run the Learning Extraction Prompt to update your knowledge file and save your session notes
What happens in each phase
PHASE 1 Context Bridge: map the gap between what you know about your task and what the AI actually sees
PHASE 2 Context Window Experiment: same task, three runs (bare, overloaded, curated). You'll watch context rot happen live
PHASE 3 Hallucination Diagnosis: trigger a hallucination on purpose, then fix it with context, then test the boundary
PHASE 4 Hierarchy & Placement: move a critical instruction to 3 positions, test RAG retrieval in your Project files
PHASE 5 Build Your Context File: 8-section XML document for your real work domain, tested against your task from Phase 1
After this module
Run the Learning Extraction Prompt in the same conversation to update my-knowledge.md with what you learned.
Save to module-04-context-engineering/
context-file.xml - your XML-structured context file for your work domain (all 8 sections)
hallucination-experiment.md - the fabricated output, the context you added, the corrected output
session-notes.md - from the Learning Extraction Prompt
Learning Extraction Prompt
After completing the module prompt above, paste this into the same conversation. The AI reviews everything that just happened and extracts what you actually learned: not what was presented, but what you demonstrated.