Lucía is a lawyer at a mid-size firm. Administrative law. Dense appeals, case files that arrive in chunks, clients who want to fight every point. One morning she gets a heavy case: 60 pages of administrative documentation, previous resolutions, technical reports, and an email thread where no two messages tell the same story.
She opens a Claude conversation and does what seems logical: pastes all the documentation at the top and starts working. She sets the tone — precise, sober, no embellishment. She defines the legal strategy. She marks which arguments are central and which are secondary. For almost an hour, the conversation flows. The responses are good. The proposals make sense.
At message number 26, something shifts. Claude starts using expressions she had explicitly forbidden at the start. It reframes a core argument from a different angle. In one response, it even seems to contradict a clear instruction Lucía gave early on. Her first reaction: the AI has become inconsistent. Something broke.
Nothing broke. The conversation simply no longer fit on the table.
What the context window actually is (and why it forgets)
The context window in Claude is like a desk. On it you can fit papers, folders, sticky notes, your laptop, maybe a coffee. As long as everything fits and stays visible, you can work with clarity. You know what matters, what’s pending, and what’s already been decided.
Now imagine you keep adding documents without removing anything. Eventually the desk is still the same size, but the material no longer fits. You start covering things up. The last thing you placed sits on top. The first thing you put down disappears under a pile.
Every Claude conversation has a context window: a limited space that holds everything the model is considering when it responds to you. It’s not long-term memory. It’s working memory. While the conversation fits inside the window, Claude remembers instructions, tone, decisions, and relevant data. When it fills up, there’s no alarm. Claude simply starts prioritizing what’s most recent.
It doesn’t “forget” all at once. It stops seeing what got buried.
Tokens, explained without the drama
To understand the size of that desk, there’s a word that sounds more intimidating than it is: token. A token is not exactly a word and not exactly a letter. It’s a small unit of text, often a short word or a fragment of a longer one. For practical purposes:
- A written page of text usually runs around 500 tokens.
- An 80-page PDF can take up roughly 40,000 tokens.
You don’t need to count them or optimize them like calories. Just understand that they exist and that the space is finite. Window size depends on the model and version you use, and it changes over time, so check the current documentation rather than assuming one model tier always holds more than another. Whatever the size, no model remembers “everything forever” within a single conversation.
When Lucía pasted 60 pages at the start and then kept a long conversation going, she didn’t do anything wrong. She just filled the table very quickly.
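For readers who like to see the arithmetic, the rule of thumb above (about 500 tokens per page) can be sketched in a few lines of Python. The function names are hypothetical, and the result is only a rough estimate: real token counts depend on the text and on the model’s tokenizer, and the window size you compare against must come from the documentation of the model you actually use.

```python
TOKENS_PER_PAGE = 500  # rule of thumb from the text, not an exact figure


def estimate_tokens(pages: int, tokens_per_page: int = TOKENS_PER_PAGE) -> int:
    """Return a rough token estimate for a document of the given page count."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return pages * tokens_per_page


def share_of_window(tokens: int, window_tokens: int) -> float:
    """Return the fraction of a context window that `tokens` would occupy.

    `window_tokens` is whatever size applies to your model (an input you supply).
    """
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    return tokens / window_tokens


lucia_case = estimate_tokens(60)   # Lucía's 60 pages
pdf_example = estimate_tokens(80)  # the 80-page PDF from the list above
print(lucia_case, pdf_example)
```

By this estimate, Lucía’s 60 pages took up roughly 30,000 tokens before she typed her first instruction, and every message after that added to the pile.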
Signs your conversation is saturated
When the context window fills up, Claude starts doing something very human: it pays attention to whatever you said most recently. The consequences are clear:
- Rules you set early on start to fade.
- Tone can drift without you noticing.
- Small contradictions appear.
- Claude repeats ideas that were already settled.
The most reliable symptoms: Claude ignores a clear rule you gave earlier, returns an answer that is correct in isolation but no longer fits previous decisions, or asks about something that was already resolved. When you see two or three of these signals together, don’t push harder. It’s not a prompt problem. It’s a space problem.
Three fixes that work without starting over
1. Summarize and reopen (the most effective)
Ask Claude to produce a structured summary of the conversation: facts, decisions, and active rules. Take that summary and start a new conversation. The summary acts as a clean desk with only the essentials. You’re not starting over. You’re continuing with order.
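If you run this routine often, the two messages involved can be templated. The sketch below is one possible wording, not an official prompt: the function names and the exact phrasing are assumptions, and the three headings simply mirror the facts, decisions, and active rules mentioned above.

```python
def build_summary_request(sections=("Facts", "Decisions", "Active rules")) -> str:
    """Build a prompt asking for a structured summary of the conversation so far."""
    lines = [
        "Summarize this conversation so I can continue it in a new one.",
        "Use exactly these headings, with short bullet points under each:",
    ]
    lines += [f"{i}. {name}" for i, name in enumerate(sections, start=1)]
    lines.append("Leave out anything we discarded or superseded.")
    return "\n".join(lines)


def build_handoff(summary: str, next_task: str) -> str:
    """Combine the saved summary and the next task into the opening message
    of a new conversation."""
    return (
        "Context from a previous working session:\n\n"
        f"{summary.strip()}\n\n"
        f"Next task: {next_task.strip()}"
    )


print(build_summary_request())
```

Send the first message at the end of the saturated conversation, review the summary yourself, and paste the second message as the opening of the new one.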
2. Reorder the important rules
If you can’t close the conversation yet, move what’s critical to the end. Important instructions work better when they’re close to the last question — not because they matter more, but because they’re still visible within the window. Don’t repeat everything. Reaffirm what’s essential.
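As a minimal sketch of that idea, the helper below appends only the essential rules after your question, so they sit at the very end of the message. The function name and the “Rules still in force” label are illustrative assumptions; the example rule is the tone Lucía set at the start.

```python
def restate_rules(question: str, rules: list[str]) -> str:
    """Append the essential rules after the question so they end the message."""
    essential = [r.strip() for r in rules if r.strip()]  # drop blank entries
    if not essential:
        return question.strip()
    bullet_list = "\n".join(f"- {r}" for r in essential)
    return f"{question.strip()}\n\nRules still in force:\n{bullet_list}"


message = restate_rules(
    "Draft the next argument.",
    ["Tone: precise, sober, no embellishment."],
)
print(message)
```

Keep the list short. If it grows past a handful of rules, that is a sign it is time to summarize and reopen instead.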
3. Chunk before you work
Instead of pasting a massive PDF and working on top of it, change the order: first ask for a summary, then decide which parts are relevant, and only then go deep. This way you use the window to think, not to store unfiltered text.
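If you prepare documents with a script before pasting them, one way to chunk is to group paragraphs under a rough token budget and summarize each chunk separately. This is a sketch under stated assumptions: the function names are hypothetical, and the figure of four characters per token is a crude heuristic, not how any real tokenizer counts.

```python
def rough_tokens(text: str, chars_per_token: int = 4) -> int:
    """Very rough estimate; 4 characters per token is an assumed heuristic."""
    return max(1, len(text) // chars_per_token)


def chunk_paragraphs(text: str, max_tokens: int) -> list[str]:
    """Group blank-line-separated paragraphs into chunks of roughly max_tokens.

    A single paragraph larger than the budget becomes its own chunk.
    """
    chunks, current, used = [], [], 0
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue  # skip empty paragraphs
        cost = rough_tokens(para)
        # Close the current chunk if this paragraph would exceed the budget.
        if current and used + cost > max_tokens:
            chunks.append("\n\n".join(current))
            current, used = [], 0
        current.append(para)
        used += cost
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Each chunk can then go into its own short conversation for a summary, and only the relevant summaries travel into the conversation where the real work happens.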
Closing a conversation well is also work
One of the most productive habits is knowing when to close. When a conversation has served its purpose, don’t stretch it “just in case.” Extract what matters, save the state, and open a new one when it’s time. Working with AI isn’t about maintaining an eternal thread. It’s about managing sessions. You don’t notice this on day one. You notice it after ten long projects where none of them spiraled into chaos.
Claude’s context window isn’t an annoying limitation. It’s a design feature that, when managed well, forces you to be more precise and organized in how you delegate work. Paradoxically, that improves your results.
This is just a taste. The full book shows you how to turn AI into your most productive team member.
📖 Your Digital Employee
Claude and AI as your best collaborator
