Claude Code's Quality Drop Was Not Imaginary: 3 Causes from Anthropic's Official Postmortem
For / Key Points
For: Developers and AI-coding beginners who felt Claude Code became less reliable or drained usage faster after March 2026.
Key Points:
- On April 23, 2026, Anthropic explained three changes that degraded Claude Code quality for some users
- The issue was not an intentional downgrade of the base model, but product-layer changes around reasoning, caching, and system prompts
- Earlier reports about token drain were directionally correct, but some root-cause claims should now be tightened against the official postmortem
From March to April 2026, many Claude Code users reported the same pattern: Claude stopped reading enough code, made shallow edits, repeated itself, and burned through usage limits faster than expected.
That was not just imagination. On April 23, 2026, Anthropic published an engineering postmortem explaining three separate issues that affected Claude Code, the Claude Agent SDK, and Claude Cowork, with fixes completed by April 20 [1].
This article answers one question: what actually happened to Claude Code?
What Anthropic Officially Acknowledged
Anthropic acknowledged that the Claude Code experience worsened for some users.
The company also said the Claude API and inference layer were not affected [1]. In other words, the official story is not "Anthropic secretly made the model dumber." It is "the product layer around the model changed in ways that made Claude Code behave worse."
For beginners, think of it like a car. The engine may be the same, but if the accelerator mapping, navigation memory, and driving rules all change at once, the car feels very different.
| Official cause | What changed | How users felt it |
|---|---|---|
| Lower reasoning effort | Default changed from high to medium | Shallower reasoning |
| Thinking-history bug | Older reasoning was repeatedly dropped | Forgetfulness and repetition |
| Brevity prompt | Responses were pushed too short | Less explanation and weaker coding quality |
The rest of the story is how those three changes stacked.
Cause 1: The Default Reasoning Effort Was Lowered
The first issue was a lower default "thinking" setting.
Claude Code has an effort setting. It is basically a speed-versus-depth dial: lower effort can be faster and cheaper, while higher effort gives Claude more room to reason through hard tasks.
On March 4, 2026, Anthropic changed Claude Code's default reasoning effort from high to medium. The reason was practical: high effort sometimes caused very long thinking latency, making the UI appear frozen and increasing usage-limit pressure [1].
In hindsight, Anthropic says this was the wrong tradeoff. Users preferred higher intelligence by default and lower effort as an opt-in choice. Anthropic says it reversed the decision on April 7 [1].
The changelog shows that the rollback was also staged across account types. Version 2.1.94 changed the default from medium to high for API-key, Bedrock, Vertex, Foundry, Team, and Enterprise users. Version 2.1.117 later changed Pro and Max subscribers on Opus 4.6 and Sonnet 4.6 back to high as well [2].
The simple version: for a while, Claude Code was defaulting to less thinking than many users expected.
Cause 2: A Bug Dropped Claude's Prior Thinking
The second issue was the most important one. Claude Code was losing its own working memory.
When Claude Code works through a coding task, it reads files, edits code, runs tests, and continues across many turns. The important part is not just the chat history; it is the reasoning history explaining why earlier edits and tool calls happened.
On March 26, 2026, Anthropic shipped an optimization for sessions that had been idle for more than an hour. The idea was to clear older thinking once when resuming a stale session, reducing latency and uncached token cost [1].
The implementation did something worse. Instead of clearing older thinking once, it kept clearing thinking on every later turn in that process. Anthropic says this made Claude continue working while increasingly losing the reasoning for its previous choices [1].
That maps closely to what users saw: repeated investigations, strange tool choices, forgotten decisions, and edits that no longer followed the earlier plan.
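The once-versus-every-turn difference can be made concrete with a small sketch. This is a hypothetical reconstruction for illustration only, not Anthropic's code: the `Session` class, the `stale` flag, and both `turn_*` methods are invented names, and the only detail taken from the postmortem summary above is the one-hour idle threshold.

```python
from dataclasses import dataclass, field
from typing import List

IDLE_LIMIT_SECONDS = 3600  # "idle for more than an hour"


@dataclass
class Session:
    """Hypothetical session state; not Claude Code's real data model."""
    thinking: List[str] = field(default_factory=list)
    stale: bool = False  # assumed flag, set when resuming after a long idle

    def resume(self, idle_seconds: float) -> None:
        self.stale = idle_seconds > IDLE_LIMIT_SECONDS

    def turn_intended(self, new_thought: str) -> None:
        if self.stale:
            self.thinking.clear()
            self.stale = False  # clear once, then behave normally
        self.thinking.append(new_thought)

    def turn_buggy(self, new_thought: str) -> None:
        if self.stale:
            self.thinking.clear()  # flag is never reset, so this runs every turn
        self.thinking.append(new_thought)


def run(method_name: str) -> List[str]:
    """Resume a two-hour-idle session and play three turns through one method."""
    session = Session(thinking=["old plan"])
    session.resume(idle_seconds=2 * 3600)
    for thought in ["read files", "pick approach", "write patch"]:
        getattr(session, method_name)(thought)
    return session.thinking


print(run("turn_intended"))  # ['read files', 'pick approach', 'write patch']
print(run("turn_buggy"))     # ['write patch']
```

In the buggy variant the session still produces a patch, but the reasoning that led to it is gone by the time the next turn starts, which is the failure mode the postmortem describes.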
It also connects directly to token drain. Anthropic says the repeated dropping of thinking blocks likely caused cache misses [1]. A cache miss means prior context cannot be reused cheaply and has to be processed again as fresh input. That likely drove separate reports of usage limits draining faster than expected.
So "quality got worse" and "usage drained faster" were not necessarily separate incidents. In some cases, they were two symptoms of the same broken context-management path.
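To see why dropping an earlier block is expensive, here is a toy model of prefix-based cache reuse. It is a sketch under a simplifying assumption (only the longest unchanged prefix of the previous request can be read from cache), not a description of Anthropic's actual cache. The block labels, token counts, and the `split_cost` helper are all made up for illustration.

```python
from typing import List, Tuple

Block = Tuple[str, int]  # (label, token count); both are invented values


def split_cost(prev: List[Block], curr: List[Block]) -> Tuple[int, int]:
    """Return (cached_tokens, fresh_tokens) for `curr` given the prior request.

    Toy rule: only the longest shared prefix of blocks counts as cached.
    """
    shared = 0
    for old, new in zip(prev, curr):
        if old != new:
            break
        shared += 1
    cached = sum(tokens for _, tokens in curr[:shared])
    fresh = sum(tokens for _, tokens in curr[shared:])
    return cached, fresh


previous = [
    ("system", 1000),
    ("user-1", 200),
    ("thinking-1", 800),
    ("assistant-1", 300),
    ("user-2", 200),
    ("thinking-2", 800),
    ("assistant-2", 300),
]
new_turn = [("user-3", 200)]

# Append-only history: everything sent before is still a valid prefix.
append_only = previous + new_turn

# Same history with an earlier thinking block dropped: the prefix diverges early.
thinking_dropped = [b for b in previous if b[0] != "thinking-1"] + new_turn

print(split_cost(previous, append_only))       # (3600, 200)
print(split_cost(previous, thinking_dropped))  # (1200, 1800)
```

In this toy run the second request is shorter overall, yet nine times as many tokens have to be processed as fresh input. That is how a change meant to save tokens can end up costing more of them.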
Cause 3: A Brevity Prompt Hurt Coding Quality
The third issue was a system prompt change that pushed Claude Code to be shorter.
On April 16, Anthropic added an instruction intended to reduce verbosity. The motivation was understandable: Opus 4.7 tended to produce more output, which can increase token use [1].
But coding is not always improved by shorter answers. Multi-file changes often need enough explanation to preserve assumptions, verification steps, and design tradeoffs.
Anthropic later ran broader ablations and found one evaluation showing a 3% drop for both Opus 4.6 and Opus 4.7. The problematic prompt instruction was reverted in the April 20 release [1].
The lesson is plain: shorter output is not automatically better output for AI coding.
What the User-Side Reports Showed
Before the official postmortem, users had already published detailed reports. The official postmortem explains the product-side causes, but the user reports help show what those failures felt like in real coding sessions.
The most visible one was GitHub Issue #42796, opened on April 2, 2026. The report analyzed 6,852 Claude Code session files, 17,871 thinking blocks, and 234,760 tool calls. It argued that Claude shifted from research-first behavior toward edit-first behavior [3].
Not every claim in that report is officially verified. In particular, estimating "thinking depth" from local logs can be complicated because visible thinking may be redacted even when internal reasoning still occurs.
Still, the reported symptoms line up strongly with Anthropic's postmortem:
- Editing before reading enough context
- Forgetting earlier decisions
- Repeating the same work
- Burning usage faster than expected
- Choosing the simplest-looking patch too early
There was also an important Hacker News exchange. After reading submitted feedback IDs, Claude Code lead Boris Cherny said the data pointed to adaptive thinking under-allocating reasoning on some turns [4]. Adaptive thinking is the mechanism that adjusts reasoning budget from turn to turn. This is not the center of the April 23 official postmortem, but it helps explain why some users felt high effort alone did not fully restore earlier behavior.
The right reading is careful: separate confirmed official causes from user-side hypotheses.
How This Updates the Earlier Token-Drain Article
This section revisits the earlier SmartScope article, “Why Claude Code Burns Through Tokens So Fast”, which argued that Claude Code token drain was tightly connected to prompt-cache behavior.
That core idea has aged well. Anthropic's postmortem explicitly connects the thinking-history bug to cache misses and faster usage-limit drain [1]. GitHub Issue #34629 also documented a reproducible --print --resume cache regression where cache_read stopped growing and conversation history was effectively recreated each turn [5].
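The "cache_read stopped growing" symptom can be checked mechanically once you have per-turn cache-read token counts from your own logs. The sketch below assumes you have already extracted those counts into a list. The function name, the `min_turns` threshold, and the sample numbers are all hypothetical; how the counts are obtained is outside its scope.

```python
from typing import List


def cache_read_stalled(cache_read_per_turn: List[int], min_turns: int = 3) -> bool:
    """Return True if cache-read tokens never increased across the observed turns.

    In a healthy multi-turn session the cached prefix should grow as history
    accumulates, so a flat series is a hint (not proof) of a cache regression.
    """
    if len(cache_read_per_turn) < min_turns:
        return False  # too few turns to judge either way
    pairs = zip(cache_read_per_turn, cache_read_per_turn[1:])
    return all(later <= earlier for earlier, later in pairs)


healthy = [0, 4200, 8900, 13100]        # invented numbers: prefix reuse grows
regressed = [11000, 11000, 11000, 11000]  # invented numbers: reuse is flat

print(cache_read_stalled(healthy))    # False
print(cache_read_stalled(regressed))  # True
```

A flat series alone does not identify the cause. It only tells you that the session is worth attaching to a bug report with reproducible steps.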
The part to tighten is root-cause certainty.
The earlier article discussed the March 31 Claude Code source leak and possible mechanisms inferred from the leaked code, such as attestation or anti-distillation. The leak itself was real and acknowledged by Anthropic as a release-packaging error, with no customer data or credentials exposed [6].
But those leaked-code mechanisms are not the official root cause of the April 23 postmortem. The official explanation points to the clear_thinking optimization and its repeated removal of prior reasoning.
A safer updated statement is:
The earlier article correctly identified that broken cache behavior can make Claude Code usage drain abnormally fast. However, leaked-code root-cause candidates should be treated as hypotheses, while Anthropic's April 23 postmortem identifies the thinking-history clearing bug as the confirmed product-layer cause.
That makes the older article a useful lead-in, not something to discard.
What Claude Code Users Should Check Now
The first things to check are version and effort level.
Anthropic says all three issues were resolved as of April 20 in v2.1.116 [1]. The changelog also records later effort-default changes for Pro and Max subscribers in v2.1.117 [2].
In practice, use this checklist:
- Update Claude Code to v2.1.116 or later
- Check /effort and use high or above for complex coding work
- Be suspicious of old stale sessions if the behavior feels inconsistent
- Record evidence with /usage, /bug, and reproducible steps instead of relying only on vibes
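The version item in the checklist is easy to get wrong by eye, because "2.1.94" looks larger than "2.1.116" when compared as text. A minimal sketch of a numeric comparison follows. The function and constant names are hypothetical; the two version thresholds come from the postmortem and changelog cited above, and the input is whatever version string your installation reports.

```python
import re
from typing import Tuple

ALL_FIXES = (2, 1, 116)               # all three issues resolved
PRO_MAX_EFFORT_DEFAULT = (2, 1, 117)  # Pro and Max default effort back to high


def parse_version(text: str) -> Tuple[int, int, int]:
    """Extract the first major.minor.patch triple from a version string."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def has_all_fixes(version_text: str) -> bool:
    """True if the version is at or past the release with all three fixes."""
    return parse_version(version_text) >= ALL_FIXES


def has_pro_max_effort_default(version_text: str) -> bool:
    """True if the version includes the later Pro/Max effort-default change."""
    return parse_version(version_text) >= PRO_MAX_EFFORT_DEFAULT


print(has_all_fixes("2.1.94"))                  # False
print(has_all_fixes("v2.1.116"))                # True
print(has_pro_max_effort_default("v2.1.116"))   # False
```

Comparing integer tuples rather than raw strings is the design point here: tuple comparison orders 94 before 116, while string comparison would not.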
The goal is not to trust Claude Code blindly or abandon it completely. The goal is to know which layer is failing.
Summary
The most important lesson is that model quality and AI-coding-product quality are not the same thing.
Claude Code is a product made of a model, a CLI, tool execution, file editing, session management, prompt caching, and system prompts. A small shift in any one of those layers can make users feel that the model itself got worse.
This is not unique to Anthropic. Codex, Cursor, GitHub Copilot, Gemini CLI, and other coding agents all depend on a harness around the model. Here, "harness" means the surrounding execution environment: CLI behavior, tools, session handling, prompts, and release logic.
The next time an AI coding tool suddenly feels better or worse, ask four questions:
- Did the model version change?
- Did the effort or thinking behavior change?
- Did caching or session resume break?
- Did the tool harness or system prompt change?
Claude Code's quality drop was real. It was not just vibes. But it also was not a simple conspiracy story. It was a production-quality failure in the layers that turn a model into a daily development tool.
Related Articles
- Why Claude Code Burns Through Tokens So Fast
- Claude Code x Codex CLI Review Loop Automation
- Codex CLI vs Claude Code 2026 Benchmark
[1] Anthropic Engineering, An update on recent Claude Code quality reports, April 23, 2026.
[2] Claude Code Docs, Changelog, v2.1.94 / v2.1.116 / v2.1.117.
[3] GitHub, anthropics/claude-code, [MODEL] Claude Code is unusable for complex engineering tasks with the Feb updates #42796, April 2, 2026.
[4] Hacker News, Boris Cherny comment on adaptive thinking under-allocation, April 2026.
[5] GitHub, anthropics/claude-code, [BUG] Prompt cache regression in --print --resume since v2.1.69(?) #34629, March 15, 2026.
[6] Axios, Anthropic leaked 500,000 lines of its own source code, March 31, 2026; BleepingComputer, Claude Code source code accidentally leaked in NPM package, March 31, 2026.