I Analyzed Claude Code 4 Times From the Outside. The Source Leak Proved Me Right and Wrong.


512,000 lines of TypeScript leaked from a missing .npmignore. The architecture I reverse-engineered was spot-on. The implementation quality made me rethink what 'good code' means at $2.5B ARR.

AI · Developer Tools · Engineering · Claude
April 2, 2026
9 min read

A missing .npmignore entry. That's what it took to expose 512,000 lines of TypeScript, 44 feature flags, internal model codenames, and a hidden autonomous agent mode that nobody outside Anthropic was supposed to see.

On March 31, Claude Code v2.1.88 shipped to npm with a 59.8 MB source map containing the entire unobfuscated source tree. 1,900 files. Every comment preserved. Every TODO intact. Security researcher Chaofan Shou spotted it at 4:23 AM ET. By breakfast, the code had been mirrored, forked 41,500 times, and rewritten in Python and Rust.

I've written about Claude Code's internals four times over the past five weeks. I reverse-engineered its tool selection patterns from 2,430 prompts. I mapped its three-layer context management system by measuring token consumption across 50 sessions. I traced how it builds default stacks without asking the developer.

Now I can check my homework against the actual source.

The architecture? I was mostly right. The implementation? That's where it gets interesting.

The architecture confirms what observation revealed

My context machine analysis described a three-layer memory system: hot-tail inline results, cold storage on disk, and a compaction pipeline that manages the 200K token window. The leaked source shows exactly this, plus a detail I missed.

Anthropic calls it "Strict Write Discipline." The agent can only update its memory index after a confirmed successful file write. It treats its own stored memory as a hint, not ground truth, and re-verifies facts against the actual codebase before acting. That's a meaningful architectural decision that explains behavior I saw but couldn't fully account for: Claude Code being cautious about cached state even when it clearly "remembered" the answer.
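The discipline is easy to sketch. This is a minimal, hypothetical illustration of the idea (all names invented, not taken from the leaked source): the index only accepts a fact after a confirmed write, and every read comes back downgraded to an unverified hint.

```typescript
// Hypothetical sketch of "Strict Write Discipline": the memory index is
// updated only after a confirmed file write, and reads are hints, not truth.
type MemoryHint = { path: string; summary: string; verified: boolean };

class MemoryIndex {
  private entries = new Map<string, MemoryHint>();

  // Record a fact only after the underlying file write succeeded.
  recordAfterWrite(path: string, summary: string, writeSucceeded: boolean): void {
    if (!writeSucceeded) return; // never index unconfirmed state
    this.entries.set(path, { path, summary, verified: true });
  }

  // Stored memory is a hint: callers must re-verify against the codebase.
  lookup(path: string): MemoryHint | undefined {
    const hit = this.entries.get(path);
    return hit ? { ...hit, verified: false } : undefined; // downgrade to hint
  }
}
```

The design choice worth noticing: verification happens at read time, not write time, which is exactly the caution I observed from the outside.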

The tool system matches too. I described a plugin-style architecture where every capability is a discrete, permission-gated unit. The source confirms roughly 40 tools across 29,000 lines, each with its own validation logic and output formatting. BashTool, FileReadTool, FileWriteTool, GlobTool, GrepTool, LSPTool, NotebookEditTool. Modular. Clean boundaries.
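The shape of such a unit is roughly this. A hedged sketch, with invented names and an invented dispatch function, of what "discrete, permission-gated, with its own validation and output formatting" means in practice:

```typescript
// Illustrative shape of a permission-gated tool unit; names are examples,
// not the leaked implementation.
interface Tool {
  name: string;
  requiresPermission: boolean;
  validate(input: string): boolean; // per-tool validation logic
  run(input: string): string;       // per-tool output formatting
}

const globTool: Tool = {
  name: "GlobTool",
  requiresPermission: false,
  validate: (pattern) => pattern.length > 0 && !pattern.includes("\0"),
  run: (pattern) => `glob:${pattern}`,
};

function dispatch(tool: Tool, input: string, granted: Set<string>): string {
  if (tool.requiresPermission && !granted.has(tool.name)) {
    throw new Error(`permission denied: ${tool.name}`);
  }
  if (!tool.validate(input)) throw new Error(`invalid input for ${tool.name}`);
  return tool.run(input);
}
```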

The 46,000-line query engine handles all LLM API calls, response streaming, token caching, context management, multi-agent orchestration, and retry logic. One HN commenter noted it makes LangChain look like a solution in search of a problem. I'd said something similar in my MCP vs CLI post, that the real advantage is keeping orchestration in the prompt, not in a framework. The source proves Anthropic agrees.

So the architectural instincts were right. Now here's where my confidence takes a hit.

The implementation is a mess. A profitable, deliberate mess.

Capybara v8 (that's Claude 4.6's internal codename) has a 29-30% false claims rate. Up from 16.7% in v4. They shipped a model that got measurably worse at telling the truth, and it's the one powering the product that makes them $2.5 billion a year.

A bug fix comment buried in the codebase reveals 250,000 wasted API calls per day from autocompact failures. A quarter million calls. Daily. Burning tokens on a known bug that nobody prioritized fixing.

The frustration detection system? A regex matching swear words. Not sentiment analysis. Not an NLP classifier. A regex. Probably written in 10 minutes. The HN crowd called it "the world's most expensive company using regex for sentiment analysis." They're right. It's also probably the correct engineering decision.
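For a sense of how little code that is, here's a sketch of the approach. The word list is mine, not Anthropic's:

```typescript
// Minimal regex-based frustration detection, in the spirit of the leaked
// feature. The trigger words here are invented for illustration.
const FRUSTRATION_RE = /\b(damn|wtf|ugh|stupid|broken again)\b/i;

function isFrustrated(message: string): boolean {
  return FRUSTRATION_RE.test(message);
}
```

Ten minutes of work, near-zero latency, no model call. For a signal that only needs to be roughly right, that trade is defensible.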

And that's the uncomfortable realization.

Verification debt has a number now

I wrote about verification debt two weeks ago, using Amazon's three AI outages and LinearB's 8.1 million PR dataset (1.7x bug rate, 4.6x wait time, 19% slower cycle time) to argue that AI-generated code creates a hidden cost: the effort required to verify what the machine produced.

The Claude Code source is verification debt at scale, running in production, making money.

That 29-30% false claims rate in Capybara v8? Anthropic knows. It's documented in their own codebase comments. They shipped it anyway. Because the product works well enough that 29% hallucination in edge cases costs less than the revenue delay of waiting for v9.

The 250,000 wasted API calls? At even $0.003 per call, that's $750 a day, $273,000 a year. Against $2.5 billion in revenue, that's a rounding error. It's not a bug worth fixing until it is.
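The arithmetic, spelled out (the $0.003 per-call figure is my assumption, as above):

```typescript
// Back-of-envelope cost of the autocompact bug, assuming $0.003 per call.
const wastedCallsPerDay = 250_000;
const costPerCall = 0.003; // USD, assumed
const dailyCost = wastedCallsPerDay * costPerCall; // ~$750/day
const yearlyCost = dailyCost * 365;                // ~$273,750/year
```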

This is what verification debt actually looks like in the wild. Not the theoretical cost I modeled. The real calculation: can we ship faster than the debt compounds?

KAIROS: the feature they weren't ready to talk about

Hidden behind feature flags called PROACTIVE and KAIROS, the source contains a fully built autonomous daemon mode. Referenced over 150 times. This isn't a prototype.

KAIROS runs background sessions while you're idle. Every few seconds it receives a heartbeat prompt asking if there's anything worth doing. It can fix errors, respond to messages, update files, run tasks. It has tools regular Claude Code doesn't: push notifications, file delivery, and pull request subscriptions for watching your GitHub repos.
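The heartbeat's decision step can be sketched like this. Everything here is my guess at the shape, not the leaked code: a periodic check that only acts when the user is idle and there's something concrete to do.

```typescript
// Hypothetical sketch of a KAIROS-style heartbeat decision. All names and
// the policy itself are invented for illustration.
type HeartbeatDecision = { act: boolean; task?: string };

function onHeartbeat(pendingErrors: string[], userIdle: boolean): HeartbeatDecision {
  if (!userIdle) return { act: false };             // stay out of the way
  if (pendingErrors.length > 0) {
    return { act: true, task: `fix: ${pendingErrors[0]}` }; // pick up real work
  }
  return { act: false };                            // nothing worth doing
}
```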

At night, it runs a process literally called autoDream. Memory consolidation while you sleep: merging observations, removing contradictions, converting vague insights into verified facts. Close your laptop Friday, open it Monday, KAIROS has been thinking the whole time.

ULTRAPLAN offloads complex planning to a remote cloud session running Opus 4.6 with up to 30 minutes of dedicated compute. A special sentinel value __ULTRAPLAN_TELEPORT_LOCAL__ brings the result back to your local terminal.

The feature flag names alone are more revealing than the code. As one HN commenter put it: you can refactor code in a week. You cannot un-leak a roadmap.

Undercover mode is the real controversy

The most heated discovery was undercover.ts. About 90 lines. It injects a system prompt telling Claude to never mention it's an AI and to strip all Co-Authored-By attribution when contributing to external repositories. It activates for Anthropic employees. There is no force-off switch.
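The attribution-stripping half is mechanically trivial. An illustrative sketch of what removing Co-Authored-By trailers from a commit message could look like, not the actual undercover.ts implementation:

```typescript
// Illustrative only: drop Co-Authored-By trailer lines from a commit message.
function stripCoAuthors(commitMessage: string): string {
  return commitMessage
    .split("\n")
    .filter((line) => !/^co-authored-by:/i.test(line.trim()))
    .join("\n");
}
```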

The charitable reading: it protects internal codenames from leaking into public repos. Fair enough.

The less charitable reading: Anthropic employees are using an AI tool to contribute to open-source projects without disclosing AI authorship. And the tool was built to make sure that stayed hidden.

One HN comment captured it perfectly: if a tool is willing to conceal its own identity in commits, what else is it willing to conceal?

I don't have a clean take on this one. The safety-focused AI lab building a feature whose entire purpose is concealment is worth sitting with for a minute.

The anti-distillation tricks are clever and pointless

The ANTI_DISTILLATION_CC flag injects fake tool definitions into API requests. The idea: poison training data for competitors recording API traffic. A second mechanism summarizes reasoning between tool calls with cryptographic signatures, so eavesdroppers get summaries instead of full chain-of-thought.
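The decoy mechanism amounts to mixing fakes into the tool list before the request goes out. A hedged sketch of the idea, with invented names and structure:

```typescript
// Sketch of the decoy-tool idea: interleave fake tool definitions with real
// ones so recorded API traffic poisons a distillation dataset. An eavesdropper
// can't tell which names are live. All names here are invented.
type ToolDef = { name: string; description: string };

function withDecoys(realTools: ToolDef[], decoys: ToolDef[]): ToolDef[] {
  // Sort so decoys aren't trivially identifiable by position in the list.
  return [...realTools, ...decoys].sort((a, b) => a.name.localeCompare(b.name));
}
```

Which also shows why it's trivially defeated: anyone who learns the real tool names once can filter the list forever.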

Both mechanisms are trivially defeated. Strip the fields via a proxy. Use a third-party API provider. One security researcher estimated a determined team could bypass both within an hour using a MITM proxy or the CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS environment variable.

The anti-distillation stuff is a lock on a screen door. But it signals what Anthropic is scared of: competitors training on Claude Code's behavior patterns. The legal deterrent (DMCA, trade secret claims) is probably the real defense. The technical measures are theater.

The DMCA paradox Anthropic can't escape

Within hours of the leak, a Korean developer named Sigrid Jin published claw-code, a clean-room Python rewrite using OpenAI's Codex. No proprietary source was copied. The architecture was reimplemented from the patterns the source revealed.

50,000 GitHub stars in two hours. Possibly the fastest-growing repository in GitHub history.

Anthropic started filing DMCA takedowns against direct clones. That's their right. But claw-code isn't a clone. It's a clean-room reimplementation, the same technique that's been legally defensible since Compaq reverse-engineered the IBM PC BIOS in 1982.

Here's the paradox Gergely Orosz pointed out: if Anthropic claims an AI-generated rewrite infringes their copyright, they undermine their own defense in training-data copyright cases. The same argument that AI-generated outputs from copyrighted inputs constitute fair use is the argument protecting them from lawsuits about training on other people's code.

They can fight the direct source clones. They cannot fight the pattern without conceding the legal theory that protects their own business model.

The .npmignore lesson we already knew

A missing .npmignore entry. A Bun bug (issue #28001, open for 20 days before the leak) that serves source maps in production builds even though the documentation says it shouldn't. Anthropic's own acquired toolchain exposed Anthropic's own product.

Boris Cherny, a Claude Code engineer, confirmed it was plain developer error. His follow-up was the right response: "Mistakes happen. As a team, the important thing is to recognize it's never an individual's fault. It's the process, the culture, or the infra."

That's blameless post-mortem culture. Google's SRE book evangelized it a decade ago. It works because engineers report mistakes honestly instead of hiding them.

But this was Anthropic's second leak in five days. A CMS misconfiguration on March 26 exposed draft blog posts about the unreleased Mythos model. Fortune called it their "second major security breach."

One leak is a mistake. Two in a week is a process that hasn't caught up with the company's growth.

What I actually think about the code quality

The Claude Code source isn't elegant. It has regex for sentiment detection, 250,000 daily wasted API calls from a known bug, a model that regressed on truthfulness, and architectural decisions that would fail most code reviews.

It makes $2.5 billion a year.

I've seen pristine codebases that made zero revenue. I've seen codebases that would make a linter cry generate cash flow that funded entire engineering teams. The correlation between code quality and business outcomes is weaker than most engineers want to admit.

The leaked source shows a team that optimizes for shipping speed and product judgment over implementation polish. The architecture is sound. The abstractions are correct. The plugin boundaries are clean. The execution is messy, pragmatic, and fast enough.

I spent five weeks reverse-engineering the system from the outside. The architecture was visible through behavior alone because the abstractions are good. That's the real quality indicator, not whether the frustration detector uses regex or a neural net.

Good architecture survives messy implementation. The reverse has never been true.
