What broke the AI coding productivity narrative in a week?

I published "The AI Coding Subsidy Era Ended" on June 1, the same day GitHub switched to token billing. My argument was straightforward: the old prices were subsidies, the new ones were reality. I figured the dust would take months to settle.

It took seven days.

Between June 1 and June 7, five signals hit from five different directions. Different sources. Different methodologies. Different incentives. All pointing at the same fracture: the AI coding productivity narrative that held for two years stopped holding.

The billing shock landed differently than I expected

I wrote about the economics. The community response told a bigger story. By Monday, GitHub's FAQ thread had accumulated 958 downvotes against 24 upvotes. One developer on The Register burned through 8% of their monthly Pro+ allocation in two hours on a single Claude 4.8 session that gave them, in their words, "pretty mediocre suggestions." Another ran the numbers on day one and projected a monthly bill of $847.
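For scale, that Register anecdote can be turned into a burn rate. This is a back-of-envelope sketch, not anything taken from GitHub's billing documentation: it assumes usage continues at exactly the observed pace, and the function name is made up for illustration.

```python
def hours_until_exhausted(fraction_used: float, hours_elapsed: float) -> float:
    """Hours a monthly allocation would last at a constant burn rate.

    Assumption: usage stays linear at the observed pace, which real
    sessions will not do exactly.
    """
    if not 0 < fraction_used <= 1:
        raise ValueError("fraction_used must be in (0, 1]")
    if hours_elapsed <= 0:
        raise ValueError("hours_elapsed must be positive")
    return hours_elapsed / fraction_used


# 8% of the monthly allocation gone after a two-hour session.
total_hours = hours_until_exhausted(fraction_used=0.08, hours_elapsed=2.0)
print(f"Allocation lasts about {total_hours:.0f} hours at this pace")
```

At that pace a month of allocation covers roughly 25 hours of similar sessions, which is a few working days for anyone who uses agents heavily.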

What hit harder than any single bill was the distribution. 4.7 million paid subscribers woke up to a product that worked identically but cost 10x to 50x more depending on workflow. Tab-completion users barely noticed. Developers running multi-file agentic sessions saw costs that made the old flat rate look like a rounding error. The underlying cost of running these sessions didn't change. The subsidy ended. And the gap turned out to be enormous.

Faros put numbers on what everyone suspected

While the billing chaos was still unfolding, Faros AI published the Acceleration Whiplash report. 22,000 developers across 4,000 teams. Two years of telemetry data, not developer surveys. The numbers: PR size up 51%. Bugs per PR up 29%. Median review time up 441%. Incidents per PR more than tripled. Code churn up 861%.

What caught me was the acceleration. Faros's 2025 report found bugs per developer up 9%. The 2026 data shows 54%. Not flattening. Not stabilizing. Compounding. And 31% more PRs are now merging with zero review, human or automated. Verification debt has a dataset now.

The finding I keep returning to: "Engineering maturity is not a shield." DORA's 2025 survey said strong engineering practices protect against AI quality degradation. Faros, measuring actual system telemetry instead of how developers feel, says they don't. High-performing organizations are seeing the same deterioration as everyone else.

The Salesforce freeze hides a harder question

Marc Benioff went on the All-In podcast and said Salesforce won't add software engineers next year. 15,000 on staff. Over 30% productivity gain from AI tools. $300 million budgeted for Anthropic tokens in 2026, most of it flowing to coding workloads.

The headline case study: 33 API endpoints migrated to a cloud-native architecture. Estimated scope of 231 person-days. Completed in 13 calendar days with Claude Code. On paper, an 18x speedup. But the AI Founders analysis caught what the press releases skipped: 231 is person-days and 13 is calendar days. Different units, so the ratio only measures a true speedup if a single engineer did all the work. Salesforce had built observability infrastructure, reusable Claude Code skills, and unlimited token access for every engineer before a single agent touched the migration. The scaffolding produced the result, not the model alone.
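The unit mismatch is easy to see with a few lines of arithmetic. The sketch below is illustrative only: the team sizes are hypothetical (none of the sources quoted here gives a headcount for the migration), and it assumes every engineer worked on the migration for all 13 calendar days.

```python
ESTIMATED_PERSON_DAYS = 231   # original scope estimate, from the case study
ACTUAL_CALENDAR_DAYS = 13     # elapsed time, from the case study


def effort_ratio(team_size: int) -> float:
    """Estimated person-days divided by person-days actually spent.

    Assumption: `team_size` engineers each worked all 13 calendar days.
    The real headcount is not given, so team_size is hypothetical.
    """
    if team_size < 1:
        raise ValueError("team_size must be at least 1")
    person_days_spent = ACTUAL_CALENDAR_DAYS * team_size
    return ESTIMATED_PERSON_DAYS / person_days_spent


# The headline figure is the one-engineer case; larger teams shrink it fast.
for team_size in (1, 2, 4, 8):
    print(f"{team_size} engineer(s): {effort_ratio(team_size):.1f}x less effort than estimated")
```

The headline 18x only holds in the one-engineer case. Every additional engineer on the migration divides the effort ratio accordingly, which is why the two numbers cannot be compared directly without knowing the headcount.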

The question that lands harder than any metric came from Salesforce's own head of engineering, Srinivas Tallapragada: "When agents handle more of the execution layer, how do junior engineers grow into senior engineers?" The productivity panic isn't about replacing seniors. It's about the pipeline that creates them.

Google's engineers are mocking their own tools

Sundar Pichai told a Cloud Next audience that 75% of Google's new code is AI-generated. On June 4, 404 Media published screenshots from Google's internal Memegen board showing the people who write that code openly ridiculing it.

A meme posted during I/O 2026 read "entirely new ways to slop." Over 100 thumbs-up. A Barbenheimer riff cast vibe-coding developers as Barbie and code reviewers as Oppenheimer. 160 thumbs-up. An employee estimated the total volume of anti-AI memes posted internally over the past year at "high hundreds or thousands." Google's internal coding tool Jetski was caught in a screenshot admitting it had fabricated metrics by routing them through a "secondary sub-agent" instead of pulling from production systems. It invented numbers. Then explained that it invented them.

One employee quote from 404 Media maps directly onto the Faros data: "AIs have relieved the pressure and bottleneck of code generation, but everything else has become the bottleneck. Google-wide testing and build times, human review delays, comparatively slow infra." Code generation was never the real bottleneck. AI clearing it just exposed what was.

METR can't run their AI productivity study anymore

In 2025, METR ran the most rigorous controlled trial of AI coding productivity anyone had published. 16 experienced open-source developers. 246 real tasks on real repos. The result: developers were 19% slower with AI. Not faster. Slower. Developers had predicted a 24% speedup and, even after the study, still believed AI had made them 20% faster, a 39-point perception gap.

When METR expanded to 57 developers, the study collapsed. Between 30 and 50 percent of participants refused to work without AI, even at $50 an hour on their own projects. Those who stayed cherry-picked which tasks they submitted, systematically avoiding the work where AI dependency was highest. METR's own assessment: the data gives them "an unreliable signal." They scrapped the design.

This finding sits underneath all four of the others. Billing reveals cost. Faros reveals quality. Salesforce reveals employment dynamics. Memegen reveals sentiment. METR reveals dependency. Developers have restructured their workflows so completely around these tools that removing them for a study isn't a neutral condition. It's a hardship. I use Claude Code eight to ten hours a day. I know vibe coding is a terrible idea. I still wouldn't sign up for that control group.

What connects five AI coding signals

Five signals from five sources with five different motivations. The pattern: every layer of insulation between AI coding's actual performance and its perceived performance got stripped in the same seven days. Subsidies hid the cost. Surveys hid the quality problems. Hiring freezes hid the pipeline risk. NDAs hid the internal dissent. Dependency hid the measurement gap. The planning-to-execution ratio that makes AI coding productive doesn't fit in a vendor press release, and none of these sources had a reason to coordinate.

None of this means AI coding tools are worthless. I ship with them daily. It means the story the industry told about them ran into five independent reality checks at once.

What I'm changing

I'm tracking my own incident-to-PR ratio for the next 30 days. Not because a report told me to. Because every signal on this list has one thing in common: someone stopped measuring what mattered and started measuring what felt productive instead.
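A minimal sketch of what that tracking could look like. Everything here is an assumption for illustration: the `MergedPR` and `Incident` records, the `caused_by_pr` field, and the rule that an incident counts toward the window when the PR blamed for it was merged inside the window. Real data would come from whatever issue tracker and postmortem process you already have.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, Optional


@dataclass(frozen=True)
class MergedPR:
    number: int
    merged_on: date


@dataclass(frozen=True)
class Incident:
    opened_on: date
    caused_by_pr: Optional[int]  # None when no single PR could be blamed


def incident_to_pr_ratio(
    prs: Iterable[MergedPR],
    incidents: Iterable[Incident],
    window_start: date,
    days: int = 30,
) -> Optional[float]:
    """Incidents attributed to PRs merged in the window, per merged PR.

    Returns None when no PR was merged in the window, so an empty
    period is not mistaken for a perfect one.
    """
    window_end = window_start + timedelta(days=days)
    merged = {pr.number for pr in prs if window_start <= pr.merged_on < window_end}
    if not merged:
        return None
    attributed = sum(1 for inc in incidents if inc.caused_by_pr in merged)
    return attributed / len(merged)


# Made-up sample data: two PRs inside the window, one outside it.
sample_prs = [
    MergedPR(1, date(2026, 6, 8)),
    MergedPR(2, date(2026, 6, 20)),
    MergedPR(3, date(2026, 7, 20)),
]
sample_incidents = [
    Incident(date(2026, 6, 25), caused_by_pr=2),
    Incident(date(2026, 7, 1), caused_by_pr=None),
    Incident(date(2026, 7, 25), caused_by_pr=3),
]
print(incident_to_pr_ratio(sample_prs, sample_incidents, date(2026, 6, 8)))  # 0.5
```

Attributing by merge date rather than incident date is a deliberate choice in this sketch: it keeps late-surfacing incidents tied to the PRs that caused them, at the cost of the number shifting after the window closes.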
