My AI Reviewer Confabulated a Review of the Wrong Repo

My AI code reviewer wrote a structurally perfect review of code from a completely different repository. The fix wasn't better prompts. It was architectural constraints.

Tags: ai-agents, code-review, claude-code, mission-control
May 22, 2026

Every line reference checked out. The critique was specific, the feedback actionable, the recommendation clear. My AI code reviewer had written a structurally perfect review of code from a completely different repository.

Griller is a Claude Code instance that runs on my $7/month Hetzner VPS as part of Mission Control's auto-review loop. When a developer agent finishes a task, griller spawns, reads the PR, and either approves or pushes back with specific feedback. The pipeline had been running for weeks. Then she reviewed two tasks back-to-back and confabulated both times. Not bad reviews. Confident, detailed, well-structured reviews of code she should never have been looking at.

How the Review Loop Breaks

Mission Control runs a handoff-orchestrator that sweeps every 5 seconds for tasks entering the Review state. When one lands, it spawns griller with MCP tools: gh CLI for GitHub operations, task_move for state transitions, file read for code context. Griller gets the task description in her opening prompt, reviews the code, posts findings as a PR comment via gh pr review, and moves the task to Done (approved) or back to Doing (changes requested).

The whole loop runs autonomously. Daisy (developer agent on my MacBook) finishes a task, PATCHes it to Review, and griller picks it up on the VPS within seconds. No human in the loop unless griller flags something.
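
A minimal sketch of the sweep described above, under stated assumptions. `createSweeper`, `listTasks`, and `spawnReviewer` are hypothetical names, not Mission Control's real functions; only the 5-second interval and the Review state come from the pipeline as described.

```javascript
// Hypothetical sketch of a handoff sweep. `listTasks` and `spawnReviewer`
// are injected stand-ins, not Mission Control's actual API.
const SWEEP_INTERVAL_MS = 5000; // the 5-second sweep interval

function createSweeper({ listTasks, spawnReviewer }) {
  const inFlight = new Set(); // ids of tasks that already have a reviewer

  // One sweep: spawn a reviewer for each Review task not yet handled.
  async function sweepOnce() {
    const tasks = await listTasks();
    const inReview = tasks.filter((t) => t.state === 'Review');
    const reviewIds = new Set(inReview.map((t) => t.id));

    // Forget tasks that left Review so a later round is picked up again.
    for (const id of inFlight) {
      if (!reviewIds.has(id)) inFlight.delete(id);
    }

    const spawned = [];
    for (const task of inReview) {
      if (inFlight.has(task.id)) continue;
      inFlight.add(task.id);
      await spawnReviewer(task);
      spawned.push(task.id);
    }
    return spawned;
  }

  // Start the periodic sweep; returns a function that stops it.
  function start() {
    const timer = setInterval(() => {
      sweepOnce().catch((err) => console.error('sweep failed', err));
    }, SWEEP_INTERVAL_MS);
    return () => clearInterval(timer);
  }

  return { sweepOnce, start };
}
```

Tracking in-flight ids keeps a slow review from being spawned twice across consecutive sweeps, which matters when the sweep interval is shorter than a review.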

The root cause was simple. The task_move MCP tool had no parameter for review_artifact. A developer agent could move a task to Review without attaching a PR URL. When griller spawned without a PR URL in her opening prompt, she did what any Claude Code instance does when asked to review code with no explicit input: she used her tools.

She filesystem-grepped. She found code. She wrote a review.

The code she found was from ouija, a completely separate project. Ouija is a 16-package TypeScript monorepo for pipeline automation. It has skeleton loaders in its React dashboard. The task griller was supposed to review also involved skeleton loaders, but in mission-control-android (Kotlin, Jetpack Compose). Different language, different repo, different architecture entirely. She found the matching concept in the wrong codebase and reviewed it with full confidence.

After round 1 on the skeleton-loader task, I thought griller was being smart. The review identified real structural issues, referenced specific lines, recommended actionable improvements. It read like competent analysis from someone who understood the code.

She wasn't being smart. She was confabulating. And because confabulation in code review looks exactly like competence, it took a second failure on a completely different task (theme-settings, same wrong-repo pattern) before I recognized what was happening.

This is the part that should concern anyone building automated review pipelines: the failure mode isn't a bad review. It's a good review of the wrong thing. You can't catch it by checking review quality. You catch it by checking review target.
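
One way to check the target rather than the quality, sketched under assumptions: compare the file paths a review cites against the files the PR actually changes. `citedFiles` and `reviewTargetsPr` are hypothetical helpers, and the sketch assumes citations are written as `path/to/file.ext:LINE`. The changed-file list could come from `gh pr diff --name-only`.

```javascript
// Hypothetical check: does a review cite only files the PR changes?
// Assumes citations look like `path/to/file.ext:LINE`.
const CITATION = /([\w./-]+\.\w+):\d+/g;

// Collect the distinct file paths cited in the review text.
function citedFiles(reviewText) {
  const files = new Set();
  for (const match of reviewText.matchAll(CITATION)) files.add(match[1]);
  return [...files];
}

// Flag any cited file that is not part of the PR's changed files.
function reviewTargetsPr(reviewText, changedFiles) {
  const changed = new Set(changedFiles);
  const strays = citedFiles(reviewText).filter((f) => !changed.has(f));
  return { ok: strays.length === 0, strays };
}
```

A review that cites no files at all passes this check, so it is a tripwire for wrong-repo reviews, not proof that the review is grounded.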

The Two-Layer Fix

Layer 1 is an architectural gate in mc-contract (PR #46). The task_move MCP tool now requires a review_artifact parameter when the target state is Review. No artifact, no transition. The server returns 422 review_artifact_required at the tool boundary, before the state change happens.

```javascript
// task-move.cjs:90
if (target === 'Review' && !review_artifact) {
  throw { statusCode: 422, code: 'review_artifact_required' };
}
```

The env flag MC_REVIEW_ARTIFACT_OPTIONAL defaults to '0'. You can flip it to '1' for local development. In production, the gate is closed.
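
Putting the gate and the flag together might look like the sketch below. `ReviewArtifactError`, `artifactOptional`, and `assertReviewArtifact` are hypothetical names. Only the flag name, its '0' default, the 422 status, and the `review_artifact_required` code come from the actual change.

```javascript
// Hypothetical sketch combining the 422 gate with the env flag.
class ReviewArtifactError extends Error {
  constructor() {
    super('review_artifact is required to move a task to Review');
    this.statusCode = 422;
    this.code = 'review_artifact_required';
  }
}

// The artifact is required unless the flag is explicitly set to '1'.
function artifactOptional(env = process.env) {
  return (env.MC_REVIEW_ARTIFACT_OPTIONAL ?? '0') === '1';
}

// Throws before any state change when the gate is closed.
function assertReviewArtifact(target, reviewArtifact, env = process.env) {
  if (target !== 'Review') return; // the gate only guards the Review transition
  if (artifactOptional(env)) return; // local-development escape hatch
  if (!reviewArtifact) throw new ReviewArtifactError();
}
```

Throwing an `Error` subclass rather than a plain object keeps the stack trace while carrying the same `statusCode` and `code` fields.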

Layer 2 is the review environment. Two PRs:

PR #58 changed where griller spawns. Before: task.workspace, which could be a macOS path such as /Users/muhammadkh4n/Projects/... passed to a Linux VPS. After: ~/.mc-scratch/griller/task-<id>/, an empty scratch directory. Nothing to grep. Nothing to confabulate from.
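
Creating that scratch directory could be as simple as the following sketch. `prepareScratchDir` is a hypothetical helper, and the `baseDir` parameter is an assumption added so the function is easy to test. The `~/.mc-scratch/griller/task-<id>/` layout is the one described above.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: give the reviewer an empty cwd for one task.
function prepareScratchDir(
  taskId,
  baseDir = path.join(os.homedir(), '.mc-scratch', 'griller')
) {
  // Reject ids that could escape the base directory (e.g. '../etc').
  if (!/^[\w-]+$/.test(String(taskId))) {
    throw new Error(`unsafe task id: ${taskId}`);
  }
  const dir = path.join(baseDir, `task-${taskId}`);
  // Drop leftovers from earlier review rounds, then recreate empty.
  fs.rmSync(dir, { recursive: true, force: true });
  fs.mkdirSync(dir, { recursive: true });
  return dir;
}
```

Wiping the directory on every spawn means a second review round starts from the same empty state as the first.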

PR #60 changed what griller sees. buildReviewerPrompt puts the PR URL directly in griller's opening prompt with explicit diff instructions. Griller runs gh pr diff in the scratch cwd and reviews the actual diff. Not whatever code happens to exist nearby.
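
A rough sketch of what an artifact-anchored prompt builder could look like. The name `buildReviewerPrompt` is from vps-agent #60, but the signature, the task fields, and the prompt wording here are assumptions, not the shipped implementation.

```javascript
// Sketch only: the signature, task shape, and wording are assumptions.
function buildReviewerPrompt(task) {
  const prUrl = task.review_artifact;
  if (!prUrl) {
    // Refuse to build a prompt that would leave the reviewer guessing.
    throw new Error(`task ${task.id} has no review_artifact`);
  }
  return [
    `You are reviewing task ${task.id}: ${task.title}`,
    `Review artifact: ${prUrl}`,
    '',
    'Instructions:',
    `1. Run \`gh pr diff ${prUrl}\` to fetch the diff.`,
    '2. Review only the changes in that diff.',
    '3. Post findings with `gh pr review`, then move the task with task_move.',
  ].join('\n');
}
```

Failing loudly on a missing artifact is a second line of defense: even if a task reached Review without one, the reviewer would never spawn with an open-ended prompt.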

The combination is deliberate. Empty directory means there's nothing wrong to find. Artifact in the prompt means there's something right to review. The constraint is architectural, not behavioral.

Why Prompts Aren't Enough

The obvious alternative is prompt engineering. "Only review the PR. Don't look at local files. Don't grep the filesystem."

It doesn't work for tool-using agents in multi-agent systems. Claude Code instances don't just follow text instructions. They use tools. When griller is uncertain about what to review, she reaches for grep, find, Read. A prompt instruction saying "don't grep" competes with the agent's tool-use instinct. The agent can decide the local code IS relevant and grep anyway. I watched it happen twice.

The fix isn't telling the agent not to confabulate. It's making confabulation structurally impossible. An empty working directory can't be grepped for stale code. A 422 gate can't be argued with by a language model. These aren't suggestions. They're constraints.

Same lesson I learned watching agents succeed at exactly the wrong thing. You don't trust agent behavior. You trust agent boundaries.

What Shipped

Two confabulations caught on two separate tasks (theme-settings and skeleton-loader). Three PRs shipped across two repos: mc-contract #46 (422 gate), vps-agent #58 (scratch cwd), vps-agent #60 (artifact-anchored reviewer prompt). Total turnaround from diagnosis to verified fix: three days across May 14-16.

The skeleton-loader task that failed on round 1 was re-run with mission-control-android PR #27 attached as the review artifact. Round 2: griller opened the PR diff via gh pr diff, reviewed the actual Kotlin changes, approved with specific positive comments on the ViewModel lifecycle handling. Clean approval, correct target, zero confabulation. The behavioral turnaround was complete.

Verified live in production. tasks-routes.cjs:246 reads the env flag with a ?? '0' fallback, so the artifact is required by default. task-move.cjs:90 throws 422 on a missing artifact. Zero confabulations in the six days since the fix shipped on May 16. Same $7/month VPS, same Tailscale mesh connecting the MacBook and the Hetzner box, same Claude Code instances running the agents. The fix added a boundary, not cost.

What I Gave Up

The artifact requirement is deliberately inflexible. Every task that moves to Review now needs a PR URL attached. Spike work, exploratory changes, manual tasks without PRs can't enter the automated review flow. You either have a PR or you don't get automated review. That's the right trade-off for my pipeline, where every shippable change should have a PR. Probably wrong for teams where informal review of uncommitted work is part of the daily rhythm.

The agents that review your code carry the same verification debt as the agents that write it. The fix isn't more careful prompting. It's architectural constraints that make the wrong behavior impossible before it starts.
