My reviewers approved the bug three times. Then I blinded them.

An agent that remembers its verdict defends it ...

Jul 05, 2026

DevFlow is an open-source plugin that turns a coding agent’s ticket into a PR your reviewers will actually merge, on real production codebases. When it screws up, it studies the mistake and files a fix against itself. This series is the public log of what that self-checking loop catches each week.

An agent that remembers its verdict defends it.

I ship AI at a large enterprise for a living, and my own project taught me that this week, in the most on-the-nose way possible. My coding agent shipped a change with exactly one job: make sure a failure can never disappear silently. The panel of agents that reviews its work includes a reviewer literally named silent-failure-hunter. Three rounds of review came back clean, and I believed every one of them. The whole time, the code was silently throwing failures away.

What caught it was a fourth review, with one twist: no reviewer was allowed to remember the earlier rounds. Call it a blinded pass: same panel, no notes. If your reviews pass notes down the chain (and they probably do), this story is about your process too. Every claim below links a public receipt, so you can check them as you go. Start with the promise the change was supposed to keep.

The one thing this change had to do

My agent keeps a running checklist on every GitHub issue it works on. Issue #169 asked it to harden the script that updates that checklist, because it had a quiet weakness: when it tried to check a box and couldn’t, the miss could get lost in a batch of other updates. So the agent wrote pull request #176 to make one sentence true: a failed checkbox must never be silently lost.

You’ve nodded this change through yourself: it’s the safety code, and it’s well-intended, and your review of it feels like a formality. Keep that promise in view as you read. It’s about to age badly.

Three rounds of “looks good,” and I bought all three

Before I ever see the agent’s work, the panel reads the change, and whatever they find the agent fixes, and the panel reads the fixed version, until a round comes back with nothing left to flag. Here it took three rounds, each ending in a fix and an approval: round one, round two, round three. The third round’s commit message calls itself “convergence polish,” which is a wonderfully confident label for code still carrying the bug. I read the approvals and moved on.

Here’s the part I can’t even pretend ambushed me, because it’s documented in my own repo: the rounds share memory. Each round starts with the notes of the rounds before it, and the design doc is blunt about what that does: every later round tilts toward agreeing with its own earlier calls, treating more and more of the code as “already considered.” You know this drift: it’s why you can’t proofread your own email on the third read; you’ve stopped reading and started reciting what you meant. My reviewers were on read number three.

How do you get a fresh read out of the same reviewer? You take the notes away.

Then we took the notes away

Before the loop declares itself done, it runs the blinded pass: every note from the earlier rounds withheld from every reviewer, so each one reads the code as a stranger would.

The script has three ways to finish. Walk them with me; you’ll spot the hole before I name it.

The script finds the checklist itself broken and stops before sending anything to GitHub; the checkbox misses collected so far get reported.
The update goes through, but some boxes couldn’t be checked; those misses get reported too.
GitHub rejects the update call itself, say a server error or an expired login.

On that third path, the script threw away every miss it had collected, and did it quietly, and nothing would have told you. The change whose entire purpose was “no failure is ever silently lost” was silently losing failures on one of its three ways out.

The blinded silent-failure-hunter flagged it, severity HIGH. When I saw that flag, I went straight to the run’s public logs, half hoping the reviewer was new to the panel, because that would at least be a boring explanation. No such luck: the logs show the same reviewer on the panel in round one and round two while the bug went unflagged. Blinding, not new eyes, made the difference.

Sure, these reviewers aren’t deterministic; a fourth round with its memory intact might have found it too. I’ll take the catch. The fix is quicker to tell.

The fix went in the same day

The fix rewrote the script so all three ways out pass through a single reporting step, and shipped with tests to keep it that way. The commit message does the confessing, opening with the system grading its own review process: “The blinded shadow pass caught a real fail-open bug the 3 in-loop iterations missed.” If you read code, the new comment on the once-silent path tells the story in five lines; if you don’t, you already know what it says:

    except subprocess.CalledProcessError as e:
        # The PATCH itself failed, so NO workpad change was persisted. Report any
        # volatile tick misses collected before the failure too — otherwise this
        # third exit path silently drops them (the very no-silent-loss invariant
        # this command establishes), leaving the operator unable to tell a clean
        # PATCH failure from one that also had unresolvable ticks.
        if failed_ticks:
            _report_failed_ticks(...)

The pull request merged June 29, all 11 acceptance checks verified, zero human fix-up commits after the agent’s last one (which, yes, I double-checked). The part I keep rereading came after the merge.

Then it wrote itself up

DevFlow appends a retrospective on every merged pull request to a public learnings file. The entry for this one says the review gate ran, approved, and was wrong, and files the miss under a label it chose itself: “lenient-verdict.” Then it writes a suggestion against itself: teach the reviewers a standing rule that whenever code collects failures into a list, every way out must report the list, including the ones that only run after something else has already failed. It names the two places in itself where that rule should live, and grades its own confidence in the suggestion: medium. I’d have rated it higher, but nobody asked me.

If you’re wondering whether I’m dressing this up: its own record says so.

What I’m not claiming

Before you take this home, read the fine print. There are four lines.

The cure hasn’t shipped. The suggestion sits in a public file; as of today no issue has been filed and no reviewer rule has changed. My system fixed this bug, and it has not yet made the next bug of this kind harder to write.

The blinded pass removes exactly one bias: the panel’s habit of agreeing with itself. It is still the same panel, so a blind spot every reviewer shares survives it untouched. A clean blinded pass means a fresh look found nothing new, and only that.

One catch is one sample. I’ll need more weeks than this one before I call blinding the decisive variable.

And the blinded pass roughly doubles the cost of a run that was about to finish. This week the doubled price bought a real catch before merge. In a week where it finds nothing, the price is the same.

Let me zoom out before you go. Shared notes tilt any chain of reviewers toward self-agreement, agents or humans alike; mine just ran enough rounds in public that you can read the whole arc in one sitting. Your review chain has the same physics; you just don’t publish the logs.

If your reviewers are agents, run one round that can’t read the notes, and compare what comes back. If they’re humans, hand the pull request to someone on your team who never saw the review thread, and count what they find. Either way, the difference is whatever your process has been agreeing with itself about.

An agent that remembers its verdict defends it.

This series continues weekly-ish.

Daniel Radman

Discussion about this post

Ready for more?