
Hallucinations: The Bug That Hid for 27 Years

a-gnt Community · 8 min read

Anthropic pointed an unreleased AI model at some of the world's most scrutinized code. It found vulnerabilities that decades of auditing and millions of fuzzing runs had missed. What does that mean for the rest of us?

For twenty-seven years, a bug lived inside OpenBSD.

Not in some dusty corner module nobody maintained. Not in a deprecated driver or a legacy utility left over from the 1990s. Inside OpenBSD — the operating system whose entire identity is security. The one whose homepage has read, for decades, "Only two remote holes in the default install, in a heck of a long time!" The codebase that gets audited the way other codebases get unit-tested: obsessively, ritually, by people who consider finding a flaw to be a moral event.

Twenty-seven years. Through code reviews, through static analysis, through fuzzing campaigns, through the slow accretion of every automated testing technique the security community invented between 1999 and 2026. The bug sat there like a stone at the bottom of a clear stream — visible, in theory, to anyone who looked. Nobody looked the right way.

Then, in April 2026, Anthropic pointed an unreleased AI model at the code and the model found it.

What Glasswing actually is

Project Glasswing is Anthropic's name for a coordinated effort to point frontier AI models at critical open-source infrastructure and find the vulnerabilities that decades of human effort missed. The industry partners read like a guest list for a summit that doesn't exist yet: AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, Palo Alto Networks. The model doing the work is called Claude Mythos Preview — unreleased, not available to the public, built specifically for this kind of deep code reasoning.

The name matters less than the result. Mythos found the OpenBSD bug. It also found a 16-year-old vulnerability in FFmpeg — the multimedia library that handles video decoding in roughly half the software on earth. That FFmpeg bug is the one that should keep you up at night, because automated testing had already run the affected line of code five million times and missed it. Five million executions. The fuzz harness hit the function, exercised the path, checked the output, and moved on — five million times — without triggering the condition that exposed the flaw.

The model found it on what was, presumably, its first pass.

The hallucination nobody talks about

This column is called Hallucinations, and the usual subject is AI making things up — confident nonsense, phantom citations, imaginary APIs. Those are real failure modes and they matter. But there's another hallucination that runs deeper, and it doesn't belong to the machines.

It belongs to us.

The hallucination is this: that code which has been thoroughly tested is safe code. That the absence of a discovered vulnerability is evidence of the absence of a vulnerability. That if a thousand smart humans looked at something and didn't find a problem, there probably isn't one.

This is the belief that let a bug hide in OpenBSD for twenty-seven years. Not incompetence — belief. The reasonable, empirically grounded belief that a codebase audited by some of the most paranoid security engineers on the planet is, in fact, audited. The flaw wasn't in the process. The process was extraordinary. The flaw was in the assumption that extraordinary processes catch everything.

They don't. They never have. And now we have a tool that makes that obvious in a way it wasn't before.

What the model sees that we don't

There's a particular Anthropic quote from the Glasswing announcement that deserves to sit in the open air for a moment: "AI models have reached a level of coding capability where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities."

Read that again. Not "AI models can help find bugs." Not "AI is a useful supplement to human review." The claim is stronger: the model is better than almost every human at this specific task.

What does "better" mean here? It doesn't mean the model is smarter. It means the model is differently shaped. A human auditor reading code follows paths — call chains, data flows, control structures. They bring intuition about which paths look dangerous and which look benign. That intuition is powerful and it's also a filter. It's how a 27-year-old bug survives: it lives on a path that looks benign to the human pattern-matcher.

The model doesn't have that filter. It doesn't have intuition about what looks dangerous. It has — and this is a crude simplification, but a useful one — an enormous capacity to consider paths simultaneously, without the cognitive cost that makes humans prune their search space. The model doesn't skip the boring-looking branch. It doesn't assume the well-trodden path is safe because it's well-trodden. It considers it the same way it considers everything else, which is to say: as text, as logic, as structure, without a prior belief about whether the answer will be interesting.

That's not intelligence in the way we usually mean it. It's something more specific and, for this particular task, more useful: it's the absence of the comfortable assumptions that humans bring to code they've already decided is safe.

The FFmpeg problem

The FFmpeg vulnerability is, in some ways, more instructive than the OpenBSD one.

OpenBSD is a small, intensely focused project. Its codebase is maintained by a community that treats security as a vocation. Finding a bug there is surprising because the audit culture is so strong.

FFmpeg is the opposite. It's enormous. It handles hundreds of media formats across dozens of codecs. It's the kind of project where the codebase grows by accretion — new format support bolted on by contributors who may or may not understand the security implications of their parsing logic. FFmpeg is the plumbing behind VLC, Chrome, Firefox, OBS, and a thousand other tools. When FFmpeg has a vulnerability, the blast radius is essentially "the internet."

The automated testing infrastructure around FFmpeg is serious. Google's OSS-Fuzz project has been hammering FFmpeg with fuzz inputs for years. The specific line of code where Mythos found the vulnerability had been executed five million times by the fuzzer. Five million.

Here's the thing about fuzzing: it works by throwing random or semi-random inputs at code and watching for crashes. It's extremely good at finding bugs that manifest as crashes, memory corruption, or other visible misbehavior triggered by malformed input. What it's less good at is finding bugs that only manifest under specific, narrow conditions — conditions that random input generation is unlikely to produce. The FFmpeg bug was one of these. The path existed. The fuzzer walked it five million times. The precise conditions required to trigger the vulnerability never materialized, because the space of possible inputs is so vast that five million samples is still a rounding error.
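
To make that concrete, here is a minimal sketch in C. Every name in it (parse_table, entry_size, the libFuzzer-style harness around it) is invented for illustration and has nothing to do with the actual FFmpeg code. The copy loop runs on every input that reaches it, but it only becomes dangerous when two attacker-controlled fields multiply into a 32-bit wraparound, a combination that random input generation is very unlikely to produce:

    /* Hypothetical sketch, not FFmpeg code. The copy loop below is "covered"
       by every fuzz run that reaches it, but it only misbehaves when two
       32-bit fields multiply to a value that wraps back under the buffer
       size, a needle random inputs are very unlikely to find. */
    #include <stdint.h>
    #include <string.h>

    void parse_table(const uint8_t *data, size_t size) {
        if (size < 8)
            return;

        uint32_t count, entry_size;
        memcpy(&count, data, 4);
        memcpy(&entry_size, data + 4, 4);

        uint8_t table[256];
        uint32_t total = count * entry_size;   /* 32-bit multiply: can wrap */

        /* Intended bounds check. It uses the wrapped product, so a pair like
           count = 0x04000001, entry_size = 0x40 yields total = 0x40 and the
           check passes even though the loop writes far past `table`. */
        if (total > sizeof(table) || size - 8 < entry_size)
            return;

        /* Executed on every input that gets this far; only dangerous when
           the multiplication above has already wrapped. */
        for (uint32_t i = 0; i < count; i++)
            memcpy(table + (size_t)i * entry_size, data + 8, entry_size);
    }

    /* libFuzzer-style entry point. Coverage reports will show the copy loop
       being exercised constantly; they say nothing about the wrapping case. */
    int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
        parse_table(data, size);
        return 0;
    }

Real gating conditions are often subtler still, and modern coverage-guided fuzzers have tricks for solving some of them. The point of the sketch is narrower: coverage tells you a line ran, not that the condition which makes it dangerous has ever been true.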

The model didn't need to execute the code at all. It read it. It reasoned about the logic. It found the conditions that would trigger the flaw. No execution, no test harness, no inputs — just reading and thinking. Which is, when you strip away the mystique, exactly what a human security researcher does. The model just did it without getting tired, without making assumptions about which code paths were boring, and without the implicit belief that surely the fuzzer would have caught this by now.

The gap nobody wants to talk about

Here's where the story gets uncomfortable.

Anthropic's own data says that fewer than 1% of the vulnerabilities Mythos found have been patched so far. One percent. The model is finding bugs faster than anyone can fix them.

This is the Hacker News headline version of the story: "Project Glasswing Proved AI Can Find the Bugs. Who's Going to Fix Them?" And it's a genuine question, not a rhetorical one. Finding a vulnerability is a specific, bounded, parallelizable task — exactly the kind of thing AI is good at. Fixing a vulnerability requires understanding the surrounding code, the downstream dependencies, the upgrade paths, the backwards-compatibility constraints, the deployment timelines, and the human politics of an open-source project with volunteer maintainers who have day jobs. Fixing is harder than finding. It always has been. The gap just wasn't this visible before.

Anthropic is committing $100 million in model usage credits and $4 million in direct donations to open-source security projects. That's real money. It's also an implicit acknowledgment that discovery without remediation is, at best, a mixed gift. A full public report is due in early July 2026.

The uncomfortable question is what happens between now and then. And after. If AI can find bugs this efficiently, and remediation lags this far behind discovery, you've created a window. During that window, the vulnerabilities exist, they've been identified, and they haven't been fixed. The people who know about them are (presumably) trustworthy. But the technique is not secret. Other models exist. Other organizations are building similar capabilities. The window is not Anthropic's to close.

What this means for the person reading this

If you're not a security engineer — if you're a parent or a teacher or a freelancer who uses FFmpeg indirectly every time you watch a video in your browser — this might feel like someone else's problem. It's not, exactly. But it's also not a reason to panic.

The honest takeaway is simpler and stranger than "AI is saving us" or "AI is endangering us." It's this: the code that runs your life has always been less secure than you assumed. Not because the people who wrote it were careless, but because the task of making complex software truly secure is harder than any testing methodology can fully address. The bugs were always there. What's new is that we have a tool that can see them at a speed and scale that wasn't possible before.

Whether that's good news or bad news depends entirely on what happens after the seeing.

Tools like Security Researcher and Security Audit Checklist exist on a-gnt because security isn't just a profession — it's a practice. Not everyone running a website or managing a small server needs a frontier model scanning their codebase. But everyone benefits from the habit of asking: what am I assuming is safe, and why? Tools like SonarQube MCP and Semgrep MCP bring some of this automated scrutiny within reach of smaller teams — not at the Mythos level, but meaningfully closer to "the code has been looked at by something that doesn't get bored" than a manual review alone.

The bug that wasn't a bug

There's a particular irony to the OpenBSD story that I keep returning to.

OpenBSD's security culture is real. Their code review practices are among the best in the world. Their track record is genuinely extraordinary. None of that was wrong. None of it was theater. It was the best that humans could do, applied with more discipline than almost any other project in the history of software.

And a 27-year-old bug survived it.

Not because the process failed. Because the process was finite. Because humans, even brilliant and meticulous ones, are finite. Because there is a difference between "we looked at everything" and "we saw everything," and that difference is where the interesting questions live.

The bug that hid for 27 years isn't a story about AI being better than humans. It's a story about a particular kind of blindness — the kind that comes from doing something well enough, for long enough, that you stop expecting to be surprised. The hallucination isn't the model's. It's ours. It's the belief that diligence is the same as completeness.

It never was. We just didn't have a way to prove it until now.
