Skip to Content

271 Firefox Bugs & a 27-Year-Old OpenBSD Flaw

Claude Mythos 5's Defensive Security Resume
Sk Jabedul Haque
Jun 10, 2026 5 min read 13 views
271 Firefox Bugs & a 27-Year-Old OpenBSD Flaw
Navigation
10 Sections
    When Mozilla's security team got early access to Claude Mythos, the model found 271 vulnerabilities in Firefox within two weeks — more than most human teams find in a year. Then it found a bug in OpenBSD that had survived 27 years of expert scrutiny. These aren't benchmark scores. They're real-world results that redefine what defensive AI security can achieve.

    What You'll Learn

    • How Claude Mythos found 271 Firefox vulnerabilities in a single evaluation cycle and what it means for browser security
    • Why a 27-year-old OpenBSD bug survived decades of human security audits and how Mythos finally uncovered it
    • The disruptive economics of AI-powered offensive security — GPT-5.5 solved a 12-hour challenge in 10 minutes for $1.73
    • How Project Glasswing is expanding defensive AI access to 150+ organizations in 15 countries while the access paradox remains unresolved

    The Mozilla Breakthrough: 271 Vulnerabilities in Two Weeks

    In April 2026, Mozilla announced that Firefox 150 included fixes for 271 security vulnerabilities that were identified during an early access evaluation of Anthropic's Claude Mythos Preview. The results were published on the official Mozilla Blog and covered by Ars Technica, CSOOnline, SecurityWeek, and Bruce Schneier's security blog. Mozilla's CTO stated that Mythos Preview was "every bit as capable" as the world's best human security researchers, a validation that the model's defensive capabilities are not theoretical but immediately practical.

    The evaluation was one of the first public validations of Project Glasswing's defensive mission — the idea that frontier AI models can be applied to cybersecurity defense rather than offense. Mozilla's security team gave Mythos Preview access to Firefox's codebase, and the model systematically identified vulnerabilities across the browser's architecture. The 271 figure dwarfs historical vulnerability discovery rates for comparable evaluation periods and represents a step change in what automated security analysis can achieve.

    The vulnerabilities were patched in Firefox 150, meaning that no end users were exposed to zero-day risk from the evaluation itself. Mozilla's responsible disclosure process ensured that each vulnerability was verified, triaged, and fixed before the results were made public. This stands in contrast to offensive AI security research, where discovered vulnerabilities may be weaponized before patches are available. Anthropic's safety guardrails for Mythos include strict access controls that made this responsible disclosure possible.

    The 27-Year-Old OpenBSD Bug: What the Entire Security Community Missed

    During the same AISI evaluation period that produced the Mozilla results, Claude Mythos discovered a vulnerability in OpenBSD's TCP SACK handling that had survived since the operating system's early development in the late 1990s. OpenBSD is widely regarded as one of the most security-focused operating systems in the world, with a development philosophy that prioritizes proactive security, code correctness, and integrated cryptography. The project's rigorous code review process has made it a benchmark for secure software engineering for nearly three decades.

    The vulnerability was in OpenBSD's TCP Selective Acknowledgment (SACK) state management, where the operating system tracks which data packets have been received as a singly linked list of holes. The bug was subtle enough that it evaded detection through 27 years of expert code audits, automated testing, and real-world deployment across critical infrastructure. Anthropic's official red team blog confirmed the find, noting that the model had discovered vulnerabilities it was never explicitly trained to find — a capability that has significant implications for how AI security tools are evaluated.

    Anthropic's red team also reported that Mythos found a 16-year-old bug in FFmpeg that had survived five million automated tests, further demonstrating that the model's vulnerability discovery capabilities extend beyond any single codebase or vulnerability class. The FFmpeg find is particularly significant because it shows that Mythos can identify bugs in heavily tested, widely deployed software where traditional fuzzing and automated testing had failed. ExploitBench results previously demonstrated Mythos's offensive capabilities, but these defensive discoveries are arguably more impactful for enterprise security.

    The implications for the security community are profound. If a model can find vulnerabilities that have survived 27 years of human review in OpenBSD — one of the most rigorously audited codebases in existence — then the security industry must reconsider the assumption that mature, well-reviewed software is adequately secure. The bugs were not in obscure, rarely-used code paths. They were in core networking functionality that every OpenBSD system depends on.

    The Economics of Defense vs Offense

    The AISI evaluation of GPT-5.5 provides the sharpest illustration of why AI-powered security creates an asymmetric threat landscape. The UK AI Security Institute found that GPT-5.5 could solve a 12-hour reverse engineering challenge in 10 minutes at a cost of $1.73. At scale, attackers can run 10 attempts and achieve 3 successful intrusions for under $20. The marginal cost of an offensive AI security operation approaches zero, while the defensive cost — auditing codebases, patching vulnerabilities, maintaining security infrastructure — remains labor-intensive and expensive.

    The defensive economics are more favorable but still challenging. Mozilla's evaluation of 271 vulnerabilities required Mythos access plus human security researcher review for triage and patching. The AI accelerates discovery, but human judgment is still required to assess exploitability, prioritize fixes, and verify that patches do not introduce regressions. The cost of a full AI red team assessment ranges from $75,000 to $200,000, compared to $50,000 to $150,000 for a traditional network red team engagement. The AI assessment covers more ground but at a higher absolute cost, though the cost per vulnerability discovered is dramatically lower.

    The asymmetry becomes stark when comparing the resource requirements for offense versus defense. An attacker needs only one unpatched vulnerability to achieve their objective. A defender must find and patch every vulnerability in their attack surface. AI makes the attacker's job cheaper by orders of magnitude while also making the defender's job more scalable. The net effect is a race condition where both sides gain capability, but the attacker benefits from lower marginal costs and the freedom to choose their target. Comparing AI model pricing shows that offensive AI operations using publicly available models can be conducted at a fraction of the cost of equivalent human-led operations.

    DimensionDefensive AI (Mythos)Offensive AI (GPT-5.5)
    Speed271 vulns in 2 weeks (Mozilla)12-hr RE challenge in 10 min
    Cost per operation$75K-$200K red team assessment$1.73 per intrusion attempt
    Success rateVerified patches deployed3/10 successful intrusions
    ScalabilityConstrained by human reviewNear-zero marginal cost
    Access restrictionProject Glasswing gatedPublicly available

    Project Glasswing: From 50 to 200 Organizations

    On June 2, 2026, Anthropic announced the expansion of Project Glasswing from approximately 50 partner organizations to roughly 200 organizations across more than 15 countries. The expansion, reported by Reuters, TechCrunch, CyberScoop, and HelpNetSecurity, brings Mythos's defensive capabilities to critical infrastructure sectors including healthcare, energy, and communications. The program was originally launched in April 2026 in partnership with Amazon Web Services, Apple, Broadcom, Cisco, and other major technology companies.

    The expansion strategy reveals Anthropic's approach to managing the access paradox: rather than making Mythos broadly available or keeping it completely restricted, the company is scaling access through a carefully curated partnership program. Each new Glasswing partner undergoes vetting to ensure they have the infrastructure to handle Mythos access responsibly, including secure development environments, incident response capabilities, and responsible disclosure processes. The 15-country scope makes Project Glasswing one of the largest structured AI security programs in existence.

    The sectors targeted in the expansion reflect the program's focus on societal risk rather than commercial opportunity. Healthcare organizations face increasing ransomware and data breach risks. Energy infrastructure operators confront nation-state cyber threats. Communications providers manage attack surfaces that span billions of users. By targeting these sectors, Project Glasswing addresses the highest-impact vulnerabilities first, rather than optimizing for commercial return. Enterprise AI deployment frameworks like those used in Glasswing offer a template for how organizations can integrate AI security tools without creating additional risk.

    The AI Security Paradox: Restriction vs Capability

    The central tension in AI-powered security is what David Sacks, Trump cabinet tech advisor, called the access paradox: "Mythos is not magic, not a doomsday device... all leading Chinese models will reach the same capability level within six months." Every day that Mythos is restricted to prevent misuse is a day that attackers might develop similar capabilities independently, potentially without the safety infrastructure that Anthropic has built. The restriction that protects society from offensive misuse also delays the defensive benefits that Mythos could provide at scale.

    Sacks's assessment that Chinese models will reach equivalent capability within six months is consistent with the broader trajectory of frontier AI development. China's DeepSeek and other major AI labs have demonstrated rapid capability advancement, and the cybersecurity domain is unlikely to remain a Western monopoly. If the access paradox is real, then the window in which Anthropic can shape the responsible deployment of AI security tools is finite, and each month of restricted access represents a missed opportunity to establish defensive norms before offensive capabilities become commoditized.

    The comparison with GPT-5.5's offensive capabilities sharpens the paradox further. OpenAI's GPT-5.5 is publicly available and classified by OpenAI as having "High" cybersecurity capability under its Preparedness Framework. AISI found that GPT-5.5 can perform nearly on par with Mythos in many offensive scenarios. The result is a landscape where the most capable defensive AI is restricted, while comparably capable offensive AI is publicly accessible. This asymmetry favors attackers and puts defenders at a structural disadvantage that no amount of access restriction can fully address. Understanding the difference between Fable 5 and Mythos helps clarify why Anthropic treats the two models differently from a safety perspective.

    What Defensive AI Security Means for Enterprise Teams

    For enterprise security teams evaluating AI-powered vulnerability detection, the Mozilla and OpenBSD results establish a clear baseline: AI can find vulnerabilities that human teams miss, even in mature, well-audited codebases. The practical implication is that security teams should incorporate AI-assisted code review into their development pipelines, not as a replacement for human review but as a complement that catches the edge cases human reviewers are most likely to overlook.

    The immediate actionable takeaway is that any organization running significant software infrastructure should evaluate whether they qualify for Project Glasswing access or similar programs. The alternative — relying solely on traditional vulnerability scanning and human-led penetration testing — leaves organizations exposed to vulnerabilities that AI can now routinely identify. For organizations that cannot access Mythos through Glasswing, the same defensive capabilities are available through Anthropic's API using Mythos directly, albeit without the structured partnership framework that Glasswing provides.

    The long-term strategic implication is that the security industry must prepare for a world where AI-powered vulnerability discovery is the norm rather than the exception. This means investing in patch management infrastructure that can keep pace with AI-accelerated discovery rates, training security teams to work alongside AI tools effectively, and developing organizational policies for responsible disclosure that account for the speed and scale of AI-driven security research. The 271 Firefox vulnerabilities and the 27-year-old OpenBSD bug are not anomalies. They are the first data points in a new security paradigm where the most capable vulnerability hunter is not human. Comparing GPT-5.5 against Claude models provides additional context on how different frontier models perform on security-relevant tasks.

    Conclusion

    Anthropic's Claude Mythos has demonstrated that defensive AI security is not a theoretical concept but a practical capability with measurable results. Mozilla's 271 patched vulnerabilities and the 27-year-old OpenBSD bug that survived decades of expert review are evidence that AI can find what humans miss. Project Glasswing's expansion to 150+ organizations in 15+ countries shows that the infrastructure for responsible AI security deployment is being built. But the access paradox — where the best defensive AI is restricted while comparable offensive AI is publicly available — remains the defining challenge. For security teams, the message is clear: AI-powered vulnerability discovery is here, and the organizations that integrate it into their security posture will have a structural advantage over those that do not.

    Frequently Asked Questions

    Claude Mythos found 271 security vulnerabilities in Firefox during an early access evaluation. Mozilla patched all 271 in Firefox 150, released in April 2026.
    Mythos discovered a vulnerability in OpenBSD's TCP SACK handling that had existed since the operating system's early development in the late 1990s, surviving 27 years of security audits.
    Project Glasswing is Anthropic's initiative to provide Claude Mythos access to organizations for cybersecurity defense. It expanded from 50 to approximately 200 organizations across 15+ countries in June 2026.
    Offensive AI operations using GPT-5.5 can cost as little as $1.73 per attack attempt, while defensive AI red team assessments range from $75,000 to $200,000.
    No. Claude Mythos is restricted through Project Glasswing partnerships. However, GPT-5.5, which performs comparably on many offensive security tasks, is publicly available.
    The access paradox is that Anthropic restricts Mythos to prevent misuse, but comparable offensive AI capabilities are publicly available through GPT-5.5, giving attackers a structural advantage.
    Project Glasswing partners include organizations in healthcare, energy, communications, and other critical infrastructure sectors. Partners are vetted for responsible security infrastructure and disclosure processes.
    Yes. Anthropic's red team reported that Mythos also found a 16-year-old bug in FFmpeg that had survived five million automated tests.
    Sk Jabedul Haque

    Sk Jabedul Haque

    Founder & Chief Editor

    Building India's most trusted finance education platform — simplifying news, calculators, and market trends so anyone can understand and invest confidently.