Assessing Claude Mythos Preview's cybersecurity capabilities
TL;DR Highlight
Anthropic's new model, Claude Mythos Preview, can autonomously discover zero-day vulnerabilities in major operating systems and browsers and write working exploits for them. This is a dramatic improvement over previous models and a signal that the security industry needs to respond urgently.
Who Should Read
Security researchers, developers working on vulnerability analysis and penetration testing, and security architects who need to understand the impact of AI models on cybersecurity and develop defense strategies.
Core Mechanics
- Claude Mythos Preview demonstrated the ability to find zero-day (previously undiscovered) vulnerabilities across major operating systems (Linux, FreeBSD, OpenBSD, etc.) and major web browsers, and autonomously write exploits (actual attack code).
- Many of the discovered vulnerabilities are decades old. In OpenBSD, which is known for its security focus, it found a 27-year-old bug, and it also discovered numerous vulnerabilities that are 10 to 20 years old.
- The exploits go well beyond simple stack overflows. In browsers, it built a complex JIT heap spray (a memory vulnerability attack technique) exploit that chained 4 vulnerabilities to escape both the renderer sandbox and the OS sandbox.
- For FreeBSD's NFS server, it autonomously completed an RCE (Remote Code Execution) exploit that obtains root privileges remotely without authentication, splitting a 20-gadget ROP chain across multiple packets.
- The performance difference compared to the previous model, Opus 4.6, is dramatic. While Opus 4.6 succeeded in exploiting a Firefox 147 JS engine vulnerability only 2 times out of hundreds of attempts, Mythos Preview succeeded 181 times and gained register control an additional 29 times under the same conditions.
- Even an Anthropic engineer without formal security training can ask Mythos Preview to find an RCE vulnerability and receive a completed exploit by the next morning.
- More than 99% of the discovered vulnerabilities are still unpatched, so their specific details cannot yet be disclosed. Anthropic stated that even the less than 1% that can be discussed publicly demonstrates a groundbreaking leap.
- In response, Anthropic launched Project Glasswing, a collaborative project that uses Mythos Preview defensively to protect the world's critical software and to help the industry stay ahead of attackers.
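To put the Firefox 147 comparison in proportion, the counts quoted above can be compared directly. This is only a rough sketch: the source says "hundreds of attempts" without giving the exact number, so no absolute success rate can be derived. The ratio below assumes both models ran the same number of attempts, which is how "under the same conditions" is read here. The constant and function names are illustrative.

```python
# Counts quoted in the summary above (Firefox 147 JS engine exploitation).
OPUS_46_FULL_EXPLOITS = 2
MYTHOS_FULL_EXPLOITS = 181
MYTHOS_REGISTER_CONTROL_ONLY = 29  # additional runs that reached register control


def success_ratio(new_successes: int, old_successes: int) -> float:
    """Ratio of success counts.

    Only meaningful under the assumption that both runs used the same
    number of attempts; the source does not state the exact number.
    """
    if old_successes <= 0:
        raise ValueError("old_successes must be positive")
    return new_successes / old_successes


full_ratio = success_ratio(MYTHOS_FULL_EXPLOITS, OPUS_46_FULL_EXPLOITS)
mythos_total = MYTHOS_FULL_EXPLOITS + MYTHOS_REGISTER_CONTROL_ONLY

print(f"Full exploits: {full_ratio:.1f}x as many as Opus 4.6")  # 90.5x
print(f"Mythos runs reaching at least register control: {mythos_total}")  # 210
```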
Evidence
- Commenters worried that hundreds of millions of hard-to-upgrade embedded devices will keep running vulnerable binaries indefinitely. One commenter mentioned having proposed an 'antibotty network' in a 2025 paper, in which frontier models remotely inject 'beneficial attacks' into old binaries to immunize them, and expressed surprise at how quickly the technology has advanced.
- Skeptics questioned whether the Mythos Preview demonstration, which focused on decades-old C/C++ codebases, was overstated. They argued that browsers are somewhat protected by sandboxing, that OSes inherently have a higher vulnerability density, and that KASLR (Kernel Address Space Layout Randomization) has been practically useless as a defense against LPE (Local Privilege Escalation) for years.
- There were comments analyzing why LLMs are particularly strong in the exploit domain. Security attacks have a clear 'success/failure' reward function, making them easy to optimize, while defining a reward function for 'good software architecture' is difficult, resulting in slower progress.
- Concerns were also raised that AI-driven vulnerability scanning could harm the F/OSS (Free/Open Source Software) ecosystem. Large companies can afford these analysis costs, but small open-source projects cannot.
- There was a cynical view regarding AI safety. One comment pointed out that 'the release of improved models being exploited by malicious actors to cause noticeable harm to society may ironically accelerate the AI safety discussion.'
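The reward-function argument from the comments can be made concrete with a toy sketch. It is not a description of how any model is trained; every name below is hypothetical. It only contrasts a binary, machine-checkable signal (did the attempt reach the target privilege level?) with a subjective one (averaged human review scores for architecture quality).

```python
from dataclasses import dataclass


@dataclass
class AttemptResult:
    """Hypothetical outcome record for one exploit attempt in a test harness."""
    got_target_privilege: bool  # the harness observed the expected privilege level
    crashed_target: bool        # the target crashed during the attempt


def exploit_reward(result: AttemptResult) -> float:
    """Binary reward: the attempt either achieved its goal or it did not."""
    return 1.0 if result.got_target_privilege else 0.0


def architecture_reward(review_scores: list[float]) -> float:
    """Subjective proxy: mean of human review scores, which may disagree widely."""
    if not review_scores:
        raise ValueError("need at least one review score")
    return sum(review_scores) / len(review_scores)


print(exploit_reward(AttemptResult(got_target_privilege=True, crashed_target=False)))
print(exploit_reward(AttemptResult(got_target_privilege=False, crashed_target=True)))
# Three reviewers on an assumed 1-5 scale give noticeably different scores.
print(architecture_reward([2.0, 5.0, 3.5]))
```

The first function can be evaluated automatically and unambiguously on every attempt, while the second depends on who is scoring, which is the asymmetry the commenters described.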
How to Apply
- If you are maintaining an open-source project, monitor Anthropic's Project Glasswing collaboration channel and consider applying to participate in AI-based vulnerability scanning programs targeting your codebase. If Mythos-level models are used for defensive purposes, they can quickly find and patch bugs that would take humans decades to discover.
- If you operate legacy C/C++ codebases (embedded firmware, old server daemons, etc.) that cannot be patched, review network isolation and tighten access controls immediately. Mythos Preview-level models can find and chain decades-old bugs, so the assumption that 'old code is safe' is no longer valid.
- If you have a security team, experiment with an AI agent-based automated exploit scanner in your internal CTF (Capture The Flag) environment or on a staging server, and build a pipeline around it to assist red team operations. Because LLMs like Mythos Preview have become better at exploring program states, agents can take over repetitive, broad vulnerability exploration and free up human effort.
- Move your infrastructure toward stronger sandbox-based isolation (containers, Firecracker VMs, WebAssembly, etc.). As pointed out in the comments, AI is particularly strong at vulnerability chaining, so it is even more important to design 'defense in depth' with multiple layers that limit the damage from any single vulnerability.
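The defense-in-depth point can be illustrated with a small calculation. This is a simplified model, not a measurement: it assumes each isolation layer is bypassed independently with some probability, and the numbers are hypothetical. Under that assumption, a full compromise requires bypassing every layer, so the probabilities multiply.

```python
from math import prod


def chain_success_probability(layer_bypass_probs: list[float]) -> float:
    """Probability that an attacker bypasses every layer.

    ASSUMPTION: layers fail independently. Real layers often share
    weaknesses, so treat the result as an optimistic lower bound on risk.
    """
    for p in layer_bypass_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    return prod(layer_bypass_probs)


# Hypothetical, illustrative numbers only (not taken from the source).
single_layer = chain_success_probability([0.5])
three_layers = chain_success_probability([0.5, 0.5, 0.5])

print(single_layer)  # 0.5
print(three_layers)  # 0.125
```

A model that is good at chaining vulnerabilities effectively raises each per-layer probability, which is why adding and hardening layers matters more, not less.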
Terminology
zero-day: A vulnerability that is unknown to both the software vendor and security researchers. It is very hard to defend against directly because no patch exists.
N-day: A vulnerability that is publicly known but not yet widely patched. Attackers have a window of opportunity from the moment it is disclosed until patches are actually deployed.
ROP chain: ROP stands for Return-Oriented Programming, a technique that executes attacker-chosen logic by chaining together existing code snippets (gadgets) within a program. A ROP chain is the resulting sequence of gadgets.
JIT heap spray: A browser attack technique that fills memory regions dynamically generated by the JIT (Just-In-Time) compiler with attacker-controlled code to hijack execution flow.
KASLR: Abbreviation for Kernel Address Space Layout Randomization. A defense technique that randomizes the memory address of the OS kernel at each boot so that attackers cannot easily predict addresses. It is often considered unreliable in practice.
sandbox escape: An attack that breaks out of the 'sandbox' (isolated environment) created by a browser or VM and allows malicious code to affect the host system.
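One common explanation for why KASLR is considered unreliable is that the whole kernel image shifts by a single random offset, so one leaked address reveals all the others. The toy model below sketches that idea only. The symbol names, offsets, alignment, base address, and slot count are all assumptions chosen for illustration, not values from any real kernel.

```python
import secrets

# Hypothetical symbols at fixed offsets from the image base (toy values).
SYMBOL_OFFSETS = {"symbol_a": 0x1000, "symbol_b": 0x2A40, "symbol_c": 0x9F10}

ALIGNMENT = 0x200000            # assumed slot size for this toy model
BASE_MIN = 0xFFFFFFFF80000000   # assumed lowest possible base
SLOTS = 512                     # assumed number of possible slots


def randomize_base() -> int:
    """Pick a random aligned base, modelling the once-per-boot randomization."""
    return BASE_MIN + secrets.randbelow(SLOTS) * ALIGNMENT


def runtime_addresses(base: int) -> dict[str, int]:
    """Every symbol moves by the same amount: base plus its fixed offset."""
    return {name: base + offset for name, offset in SYMBOL_OFFSETS.items()}


def recover_base(leaked_name: str, leaked_address: int) -> int:
    """A single leaked address plus its known offset reveals the base."""
    return leaked_address - SYMBOL_OFFSETS[leaked_name]


base = randomize_base()
addresses = runtime_addresses(base)

# Suppose only symbol_b's address leaks (for example, through an info-leak bug).
recovered = recover_base("symbol_b", addresses["symbol_b"])
predicted = runtime_addresses(recovered)

print(recovered == base)       # True
print(predicted == addresses)  # True: every other address is now predictable
```

In this model the randomization holds only as long as no address leaks at all, which helps explain why it offers little protection once an attacker already has some local visibility.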