Claude wrote a full FreeBSD remote kernel RCE with root shell

TL;DR Highlight

Anthropic's Claude wrote a complete remote kernel exploit for CVE-2026-4747 (a stack buffer overflow in FreeBSD's kgssapi) from scratch, showing that LLMs can now automate working attack code and not just vulnerability analysis.

Who Should Read

Security researchers and systems developers who want to understand how far LLMs can actually be used for vulnerability discovery and exploit automation, and security team members who want to understand how AI-based code generation cuts both ways for security.

Core Mechanics

This vulnerability (CVE-2026-4747) is a stack buffer overflow in the RPCSEC_GSS implementation within FreeBSD's kgssapi.ko kernel module. When the `svc_rpc_gss_validate()` function reconstructs an RPC header into a 128-byte stack buffer (rpchdr[]), it performs no bounds checking on the credential body length (oa_length), allowing an attacker to write arbitrary data onto the kernel stack.
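The size of the overflow follows from simple arithmetic, sketched below in user-space C. The sketch assumes the eight fixed 4-byte header fields shown in this article's code excerpt (32 bytes in total), and it assumes the 400-byte cap on opaque auth bodies from RFC 5531 applies before this function runs; `overflow_bytes` and the macro names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RPCHDR_BYTES           128u /* size of rpchdr[] on the kernel stack */
#define FIXED_WORDS            8u   /* xid, direction, rpcvers, prog, vers, proc, flavor, length */
#define XDR_WORD_BYTES         4u   /* every XDR word is 4 bytes */
#define FIXED_BYTES            (FIXED_WORDS * XDR_WORD_BYTES)
#define ROOM_FOR_CRED          (RPCHDR_BYTES - FIXED_BYTES)
#define ASSUMED_MAX_AUTH_BYTES 400u /* assumption: RFC 5531 limit on opaque auth bodies */

/*
 * Number of bytes that an unchecked copy of oa_length bytes would
 * write past the end of rpchdr[]. Returns 0 when the body fits.
 */
static size_t
overflow_bytes(uint32_t oa_length)
{
    /* Sanity check on the arithmetic: 128 - 32 leaves 96 bytes. */
    assert(ROOM_FOR_CRED == 96u);

    if (oa_length <= ROOM_FOR_CRED)
        return 0;
    return (size_t)oa_length - ROOM_FOR_CRED;
}
```

Under these assumptions, `overflow_bytes(96)` is 0 and `overflow_bytes(ASSUMED_MAX_AUTH_BYTES)` is 304, so a maximum-size credential would reach well past the buffer into the saved registers and return address.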
Affected versions are FreeBSD 13.5 (below p11), 14.3 (below p10), 14.4 (below p1), and 15.0 (below p5). The attack targets environments where kgssapi.ko is loaded on an NFS server (port 2049/TCP). Testing was conducted on a FreeBSD 14.4-RELEASE amd64 GENERIC kernel (without KASLR).
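The patch thresholds above can be turned into a quick check. The following is a minimal sketch, assuming a version string of the form printed by freebsd-version(1), such as `14.4-RELEASE-p1`; the function `is_affected` and the table are hypothetical helpers, and the table covers only the four releases named in this article, so any other release is reported as unknown and not as safe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* First fixed patch level for each release, as listed in this article. */
struct fixed_level {
    int major, minor, patch;
};

static const struct fixed_level FIXED[] = {
    {13, 5, 11}, {14, 3, 10}, {14, 4, 1}, {15, 0, 5},
};

/*
 * Returns 1 if the version is below the fixed patch level, 0 if it is
 * at or above it, and -1 if the string cannot be parsed or the release
 * is not in the table.
 */
static int
is_affected(const char *version)
{
    int major, minor, patch = 0;
    char branch[32];

    assert(version != NULL);

    /* Accepts "14.4-RELEASE" (patch level 0) or "14.4-RELEASE-p1". */
    int n = sscanf(version, "%d.%d-%31[A-Z]-p%d", &major, &minor, branch, &patch);
    if (n < 3 || strcmp(branch, "RELEASE") != 0)
        return -1;

    for (size_t i = 0; i < sizeof FIXED / sizeof FIXED[0]; i++) {
        if (FIXED[i].major == major && FIXED[i].minor == minor)
            return patch < FIXED[i].patch;
    }
    return -1;
}
```

For example, `is_affected("14.4-RELEASE")` returns 1, while `is_affected("14.4-RELEASE-p1")` returns 0. A version check alone does not say whether kgssapi.ko is loaded or whether port 2049/TCP is reachable, so it complements the exposure conditions described above.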
FreeBSD 14.x lacks KASLR (Kernel Address Space Layout Randomization), so kernel addresses are predictable, and because the overflowed buffer is an int32_t array, no stack canary (a stack overflow detection value) protects the function. Together these conditions made exploitation significantly easier and were the key factors in achieving full RCE and a uid 0 (root) reverse shell.
Claude did not discover the bug independently; rather, it was provided with the CVE advisory and write-up and then tasked with writing exploit code from scratch. In other words, it was utilized at the stage of 'exploit implementation automation' rather than 'vulnerability discovery.'
The bug itself was discovered by Nicholas Carlini (at Anthropic) using Claude, and Thai Duong's security company Califio published a detailed write-up including the entire exploit development process and the prompts used. The actual prompt history used is publicly available on GitHub, making it reproducible.
This write-up is part of the 'MADBugs' series and demonstrates that LLMs can produce complete, complex kernel-level exploits (stack control, return address overwrite, reverse shell connection).
A larger risk pointed out by the community is not exploit writing itself but the possibility that LLMs quietly introduce new vulnerabilities into the production code they generate. Exploit writing is visible, whereas code generation risk accumulates with every PR and is much harder to see.

Evidence

Commenters emphasized the need to distinguish clearly that Claude did not 'discover' the bug but was given an already-public CVE write-up and asked to write the exploit. Accompanying opinions noted, however, that the day when a model like Claude could independently uncover CVEs given only kernel source code and a VM environment is not far off.
A key argument concerned the distinction between vulnerability discovery and exploit writing: writing an exploit for a documented CVE is a well-scoped task, but the reverse direction, in which new vulnerabilities quietly slip into production code written by LLMs, poses an even greater threat. Exploit capability is visible, but code generation risk is distributed across every PR and hard to surface.
In response to the note that FreeBSD 14.x lacks KASLR, there were questions about the KASLR situation in FreeBSD 15.x. One commenter noted they couldn't find relevant mentions in the release notes or the mitigations(7) man page, and shared a link indicating that NetBSD has already implemented KASLR.
A comment shared a YouTube link to a presentation titled 'Black-Hat LLMs' that had been released a few days prior, citing it as another example of the trend of LLMs becoming increasingly capable at vulnerability discovery and exploitation.
There was also a perspective emphasizing the positive side of automated vulnerability discovery: the hardest part has always been finding vulnerabilities, not fixing them, and the experts currently doing this work have strong incentives not to disclose, so automated discovery could ultimately be a major benefit even if the transition period is unsettling.

How to Apply

If you are currently running FreeBSD 13.5, 14.3, 14.4, or 15.0 with kgssapi.ko loaded on an NFS server, you must immediately apply the patch for each respective version (13.5-p11, 14.3-p10, 14.4-p1, 15.0-p5 or later). Until the patch is applied, blocking port 2049/TCP from untrusted networks is the minimum mitigation measure.
For those looking to leverage LLMs in security research or red team activities, you can reference the prompt history published in Califio's write-up (GitHub link) to construct a 'CVE advisory → exploit code generation' workflow. However, this should only be used for PoC validation of already-disclosed vulnerabilities.
For teams adopting AI code review tools or generating code with LLMs, rather than focusing solely on 'exploit potential,' you should establish a process that regularly uses static analysis tools in parallel to check for patterns such as missing bounds checks in LLM-generated code. The root cause of this case was also a simple missing bounds check.

Code Example

// Vulnerable code (sys/rpc/rpcsec_gss/svc_rpcsec_gss.c)
static bool_t
svc_rpc_gss_validate(struct svc_rpc_gss_client *client, struct rpc_msg *msg,
                     gss_qop_t *qop, rpc_gss_proc_t gcproc)
{
    int32_t rpchdr[128 / sizeof(int32_t)]; // Only 128 bytes allocated on the stack
    int32_t *buf;
    struct opaque_auth *oa; // points at the caller-supplied credential

    memset(rpchdr, 0, sizeof(rpchdr));

    // Write the fixed-size RPC header fields (8 XDR words = 32 bytes, ending with oa_length below)
    buf = rpchdr;
    IXDR_PUT_LONG(buf, msg->rm_xid);
    IXDR_PUT_ENUM(buf, msg->rm_direction);
    IXDR_PUT_LONG(buf, msg->rm_call.cb_rpcvers);
    IXDR_PUT_LONG(buf, msg->rm_call.cb_prog);
    IXDR_PUT_LONG(buf, msg->rm_call.cb_vers);
    IXDR_PUT_LONG(buf, msg->rm_call.cb_proc);

    oa = &msg->rm_call.cb_cred;
    IXDR_PUT_ENUM(buf, oa->oa_flavor);
    IXDR_PUT_LONG(buf, oa->oa_length);

    if (oa->oa_length) {
        // Bug: No bounds check on oa_length!
        // Only 96 bytes remain after the first 32 bytes.
        // If oa_length > 96, overflow beyond rpchdr →
        // overwrites local variables → saved registers → return address
        memcpy((caddr_t)buf, oa->oa_base, oa->oa_length);
        buf += RNDUP(oa->oa_length) / sizeof(int32_t);
    }
    // Remainder elided: gss_verify_mic() is called next, but the overflow has already occurred
}
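For contrast, the missing check can be sketched in user-space C as follows. This is an illustration of the bounds-check pattern under stated assumptions, not the actual FreeBSD patch, which may be structured differently; `struct cred_view` and `copy_cred_checked` are hypothetical stand-ins for the kernel's credential structure and copy step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RPCHDR_BYTES    128u /* total size of the header buffer */
#define FIXED_HDR_BYTES 32u  /* eight fixed 4-byte fields written first */

/* Hypothetical user-space stand-in for the credential body and its length. */
struct cred_view {
    const void *base;
    uint32_t    length;
};

/*
 * Copy the credential body into hdr after the fixed fields.
 * Returns false, without writing anything, when the body would not fit
 * in the 96 bytes that remain.
 */
static bool
copy_cred_checked(unsigned char hdr[RPCHDR_BYTES], const struct cred_view *cred)
{
    assert(hdr != NULL && cred != NULL);

    /* The check the vulnerable code lacks: reject oversize bodies. */
    if (cred->length > RPCHDR_BYTES - FIXED_HDR_BYTES)
        return false;

    if (cred->length != 0)
        memcpy(hdr + FIXED_HDR_BYTES, cred->base, cred->length);
    return true;
}
```

Because the remaining space (96 bytes) is a multiple of 4, any length that passes the check also fits after the rounding that `RNDUP` applies when advancing the buffer pointer. A caller would treat a `false` result as a malformed request and reject it before any signature verification.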

Terminology

RCE: Short for Remote Code Execution. A vulnerability that allows an attacker to execute arbitrary code on a target system over a network; it is considered the most severe class of security vulnerability.

KASLR: Kernel Address Space Layout Randomization. A security technique that randomizes the memory addresses at which the kernel is loaded each time, making it difficult for attackers to predict memory locations such as return addresses. FreeBSD 14.x lacks this feature, which made exploitation easier.

Stack canary: A stack overflow detection technique that checks whether a specific value stored at a certain location on the stack has been tampered with before a function returns. If the canary value has changed, it determines that an overflow occurred and terminates the program.

RPCSEC_GSS: A framework that provides security authentication (such as Kerberos) for RPC communications including NFS. In FreeBSD, it is implemented as the kgssapi.ko kernel module, which contained this vulnerability.

Stack buffer overflow: A vulnerability that occurs when a function writes more data than the size of the buffer allocated on the stack (temporary memory space). It can overwrite adjacent memory (local variables, return addresses, etc.) to manipulate program flow.

PoC: Proof of Concept. A minimal attack code or demo written to prove that a vulnerability is actually exploitable.