Kernel code removals driven by LLM-created security reports
TL;DR Highlight
Linux kernel maintainers are removing legacy drivers and subsystems (ISA, PCMCIA, AX.25, ATM, and ISDN) after a flood of AI-generated security bug reports overwhelmed them, a drastic response to code that nobody has the capacity to maintain.
Who Should Read
Linux kernel contributors, open-source project maintainers, and developers who use or are evaluating LLM-based automation tools for vulnerability detection.
Core Mechanics
- Linux kernel maintainers proposed patches to remove ISA/PCMCIA Ethernet drivers, parts of the PCI driver, the AX.25 and amateur radio subsystem, ATM protocols and drivers, and the ISDN subsystem.
- The stated reason for removal is not a technical flaw in the code but a surge in security bug reports generated automatically by LLMs. Maintainer comments explicitly cite the need to remove code to protect their mental health, because they cannot keep up with AI-generated reports.
- AX.25 (an amateur radio packet communication protocol) and the related HAM radio drivers were already receiving many bug reports from syzbot (the kernel's automated fuzzing bot), and the influx of AI reports settled the removal decision.
- Most of the removed code consists of drivers or protocols for legacy hardware used primarily before the 2010s. ATM has been replaced by MPLS/MetroE, ISDN is virtually obsolete, and laptops with PCMCIA slots haven’t been produced since 2008.
- This code was effectively unmaintained, but being part of a large project (the Linux kernel) gave it the illusion of maintenance. Had these drivers been independent projects, their inactive status would have been apparent years ago.
- The validity of bug reports generated by LLMs is debated. Some linked emails highlight the issue of ‘junk reports’ from AI increasing review burden without adding value.
- The removed code could live on as out-of-tree kernel modules or as userspace implementations. The HAM radio community has already begun discussing a userspace protocol implementation written in a more modern language.
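The unmaintained state described in the list above is something the kernel tree records explicitly: each entry in the MAINTAINERS file carries an `S:` status line, with values such as Supported, Maintained, Odd Fixes, Orphan, and Obsolete. The sketch below is one possible way to list the low-activity entries. It is an illustration only: the sample entries are hypothetical, the `LOW_ACTIVITY` grouping is an assumption rather than a kernel convention, and the source does not say which status the removed drivers actually had.

```python
import re

# Assumption: these three statuses are treated as "low activity" for this sketch.
LOW_ACTIVITY = {"Orphan", "Obsolete", "Odd Fixes"}

# Hypothetical entries written in the MAINTAINERS layout (tag, colon, tab, value).
SAMPLE = """\
EXAMPLE LEGACY NETWORK DRIVER
S:\tOrphan
F:\tdrivers/net/example_legacy/

EXAMPLE ACTIVE SUBSYSTEM
M:\tJane Doe <jane@example.org>
S:\tMaintained
F:\tnet/example_active/
"""


def parse_entries(text):
    """Yield (title, status, file_patterns) for each blank-line-separated entry."""
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [line for line in block.splitlines() if line.strip()]
        if not lines:
            continue
        title, status, patterns = lines[0].strip(), None, []
        for line in lines[1:]:
            tag, sep, value = line.partition(":")
            if not sep:
                continue  # not a "TAG: value" line
            if tag == "S":
                status = value.strip()
            elif tag == "F":
                patterns.append(value.strip())
        yield title, status, patterns


def low_activity_entries(text):
    """Return only the entries whose status is in LOW_ACTIVITY."""
    return [entry for entry in parse_entries(text) if entry[1] in LOW_ACTIVITY]


if __name__ == "__main__":
    for title, status, patterns in low_activity_entries(SAMPLE):
        print(f"{status:10} {title}: {', '.join(patterns)}")
```

Run against a real MAINTAINERS file, the introductory text at the top is parsed as entries with no status and is therefore filtered out. A status line is only as accurate as its last update, so the result is a starting point for review, not a verdict.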
Evidence
- "HAM radio community users expressed regret over the AX.25 removal, and a thread explaining the decision’s background appeared on the linux-hams mailing list, alongside optimistic views about a modern userspace protocol becoming the new standard."
How to Apply
- When integrating LLM-based security scanners or automated bug reporting tools into open-source projects, a report quality filtering layer is essential. Failing to validate AI-generated issues can overwhelm maintainers and lead to code removal, as seen in this case.
- If your project includes legacy drivers or modules that are effectively unmaintained, assess their maintenance status before deploying LLM-based vulnerability scanners. AI tends to report more pattern-based vulnerabilities in older code, potentially flooding the issue tracker with noise.
- If you operate industrial environments requiring legacy hardware support in kernels or system software, verify whether your drivers are on the removal list (ISA, PCMCIA, ATM, AX.25, ISDN) and prepare for out-of-tree module maintenance or userspace alternatives.
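The report quality filtering layer mentioned in the first item above could start as a small triage step that runs before any report reaches a human. The following is a minimal sketch under stated assumptions: the `Report` shape, the rejection rules (unknown path, missing reproducer, short body, duplicate), and the 200-character threshold are all hypothetical choices for illustration, not criteria used by the kernel maintainers.

```python
import hashlib
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    """Hypothetical, simplified shape of an incoming automated report."""
    title: str
    body: str
    affected_path: str    # file the report claims is vulnerable
    has_reproducer: bool  # True if a runnable reproducer is attached


def fingerprint(report):
    """Hash path + title with case and whitespace normalized, to catch near-duplicates."""
    text = f"{report.affected_path} {report.title}"
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def triage(reports, known_paths, min_body_chars=200):
    """Split reports into (accepted, rejected); rejected items carry a reason."""
    accepted, rejected, seen = [], [], set()
    for report in reports:
        fp = fingerprint(report)
        if report.affected_path not in known_paths:
            rejected.append((report, "path not in tree"))
        elif not report.has_reproducer:
            rejected.append((report, "no reproducer"))
        elif len(report.body) < min_body_chars:
            rejected.append((report, "body too short"))
        elif fp in seen:
            rejected.append((report, "duplicate"))
        else:
            seen.add(fp)
            accepted.append(report)
    return accepted, rejected
```

Recording a reason for every rejection keeps the filter auditable: a reporter can fix and resubmit, and maintainers can see which rule is discarding the most reports and adjust it.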
Related Papers
Did Claude increase bugs in rsync?
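A post that uses real data and statistical analysis to test the social media claim that bugs increased in the rsync project after Claude AI was adopted. It concludes that there is no statistical evidence that releases made after Claude's adoption are unusually buggy relative to the historical distribution.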
I built a vulnerable app and spent $1,500 seeing if LLMs could hack it
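An experiment in which the author built an app with Firebase vulnerabilities and tested whether major LLMs such as GPT-5.5, Claude, and Deepseek could hack it autonomously. GPT-5.5 dominated with a 70% success rate, while Claude scored low because of its security refusal policy, regardless of its capability.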
Clustered Self-Assessment: A Simple yet Effective Method for Uncertainty Quantification in Large Language Models
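A technique in which the LLM groups multiple answers by meaning, turns them into a multiple-choice question, and grades itself to produce a numeric answer to "how confident are you in this answer?"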
SkillHarm: Lifecycle-Aware Skill-Based Attacks via Automated Construction
A security benchmark showing that planting malicious payloads in the "Skill packages" used by AI agents compromises even the latest models up to 86% of the time.
MemTrace: Tracing and Attributing Errors in Large Language Model Memory Systems
A debugging framework that automatically finds why LLM memory systems such as RAG and Mem0 produce wrong answers.
DeepSWE: A contamination-free benchmark for long-horizon coding agents
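A coding agent benchmark built from scratch to address the data contamination and verification errors of the existing SWE-bench. GPT-5.5 ranks first at 70%, and the performance gaps between models show up far more clearly.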