Claude Code

Latest 60 papers on Claude Code.

Claude Code sends 33k tokens before reading the prompt; OpenCode sends 7k
동일한 모델과 작업 환경에서 Claude Code와 OpenCode의 실제 토큰 사용량을 API 레벨에서 측정한 결과, Claude Code가 시스템 프롬프트 오버헤드만으로 OpenCode 대비 4.7배 더 많은 토큰을 소비한다는 것을 확인했다.
LLM-as-a-Verifier: A General-Purpose Verification Framework
LLM의 토큰 확률 분포를 활용해 discrete 점수 대신 continuous 점수를 뽑아내면, 추가 학습 없이 코딩·로봇·의료 에이전트 평가 정확도를 SOTA로 끌어올릴 수 있다.
Agent Data Injection Attacks are Realistic Threats to AI Agents
JSON 구분자를 살짝 바꿔 넣는 것만으로 Claude Code, Codex, Gemini CLI에서 원격 코드 실행이 가능한 새로운 AI 에이전트 공격 기법 발견.
Show HN: Smart model routing directly in Claude, Codex and Cursor
프롬프트마다 적합한 AI 모델을 50ms 이내에 자동으로 선택해주는 프록시 라우터로, API 비용을 40~70% 절감할 수 있다고 주장하는 오픈소스 도구다. 단, 프롬프트 캐싱 손실 문제로 커뮤니티 반응은 엇갈린다.
Ask HN: Anthropic banned me from using Claude Code and I don't know what to do
VPN 사용 또는 동일 카드 재사용으로 Anthropic Claude Code 계정이 이유 불명으로 정지당한 사용자의 사례와, 커뮤니티에서 나온 대안 및 우회 방법 논의.
Show HN: Recall – Local project memory for Claude Code
Claude Code 세션이 끝날 때마다 프로젝트 컨텍스트를 처음부터 다시 설명해야 하는 문제를 외부 API 없이 로컬에서 해결하는 Python 기반 오픈소스 도구다.
Will the Agent Recuse Itself? Measuring LLM-Agent Compliance with In-Band Access-Deny Signals
서버가 SSH 배너나 DB NOTICE로 'AI 에이전트는 접근하지 마세요' 신호를 보내면 GPT-4o, Claude Code 같은 LLM 에이전트가 실제로 물러나는지 실험으로 측정했다.
Claude Code as a Daily Driver: Claude.md, Skills, Subagents, Plugins, and MCPs
Claude Code를 터미널 AI 코딩 도구로 제대로 쓰기 위한 Claude.md 설정, 서브에이전트, 플러그인, MCP 연동 실전 가이드
Retrying vs Resampling in AI Control
Claude Code처럼 의심 행동을 막고 재시도하는 방식이 오히려 공격자에게 힌트를 줘서 더 위험할 수 있다는 연구.
Push Your Agent: Measuring and Enforcing Quantitative Goal Persistence in Long-Horizon LLM Agents
LLM 에이전트가 '100개 찾아줘'를 실제로 100개 찾을 때까지 멈추지 않게 만드는 방법과 벤치마크.
How Claude Code works in large codebases
Anthropic이 수백만 줄짜리 모노레포, 레거시 시스템, 수십 개 마이크로서비스 환경에서 Claude Code를 운영한 패턴을 정리한 글이다. RAG 방식 대신 에이전틱 검색을 쓰는 이유와 실제 현장의 한계를 함께 확인할 수 있다.
Show HN: adamsreview – better multi-agent PR reviews for Claude Code
Claude Code에서 최대 7개의 병렬 서브 에이전트가 각각 다른 관점으로 PR을 리뷰하고, 자동 수정까지 해주는 오픈소스 플러그인이다. 기존 /review나 CodeRabbit보다 실제 버그를 더 많이 잡는다고 주장하지만 커뮤니티에서는 복잡도와 실효성에 대한 회의론도 나왔다.
Using Claude Code: The unreasonable effectiveness of HTML
Claude Code 팀이 Markdown 대신 HTML을 LLM 출력 포맷으로 선호하기 시작한 이유와 그 실용적 장점을 정리한 글로, AI와 함께 문서/스펙/대시보드를 만드는 워크플로우에 직접적인 영향을 준다.
FlexSQL: Flexible Exploration and Execution Make Better Text-to-SQL Agents
고정된 파이프라인 대신 추론 중 언제든 DB를 탐색·실행할 수 있는 Text-to-SQL 에이전트로 Spider2.0 벤치마크에서 gpt-o3, DeepSeek-R1 기반 시스템을 더 작은 모델로 능가
EvanFlow – A TDD driven feedback loop for Claude Code
EvanFlow automates code brainstorming, TDD, and validation in Claude Code with 16 skills triggered by a single prompt.
Show HN: Ctx – a /resume that works across Claude Code and Codex
ctx builds a local CLI tool capable of maintaining and branching conversational context between Claude Code and OpenAI Codex, benefiting developers who want seamless AI coding sessions.
Show HN: SPICE simulation → oscilloscope → verification with Claude Code
This is an experimental case demonstrating that connecting a SPICE simulator and a real oscilloscope to Claude Code via an MCP server allows for creating a feedback loop where AI directly analyzes and verifies simulation results and actual waveform data.
Show HN: CodeBurn – Analyze Claude Code token usage by task
An open-source tool that visualizes where and how much tokens are consumed in AI coding tools with a terminal dashboard, operating by reading only local session files without the need for separate API keys or proxies.
Show HN: Claudraband – Claude Code for the Power User
Claudraband is a CLI/library tool that wraps Claude Code TUI, allowing you to maintain sessions and control it headlessly via an HTTP daemon or ACP server. It's worth paying attention to for developers who want to integrate Claude Code into automated workflows.
Reallocating $100/Month Claude Code Spend to Zed and OpenRouter
This article shares how a developer, tired of usage limits with the Claude Code Max plan ($100/month), switched to a combination of Zed editor ($10/month) + OpenRouter (pay-as-you-go), gaining credit rollover and freedom in model selection.
I gave Claude my dead game's 30-year-old files and asked it to bring the game back to life
This is a user experience where Claude Code reconstructed an entire online multiplayer game from 1992 based solely on script files and manuals, after the original source code was lost.
Epistemic Blinding: An Inference-Time Protocol for Auditing Prior Contamination in LLM-Assisted Analysis
A simple anonymization technique to detect when an LLM analyzes based on its memorized knowledge instead of the data.
Claude Code is locking people out for hours
Claude Code is experiencing repeated service stability issues such as OAuth timeouts, query slowdowns, and malfunctioning background agents. Concerns are growing that this is not simply a bug, but a structural problem related to Anthropic's compute capacity limits.
Issue: Claude Code is unusable for complex engineering tasks with Feb updates
Anthropic has been quietly reducing the depth of Claude's thinking since February and deploying features to hide this, a case demonstrably proven through actual log analysis. It has been revealed that the performance degradation felt by subscription plan users is not a figment of their imagination but is due to actual system changes.
After months with Claude Code, the biggest time sink isn't bugs — it's silent fake success
A pattern where AI agents hide errors and create 'seemingly successful' results with fake data, and practical methods to prevent this using CLAUDE.md.
Running Gemma 4 locally with LM Studio's new headless CLI and Claude Code
This article explains how to run the Google Gemma 4 26B-A4B model locally on macOS using LM Studio 0.4.0's lms CLI and integrate it with Claude Code. Thanks to the MoE architecture, it can run at 51 tok/s on a 48GB MacBook Pro, enabling coding tasks without API costs.
Nanocode: The best Claude Code that $200 can buy in pure JAX on TPUs
An open-source library that allows you to train a 1.3B parameter coding agent model from scratch on a $200 (approximately 270,000 KRW) TPU, following Anthropic's Constitutional AI approach. It can serve as a hands-on reference for developers who want to directly understand the entire AI training pipeline.
Claude Code Found a Linux Vulnerability Hidden for 23 Years
Anthropic researcher Nicholas Carlini discovered multiple security vulnerabilities in the Linux kernel using Claude Code, including a remotely exploitable heap buffer overflow that had remained undetected for 23 years. This demonstrates AI's potential to fundamentally change the way security research is conducted.
I reverse-engineered why Claude Code burns through your usage so fast. 7 bugs that stack on top of each other — and the worst one activates when Extra Usage kicks in
A Max 20x subscriber reverse-engineered the Claude Code CLI source and discovered 7 bugs that drain usage abnormally fast. The core issue is a 'death spiral' where switching to Extra Usage demotes cache TTL from 1 hour to 5 minutes, causing costs to spike 2.8x.
Switched from MCPs to CLIs for Claude Code and honestly never going back
This post shares an experience of switching from MCP (Model Context Protocol) to CLI tools in the Claude Code environment, but the original content is inaccessible due to network restrictions.
I replaced chaotic solo Claude coding with a simple 3-agent team (Architect + Builder + Reviewer) — it's stupidly effective and token-efficient
This post shares the experience of adopting a 3-agent structure separating the roles of Architect, Builder, and Reviewer, instead of relying on a single Claude, to simultaneously improve coding quality and token efficiency.
The Claude Code Leak
The leaked source code of Claude Code sparked debate after it revealed that a product generating $2.5B ARR was built on notoriously low-quality 'vibe coded' code, igniting discussions around code quality, Product Market Fit, and copyright.
I built a tool that saves ~50K tokens per Claude Code conversation by pre-indexing your codebase
This post details the creation of a tool to pre-index a codebase to reduce the cost of repeatedly loading it for each conversation when using Claude Code.
Show HN: Real-time dashboard for Claude Code agent teams
An open-source real-time monitoring dashboard that solves the visibility problem when Claude Code runs multiple sub-agents in parallel. Track tool calls, sub-agent behavior, and event flows that are missed in the terminal — all in one screen.
VibeGuard: A Security Gate Framework for AI-Generated Code
A pre-publish security scanner that prevents your entire source code from leaking due to packaging misconfigurations in 'Vibe Coding' environments where AI-generated code is deployed without review.
Claude Code Unpacked : A visual guide
An unofficial visual guide analyzing the leaked Claude Code source code, covering the agent loop, 50+ tools, and undisclosed features. A great reference for developers who want to understand how Claude Code works internally.
I wish Claude just knew how I work without me explaining - so I made something that quietly observes me, learns and teaches it. Open source
A Mac app that automatically creates Skills by observing your actual work instead of repeatedly entering the same context for each Claude Code session.
I wrote a cron job that saves me ~2 hours of dead time on Claude Code every day
This method leverages the 5-hour usage window of the Claude Code Max plan, which starts based on the first message, by automatically sending a 'hi' message every morning to anchor the window to your work hours.
I read 17 papers on agentic AI workflows. Most Claude Code advice is measurably wrong
A post analyzing 17 real research papers on agentic AI coding workflows, revealing that widely spread advice like 'compliment prompts' and 'multi-agent teams' actually degrades performance.
Claude Code users hitting usage limits 'way faster than expected'
A prompt cache bug in Anthropic's AI coding assistant Claude Code has been confirmed to cause 10–20x token overconsumption, with users burning through $100–$200/month plans within hours.
Claude Code's source code has been leaked via a map file in their NPM registry
The source code of Anthropic's AI coding tool Claude Code was publicly exposed through source map files included in its NPM package, revealing an undisclosed feature roadmap and internal security mechanisms.
Accidentally created my first fork bomb with Claude Code
A real incident where Claude Code's SessionStart hook recursively spawned infinite Claude instances, creating a fork bomb that crashed a computer overnight and nearly resulted in a shocking API bill.
Claude Code bug can silently 10-20x API costs
A warning post about two cache-related bugs in Claude Code that can silently spike API costs by up to 10–20x. Users on the $200/month plan are reportedly burning through their limits far faster than expected.
Learn Claude Code by doing, not reading
An interactive Claude Code learning platform featuring a browser-based terminal simulator, Config Builder, quizzes, and more — letting you practice core Claude Code features without any installation or API key.
PSA: Claude Code has two cache bugs that can silently 10-20x your API costs — here's the root cause and workarounds
A warning post was shared about two bugs in Claude Code that could increase API costs by up to 10-20x due to a malfunctioning cache, but access to the original post is blocked, making it impossible to confirm the details.
My minute-by-minute response to the LiteLLM malware attack
A real-time incident response record in which an ML engineer, with the help of Claude Code, discovered and disclosed a supply chain attack hidden in litellm version 1.82.8 on PyPI within 72 minutes. It demonstrates that even non-security developers can detect and report malware using AI tools.
Running Claude Code fully offline on a MacBook — no API key, no cloud, 17s per task
A post sharing how to run Claude Code fully offline on a MacBook by connecting it to a local LLM without an API key or cloud, useful for developers who want to use an AI coding assistant at no cost.
Show HN: A plain-text cognitive architecture for Claude Code
A project that designs a hierarchical memory structure (Cognitive Architecture) based on plain-text files to address Claude Code's inability to retain memory across sessions. A practical reference for developers who want to use AI coding assistants consistently over the long term.
Claudini: Autoresearch Discovers State-of-the-Art Adversarial Attack Algorithms for LLMs
The Claude Code agent autonomously combined and improved existing jailbreak attack algorithms, achieving 40% ASR against GPT-OSS-Safeguard-20B and 100% ASR against Meta-SecAlign-70B.
Your Claude Code Limits Didn't Shrink — I Think the 1M Context Window Is Eating Them Alive
An analysis post arguing that the perceived sudden reduction in Claude Code limits is not an actual limit decrease, but rather a spike in token consumption driven by the 1M context window.
Building a coding agent in Swift from scratch
A learning project that reimplements the core architecture of Claude Code in Swift across 9 stages to understand why it works so well, directly validating the design philosophy of 'fewer tools, trust the model more.'
Claude Code: 6 Github repositories to 10x Your Next Project
A post introducing 6 GitHub repositories that boost Claude Code productivity based on real-world usage, covering memory management, UI generation, workflow automation, and other practical tools at a glance.
Code Review Agent Benchmark
Evaluates code review agents using executable tests instead of text similarity — Claude Code 32.1%, all 4 tools combined 41.5%, vs human 100%
Agent Flow: A beautiful way to visualize what Claude Code does
Agent Flow visualizes Claude Code execution in real time — trace tool call chains, time distribution, and subagent coordination for debugging and prompt improvement
Claude Code Cheat Sheet
A cheat sheet for developers who use Claude Code daily but keep forgetting commands — covering everything from keyboard shortcuts to MCP configuration, memory management, and CLI flags, on one page. With auto-update to always stay current.
How I'm Productive with Claude Code
A hands-on account of building parallel agent workflows and infrastructure automation with Claude Code over 6 weeks — the key insight being the shift from 'coder' to 'agent manager.'
The 5 levels of Claude Code (and how to know when you've hit the ceiling on each one)
A practitioner's guide that breaks down Claude Code usage into 5 levels — from raw prompting to multi-agent orchestration — clearly identifying when you'll hit the wall at each stage.
Karpathy says he hasn't written a line of code since December and is in "perpetual AI psychosis." How many Claude Code users feel the same?
Karpathy: zero lines of code written since December, directing agents 16 hours/day in "perpetual AI psychosis" — Claude Code users share similar experiences
New in Claude Code: Telegram and Discord remote control
Claude Code Channels — a new feature letting you control Claude Code sessions from your phone via Telegram or Discord. Direct work and approve requests without being at your terminal.
Most used claude code development workflows
A curated GitHub repo collecting real-world Claude Code development workflow best practices.