Show HN: Baton – A desktop app for developing with AI agents
TL;DR Highlight
A desktop app that lets you run multiple AI coding agents (Claude Code, Gemini CLI, etc.) simultaneously in separate git worktrees and monitor them all in one place — ideal for developers who want to split work by feature and develop in parallel.
Who Should Read
Developers who want to run multiple AI coding agents like Claude Code or Codex CLI simultaneously and manage the progress of each task from a single interface. Especially suited for those who want to develop multiple features in parallel without branch conflicts.
Core Mechanics
- Baton is a desktop app for running and managing multiple AI coding agents simultaneously (supports all CLI-based agents including Claude Code, Codex CLI, OpenCode, and Gemini CLI), available as a free download for Mac, Windows, and Linux.
- Each task (workspace) is fully isolated via git worktree (a git feature that maintains multiple independent working directories within a single repository), so agents never interfere with or conflict with each other — each works on its own branch without needing to switch branches or use stash.
- The dashboard displays each agent's status with badges: a blue 'Input' badge when waiting for input, a green 'Done' badge when the task is complete, and a red 'Error' badge when an error occurs, so there is no need to check each tab individually. Claude Code is the best-supported agent.
- When starting a task, you describe what you want to build and the AI automatically generates a branch name, workspace title, and description. Enabling 'Accept Edits' mode lets the agent start working immediately without permission prompts.
- A built-in diff viewer based on Monaco editor (the code editor component used in VS Code) lets you review agent-made changes file by file before opening a PR, with the ability to roll back individual files. A 'Live follow mode' is also supported for tracking changes in real time while the agent is working.
- A built-in MCP (Model Context Protocol, an open protocol that lets AI agents call external tools) server allows agents to directly create new workspaces or launch parallel tasks during a conversation.
- Additional code review utilities are built in, including fuzzy file search and full-text content search powered by fzf and ripgrep, git blame, and per-file commit history. Frequently used shell commands or agent prompts can be saved as 'Actions' for reuse.
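The worktree isolation described above can be reproduced with plain git. The following is a minimal sketch, not Baton's actual implementation: the temporary repository, the branch names (feature/login, feature/search), and the directory names are all hypothetical and chosen only for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical throwaway repository so the sketch is safe to run anywhere.
demo="$(mktemp -d)"
git init -q "$demo/repo"
cd "$demo/repo"
git -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false \
  commit -q --allow-empty -m "initial commit"

# One worktree per task, each created on its own new branch.
# Both directories share the same repository history but have separate files.
git worktree add -b feature/login  ../task-login
git worktree add -b feature/search ../task-search

# A file written in one worktree does not appear in the other.
echo "login work" > ../task-login/login.txt
ls ../task-search

# Show every working directory attached to this repository.
git worktree list
```

Because each task directory has its own checked-out branch, an agent running in one directory never needs to switch branches or stash changes made in another.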
Evidence
- Commenters criticized Baton's differentiators as unclear given the large number of similar open-source agent managers emerging. Conductor, superset.sh, t3.codes, and cmux were mentioned as alternatives, and one commenter noted that Claude Desktop itself has supported worktree-based parallel agents for over a month.
- Others argued that these agent managers are essentially rebuilding IDEs, and that improving VS Code would be more practical since it already runs as a web app in containers, supports workspaces, and has an extension ecosystem (visualJJ, a worktree/workspace manager, was also mentioned).
- Practical questions arose about the cost of running multiple Claude Code agents simultaneously, with comments asking whether users were expensing it to their company, which indicates that cost is a significant real-world barrier.
- More fundamental questions were raised about what people are actually building with agents, worktrees, and harnesses. Commenters shared that most use cases stay at the level of generating boilerplate components for frameworks like React or Laravel, or small personal apps, with one person describing using agents to remove dead code from large codebases as a time-saving task.
- There was also UX feedback about the site's design: one commenter said they gave up reading within 30 seconds due to a TV-noise background effect and flickering thin blue lines.
- Separately, someone shared a similar terminal-based tool they had built and published on GitHub (agent-storm).
How to Apply
- If you need to develop multiple features simultaneously with Claude Code, install Baton and create a workspace per feature. Each agent works on its own independent git branch, enabling parallel development without conflicts, and you can review changes with the diff viewer and open a PR when done.
- If you find yourself constantly switching terminal tabs to check whether an agent has finished, use Baton's status badges and dock notifications. You'll be alerted the moment an agent reaches a completed, error, or input-waiting state, so you can check back while doing other work.
- If you have frequently used agent run options (e.g., flags like --dangerously-skip-permissions) or project initialization commands, save them with Custom Agent Presets and Workspace Setup so you don't have to re-enter them every time you create a new workspace.
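The review step mentioned above (inspect an agent's changes file by file, roll back individual files, then open a PR) can be approximated with plain git. This is only a sketch of the underlying idea under stated assumptions, not how Baton's diff viewer works internally: the repository, file names, branch name (feature/demo), and the `g` helper are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical throwaway repository with two tracked files.
repo="$(mktemp -d)/repo"
git init -q "$repo"
cd "$repo"
# Helper that supplies a demo identity so commits work on any machine.
g() { git -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false "$@"; }

printf 'v1\n' > a.txt
printf 'v1\n' > b.txt
g add a.txt b.txt
g commit -q -m "base"
base="$(git rev-parse HEAD)"

# Simulate an agent changing both files on its own branch.
g checkout -q -b feature/demo
printf 'v2\n' > a.txt
printf 'v2\n' > b.txt
g commit -q -am "agent changes"

# Review: list the changed files, then inspect one file's diff.
git --no-pager diff --name-only "$base" HEAD
git --no-pager diff "$base" HEAD -- a.txt

# Roll back a single file to its base version and keep the other change.
git checkout "$base" -- b.txt
g commit -q -m "roll back b.txt"

# Only a.txt still differs from the base commit.
git --no-pager diff --name-only "$base" HEAD
```

The per-file rollback uses `git checkout <commit> -- <path>`, which restores just that path and leaves the rest of the branch untouched.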
Code Example
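# Installing via AppImage on Linux
# AppImages typically need FUSE; run only the line for your distribution
sudo apt install fuse libfuse2 # Debian/Ubuntu
sudo dnf install fuse fuse-libs # Fedora
# Make the downloaded AppImage executable, then launch it
chmod +x baton-*.AppImage
./baton-*.AppImage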
# Verifying download integrity
# macOS
shasum -a 256 [file]
# Linux
sha256sum [file]
# Windows (PowerShell)
Get-FileHash [file] -Algorithm SHA256
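The commands above only print a hash; the comparison against the published checksum is still manual. A minimal sketch of automating that comparison on macOS or Linux follows. The `verify_sha256` function name is hypothetical, and the demo file contains the string "abc" only because its SHA-256 digest is a well-known test vector; substitute the downloaded file and the checksum published for it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# verify_sha256 FILE EXPECTED_HEX
# Succeeds (exit status 0) only if FILE's SHA-256 digest equals EXPECTED_HEX.
verify_sha256() {
  local file="$1" expected="$2" actual
  if command -v sha256sum >/dev/null 2>&1; then
    actual="$(sha256sum "$file" | awk '{print $1}')"      # Linux
  else
    actual="$(shasum -a 256 "$file" | awk '{print $1}')"  # macOS
  fi
  # Compare case-insensitively, since some tools print uppercase hex.
  expected="$(printf '%s' "$expected" | tr '[:upper:]' '[:lower:]')"
  [ "$actual" = "$expected" ]
}

# Demo with a known test vector: the SHA-256 digest of the string "abc".
tmp="$(mktemp)"
printf 'abc' > "$tmp"
if verify_sha256 "$tmp" "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"; then
  echo "checksum OK"
else
  echo "checksum MISMATCH" >&2
fi
```

Returning an exit status rather than printing the digest lets the check gate later steps, for example running `chmod +x` only after the download verifies.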
Related Papers
Show HN: Forge – Guardrails take an 8B model from 53% to 99% on agentic tasks
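Forge is a newly released Python framework that wraps a small local LLM (8B) in guardrails (a structural safety net), raising the success rate on multi-step agentic tasks from 53% to 99%. It is drawing attention because it hardens the execution environment rather than modifying the model itself.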
Mini Shai-Hulud Strikes Again: 314 npm Packages Compromised
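On May 19, 2026, a single npm account was hijacked and 637 malicious versions were published within 22 minutes. Packages with millions of monthly downloads, including echarts-for-react and size-sensor, were infected in a sophisticated supply-chain attack that steals AWS credentials and SSH keys and even hijacks AI coding agents.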
Show HN: Semble – Code search for agents that uses 98% fewer tokens than grep
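A search library that, when an AI agent explores a codebase, returns only the relevant code snippets for a natural-language query instead of relying on grep plus file reads, cutting token usage by about 98%.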
Zerostack – A Unix-inspired coding agent written in pure Rust
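Zerostack, an ultra-lightweight (~8 MB RAM) coding agent written in Rust, has been released as an alternative to coding agents like Claude Code and OpenCode that consume several GB of memory. It can be used in low-spec environments, and there is active discussion comparing it with similar projects that commenters built themselves.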
Δ-Mem: Efficient Online Memory for Large Language Models
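A paper proposing δ-mem, a lightweight memory module that lets an LLM efficiently remember past information without enlarging its context window. Because it can be attached to an existing LLM to improve long-term memory performance without changing or fine-tuning the model, it is attracting interest from developers of agent systems.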
How Claude Code works in large codebases
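An article in which Anthropic summarizes the patterns it has seen when running Claude Code in monorepos with millions of lines, legacy systems, and environments with dozens of microservices. It covers why agentic search is used instead of RAG, along with the limitations seen in practice.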