Show HN: Real-time dashboard for Claude Code agent teams
TL;DR Highlight
An open-source real-time monitoring dashboard that solves the visibility problem when Claude Code runs multiple sub-agents in parallel. Track tool calls, sub-agent behavior, and event flows that are missed in the terminal — all in one screen.
Who Should Read
Developers running Claude Code with multiple agents in parallel or operating complex multi-agent workflows. A perfect fit when you want to understand what's happening in real time rather than digging through post-mortem logs after an agent fails.
Core Mechanics
- When Claude Code autonomously spawns sub-agents and makes tool calls, the terminal shows only a fraction of the total activity. Sub-agents operate essentially invisibly, and when something goes wrong three levels deep, the only option has been to sift through logs after the fact. This project addresses that gap.
- Instead of OTEL (OpenTelemetry, the open standard for distributed tracing), it uses Claude Code's hook system to capture events. Hooks provide a complete picture of agent behavior and record tool call sequences and their contents as-is.
- Installation is straightforward: `claude plugin marketplace add simple10/agents-observe` → `claude plugin install agents-observe` → restart Claude Code. From the next session onward, a Docker container starts automatically and the dashboard is available at http://localhost:4981. Docker and Node.js are prerequisites.
- To minimize the performance impact of hooks, the implementation runs them asynchronously in the background instead of blocking (synchronous) execution. In environments where agents make dozens of tool calls per minute, even 100 ms of blocking per hook means every 20 to 30 tool calls add roughly 2 to 3 seconds of waiting, and the total grows with each additional parallel agent.
- Both local and remote setups are supported. Running the server and dashboard on a remote VM allows multiple Claude Code instances to send events to the same server, enabling monitoring of an entire multi-agent team from a single place.
- The dashboard includes powerful filtering, search, and multi-agent session visualization. You can review which tools each agent called and in what order via a timeline view, allowing you to reconstruct 'how the agent arrived at this conclusion' after the fact.
- Two slash commands are provided — /observe and /observe status — to open the dashboard URL or check server status directly within a Claude Code session.
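The non-blocking hook pattern described in the list above can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: it assumes the hook command receives the event as JSON on stdin (as Claude Code command hooks do), and the names `forward_in_background`, `FORWARDER`, `COLLECTOR_URL`, and the `/api/events` path are hypothetical. The hook hands the event to a detached child process and returns without waiting for the network call.

```python
import json
import subprocess
import sys

# ASSUMPTION: the ingest path is hypothetical; only the host and port
# (localhost:4981) come from the article.
COLLECTOR_URL = "http://localhost:4981/api/events"

# Source of the child process: read one JSON event from stdin and POST it.
# Proxies are disabled because the collector is expected to be local.
FORWARDER = r"""
import sys, urllib.request
body = sys.stdin.buffer.read()
req = urllib.request.Request(
    sys.argv[1], data=body, headers={"Content-Type": "application/json"}
)
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
try:
    opener.open(req, timeout=5)
except OSError:
    pass  # monitoring failures must never surface in the agent
"""


def forward_in_background(event: dict, url: str) -> subprocess.Popen:
    """Start a detached sender for one event and return without waiting."""
    child = subprocess.Popen(
        [sys.executable, "-c", FORWARDER, url],
        stdin=subprocess.PIPE,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # POSIX: detach from the hook's session
    )
    child.stdin.write(json.dumps(event).encode("utf-8"))
    child.stdin.close()
    return child  # deliberately not waited on


def main() -> None:
    """Hook entry point: read the event from stdin and hand it off."""
    event = json.load(sys.stdin)
    forward_in_background(event, COLLECTOR_URL)
```

A hook script built this way would call `main()` as its last line. The cost on the agent's critical path is then one process spawn and a small pipe write, while the HTTP request and its timeout happen in the child.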
Evidence
- A user running multiple Claude Code agents in parallel on a remote VM shared firsthand experience that 'throughput drops sharply when hooks block on the agent critical path.' With agents making dozens of tool calls per minute, hundreds of milliseconds accumulating per hook is very noticeable. They assessed the Docker-based service pattern as the right trade-off for achieving observability without adding overhead to the agents themselves.
- Multiple people raised the transparency issue that 'there's no way to know whether an agent's self-report matches actual outcomes' in multi-agent operations. When a coordinator spawns builder, reviewer, and tester agents in parallel, the results each agent reports may be 'sanitised optimism,' and it was noted that event stream logs cannot verify whether results are actually correct.
- A question was raised about concurrency handling when multiple agent teams write to the same JSONL file simultaneously. No concrete answer regarding log file conflict handling in parallel agent environments was confirmed in the thread.
- There were also surprised reactions to Claude Code usage costs, with comments like 'are you spending hundreds to thousands of dollars a day on Claude tokens?', suggesting that heavy users running multiple agents in parallel for extended periods are the primary target audience. There were also realistic comments that average developers hit usage limits quickly.
- Someone asked whether the dashboard tracks the full tree or only one level when sub-agents spawn their own sub-agents (a tree structure). This is an important edge case in real production environments, but no clear answer was confirmed in the thread.
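As a rough check of the arithmetic behind the blocking-hook complaint, the added wait can be estimated as tool calls × hooks per call × blocking time per hook. The sketch below is illustrative only: the function name and every input figure are assumptions chosen to match the "dozens of tool calls per minute" and "100 ms per hook" orders of magnitude, not measurements from the project or the thread.

```python
def added_wait_seconds(tool_calls: int, hooks_per_call: int, blocking_ms: float) -> float:
    """Total time an agent spends waiting on synchronous hooks."""
    return tool_calls * hooks_per_call * blocking_ms / 1000.0


# ASSUMPTION: 25 tool calls in one minute, one blocking hook per call, 100 ms each.
per_agent = added_wait_seconds(tool_calls=25, hooks_per_call=1, blocking_ms=100)

# ASSUMPTION: four agents running in parallel, each paying the same cost.
agents = 4
team_total = per_agent * agents

print(f"per agent, per minute: {per_agent:.1f} s")   # 2.5 s
print(f"across {agents} agents: {team_total:.1f} s")  # 10.0 s
```

If a hook fires both before and after each tool call, `hooks_per_call` doubles and so does the wait, which is why moving the work off the critical path matters more as call volume grows.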
How to Apply
- If you're running multiple agents in parallel with Claude Code and struggle to identify the cause when something goes wrong, install with `claude plugin install agents-observe` and monitor tool call flows in real time at the localhost:4981 dashboard, so you can pinpoint the problem as it happens rather than through post-mortem log analysis.
- If you're operating a Claude Code agent team on a remote VM, bring up the server with Docker Compose and configure multiple instances to send events to the same endpoint to observe distributed agents from a single dashboard. `docker-compose.yml` is already included.
- If you're skeptical about the reliability of an agent's output, trace the actual path through the event timeline (what files the agent read and what commands it executed) to uncover discrepancies between 'self-reported' behavior and actual actions.
- If you've tried building your own hook-based monitoring system but abandoned it due to performance issues, reference this project's background async hook pattern to redesign your system so it doesn't block the agent critical path.
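The idea of checking an agent's self-report against its event timeline can be sketched as follows. This is a hedged illustration, not the dashboard's logic: the JSONL event schema (`ts`, `agent_id`, `tool_name`, `tool_input`) and the helper names are assumptions made for the example, and the project's real event format may differ.

```python
import json
from collections import defaultdict


def load_events(jsonl_text: str) -> list[dict]:
    """Parse one JSON event per non-empty line."""
    return [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]


def timeline_by_agent(events: list[dict]) -> dict[str, list[tuple[str, dict]]]:
    """Group events per agent, ordered by timestamp."""
    timelines = defaultdict(list)
    for ev in sorted(events, key=lambda e: e["ts"]):
        timelines[ev["agent_id"]].append((ev["tool_name"], ev.get("tool_input", {})))
    return dict(timelines)


def unobserved_commands(timeline: list[tuple[str, dict]], claimed: list[str]) -> list[str]:
    """Return commands the agent claims to have run that never appear as Bash calls."""
    ran = {inp["command"] for tool, inp in timeline if tool == "Bash" and "command" in inp}
    return [cmd for cmd in claimed if cmd not in ran]


# Hypothetical events for a "tester" sub-agent, deliberately out of order.
sample = "\n".join([
    json.dumps({"ts": 2, "agent_id": "tester", "tool_name": "Bash",
                "tool_input": {"command": "ls tests"}}),
    json.dumps({"ts": 1, "agent_id": "tester", "tool_name": "Read",
                "tool_input": {"file_path": "app.py"}}),
])

timeline = timeline_by_agent(load_events(sample))["tester"]
# The agent reports it ran the test suite; the event stream shows no such command.
missing = unobserved_commands(timeline, claimed=["npm test", "ls tests"])
```

Here `missing` contains `"npm test"`, flagging a claimed action with no matching event. As the thread notes, this only shows what the agent did, not whether the result was correct.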
Code Example
# Installation
claude plugin marketplace add simple10/agents-observe
claude plugin install agents-observe
# After restarting Claude Code, the Docker container starts automatically
# Dashboard: http://localhost:4981
# Slash commands available within a Claude Code session
/observe # Open dashboard URL + check server status
/observe status # Server health check and URL display
# Run directly with Docker Compose
docker-compose up -d
Related Papers
Show HN: Forge – Guardrails take an 8B model from 53% to 99% on agentic tasks
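Forge is a newly released Python framework that wraps a small local LLM (8B) in guardrails (structural safety nets), raising the success rate on multi-step agentic tasks from 53% to 99%. It is drawing attention because it hardens the execution environment without touching the model itself.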
Mini Shai-Hulud Strikes Again: 314 npm Packages Compromised
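On May 19, 2026, a single npm account was hijacked and 637 malicious versions were published within 22 minutes. Packages with millions of monthly downloads, including echarts-for-react and size-sensor, were infected in a sophisticated supply-chain attack that steals AWS credentials and SSH keys and even hijacks AI coding agents.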
Show HN: Semble – Code search for agents that uses 98% fewer tokens than grep
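A search library that lets AI agents exploring a codebase retrieve only the relevant code snippets through natural-language queries instead of grep plus file reads, cutting token usage by about 98%.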
Zerostack – A Unix-inspired coding agent written in pure Rust
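Zerostack, an ultra-lightweight (~8MB RAM) coding agent written in Rust, has been released as an alternative to coding agents such as Claude Code and OpenCode that consume several GB of memory. It can run in low-spec environments, and there is lively discussion comparing it with similar projects that commenters built themselves.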
Δ-Mem: Efficient Online Memory for Large Language Models
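A paper proposing δ-mem, a lightweight memory module that lets an LLM efficiently remember past information without enlarging its context window. It can be attached to an existing LLM without changing or fine-tuning the model to improve long-term memory performance, which has drawn interest from developers of agent systems.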
How Claude Code works in large codebases
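An article in which Anthropic summarizes patterns from operating Claude Code in multi-million-line monorepos, legacy systems, and environments with dozens of microservices. It covers why agentic search is used instead of RAG, along with the limitations seen in practice.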