Show HN: I put an AI agent on a $7/month VPS with IRC as its transport layer
TL;DR Highlight
A developer shares how they built an AI agent for their portfolio site using IRC as the transport layer — enabling direct GitHub code analysis and visitor Q&A — running on a $7/month VPS. Going beyond the typical 'AI chatbot portfolio' that simply feeds a resume into an LLM, this system provides concrete answers grounded in the actual codebase, making it a noteworthy practical example of AI agent architecture design.
Who Should Read
Full-stack or backend developers who want to add an AI agent to a personal portfolio or small-scale service but are concerned about cost and security. Especially those curious about how to actually implement multi-agent architecture and tiered inference cost optimization strategies in practice.
Core Mechanics
- Most portfolio AI chatbots simply feed resume content into an LLM and have it paraphrase that back to visitors; the author calls this a 'magic show.' Instead, they built an agent that clones actual GitHub repos, reads CI configurations, and answers with specific metrics.
- To establish clear security boundaries, the agent was split in two. The public-facing nullclaw runs on a $7/month VPS with access only to public GitHub repos and portfolio context, while the private ironclaw runs on a separate server connected via Tailscale and handles email, calendar, and personal context. This boundary ensures personal data remains safe even if the public box is compromised.
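The private/public split hinges on ironclaw being reachable only over the Tailscale tunnel. A minimal sketch of that idea, assuming the private agent binds exclusively to its tailnet address (the IP below is a placeholder in Tailscale's 100.64.0.0/10 range, not from the post):

```python
import socket

# Placeholder tailnet IP; Tailscale assigns nodes addresses in 100.64.0.0/10.
TAILNET_ADDR = "100.101.102.103"

def make_private_listener(port: int = 9000) -> socket.socket:
    """Bind only to the tailnet interface, never to 0.0.0.0, so the
    public box can reach this agent solely through the Tailscale tunnel."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((TAILNET_ADDR, port))  # fails fast if the tailnet is down
    srv.listen()
    return srv
```

Binding to the tailnet address rather than firewalling a public listener means a misconfigured firewall cannot silently expose the private agent.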
- There were three reasons for choosing IRC as the transport layer: aesthetics that match the terminal UI of the portfolio site, full ownership of the stack with no platform dependency, and the fact that IRC is a battle-tested 30-year-old protocol. Discord or Telegram can change their API policies at any time, but IRC has no vendor lock-in.
- Model selection was intentionally tiered. The hot path — greetings and simple questions — uses Claude Haiku 4.5 (fast and cheap, a few cents per conversation), and only escalates to Claude Sonnet 4.6 when cloning repos or analyzing multiple files is required. The philosophy is: 'pay for reasoning only when reasoning is needed.'
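The tiered routing can be sketched as a small dispatcher. This is not the author's code; the model IDs follow the post's names, and the keyword heuristic for escalation is an assumption:

```python
# Hot path: greetings and simple questions go to the cheap model.
CHEAP_MODEL = "claude-haiku-4.5"
# Escalation path: repo cloning and multi-file analysis.
STRONG_MODEL = "claude-sonnet-4.6"

# Illustrative triggers; a real router might use tool-call intent instead.
ESCALATION_TRIGGERS = ("clone", "repo", "analyze", "coverage", "test suite")

def pick_model(message: str, needs_tools: bool = False) -> str:
    """Default to the cheap model; pay for reasoning only when the
    request looks like it needs repo-level analysis or tool use."""
    text = message.lower()
    if needs_tools or any(t in text for t in ESCALATION_TRIGGERS):
        return STRONG_MODEL
    return CHEAP_MODEL
```

A greeting like "hello there" stays on the cheap path, while "can you analyze the repo?" escalates, which is the whole cost story in miniature.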
- Cost control was built into the core of the design. Hard caps of $2/day and $30/month are set so that even if someone intentionally tries to exhaust the API budget, there's a limit. nullclaw operates in sandbox mode with a 10-action-per-hour limit and only read-only tools allowed.
- The tech stack is impressively lightweight. nullclaw is a 4MB Zig binary using only ~1MB RAM, the IRC server ergo is a Go binary using 2.7MB RAM, and the web IRC client gamja is a 152KB build. Cloudflare sits in front so visitors never directly reach the server, handling TLS termination, rate limiting, and bot filtering.
- Security hardening was applied at the perimeter box level. SSH uses a non-root user, key authentication, and a non-standard port; UFW opens only three ports (SSH/IRC with TLS/HTTPS); Let's Encrypt auto-renews certificates; security updates are applied automatically; and all tool calls are audit-logged. The box has exactly two roles (ergo + nullclaw), keeping the attack surface minimal.
Evidence
- Commenters pushed back on the author's claim that 'even if the public box is compromised, the blast radius is limited to a $2/day IRC bot': since nullclaw can route to ironclaw, access to email and personal data is possible in practice.
- Because the chat is a public lobby where all visitors see each other's messages, there were serious concerns it could become a hub for distributing illegal content, and eyewitness accounts described the chat going 'completely out of control' during testing.
- On the Haiku/Sonnet model choices, commenters pointed out cheaper alternatives on OpenRouter: MiniMax M2.7 at $0.30/M input tokens and Kimi K2.5 at $0.45/M, versus Haiku 4.5's $1/M, with comparable or better performance for most tasks.
- Technical criticism targeted IRC's at-most-once message delivery: if the agent disconnects, messages sent in the interim are lost. That is fine for casual conversation but insufficient for an agent handling real tasks, which needs at-least-once delivery guarantees. SSE (Server-Sent Events) or HTTP polling with ack-based deduplication were suggested as better alternatives; one team that built a similar multi-agent architecture on FastAPI + SQLite reported about 50 agent crashes per day, with dedup state persistence the first problem they hit.
- On security, there was sharp criticism that prompt injection defenses amounting to 'write don't do this in the system prompt' don't constitute real security.
- Unattended security upgrades were flagged as a potential security risk in themselves, with a recent litellm library security incident cited as an example of auto-updates becoming an attack vector.
- The $2/day cost cap turned out to be the 'Achilles heel' in practice: reports came in that the bot had already stopped responding shortly after the post was shared.
- Suggestions included caching frequently asked questions or leveraging API free tiers to reduce costs, though the daily hard cap approach was also praised as 'smart,' with positive recognition that it places cost governance at a layer AI coding tools often get wrong.
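The ack-based deduplication commenters suggested can be sketched briefly. The sender retries until it sees an ack (at-least-once delivery); the agent drops duplicates by message ID so each task runs once. The message-ID scheme and in-memory `seen` set are illustrative assumptions; as the FastAPI + SQLite team's experience suggests, a real agent would persist this state to survive crashes:

```python
class DedupInbox:
    """At-least-once transport + ID-based dedup = effectively-once handling."""

    def __init__(self):
        self.seen: set[str] = set()
        self.results: dict[str, str] = {}

    def handle(self, msg_id: str, payload: str) -> str:
        """Process a message once; redeliveries get the cached ack back."""
        if msg_id in self.seen:
            return self.results[msg_id]  # duplicate: re-ack, don't re-run
        self.seen.add(msg_id)
        # Stand-in for actually executing the task described by `payload`.
        self.results[msg_id] = f"ack:{msg_id}"
        return self.results[msg_id]
```

With this in place, a client that never received an ack can safely resend the same message ID after a reconnect, which is exactly what raw IRC cannot offer.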
How to Apply
- If you want to add an AI chatbot to a portfolio or personal project, provide actual GitHub repo URLs as context instead of resume text and configure the agent to clone and read the repo. This enables it to answer specific questions like 'what's the CI coverage percentage?' or 'what testing framework does this use?' based on real code.
- If you're running a public service and worried about LLM API cost spikes, apply a tiered inference structure like the author's: use a small model like Haiku for the hot path (greetings, simple questions) and escalate to a Sonnet-class model only when actual analysis is needed, then set a hard cap of $2–$5/day at the API level to limit damage from abuse.
- If you need to enforce access boundaries between public and private data in a multi-agent system, adopt an architecture like nullclaw/ironclaw: physically separate the agents onto different servers and allow only internal communication via Tailscale, so that even if the public agent is compromised, the path to private data is cut off. Note, however, that the routing path between the two agents can itself become an attack vector, so the conditions for accessing ironclaw must be designed strictly.
- If you're concerned about an agent's filesystem access and execution permissions, combine workspace-directory-scoped file access, a command allowlist with only read-only tools, and an action rate limit (e.g., 10 actions/hour) to run in supervised mode. This limits what an attacker can do even if the agent is hijacked.
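The workspace-scoping plus read-only allowlist can be sketched in a few lines. The workspace path and the allowlist contents are assumptions for illustration; nullclaw's actual tool set isn't published in this summary:

```python
import pathlib

# Hypothetical workspace root and read-only tool allowlist.
WORKSPACE = pathlib.Path("/srv/agent/workspace")
READ_ONLY_TOOLS = {"cat", "ls", "grep", "git-log", "git-show"}

def check_tool_call(tool: str, target: str) -> bool:
    """Permit only allowlisted read-only tools, and only on paths that
    resolve inside the workspace directory (blocks ../ escapes)."""
    if tool not in READ_ONLY_TOOLS:
        return False
    resolved = (WORKSPACE / target).resolve()
    return resolved.is_relative_to(WORKSPACE.resolve())
```

Resolving the path before the containment check is the important step: a naive string-prefix comparison would wave through `../../etc/passwd`.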
Related Papers
Show HN: adamsreview – better multi-agent PR reviews for Claude Code
An open-source plugin for Claude Code that runs up to seven parallel sub-agents, each reviewing a PR from a different perspective, and applies automatic fixes. It claims to catch more real bugs than the built-in /review or CodeRabbit, though the community voiced skepticism about its complexity and practical value.
How Fast Does Claude, Acting as a User Space IP Stack, Respond to Pings?
An experiment that had Claude Code parse raw IP packets and construct ICMP echo replies so it could actually answer pings, an entertaining case that pushes the idea of 'Markdown is the code and the LLM is the processor' all the way down to the network stack.
Show HN: Git for AI Agents
A version-control tool that automatically tracks every tool call made by AI coding agents (such as Claude Code) and supports blame down to which prompt wrote which line of code.
Principles for agent-native CLIs
An article laying out principles for designing CLI tools that AI agents can use well; as agents invoke CLIs as tools more and more often, this design approach is becoming practically important.
Agent-harness-kit scaffolding for multi-agent workflows (MCP, provider-agnostic)
A scaffolding tool that orchestrates multiple AI agents collaborating in divided roles, letting you assemble a multi-agent pipeline quickly with zero configuration, much like Vite.
Show HN: Tilde.run – Agent sandbox with a transactional, versioned filesystem
A tool providing an isolated sandbox where an AI agent can touch real production data and still roll back, unifying GitHub, S3, and Google Drive into a single versioned filesystem.