Building a coding agent in Swift from scratch
TL;DR Highlight
A learning project that reimplements the core architecture of Claude Code in Swift across 9 stages to understand why it works so well, directly validating the design philosophy of 'fewer tools, trust the model more.'
Who Should Read
Backend or iOS developers who want to build their own LLM-based coding agents, or developers who want to understand at a principled level how AI agents like Claude Code work internally.
Core Mechanics
- The core hypothesis of this project challenges the conventional wisdom that 'more tools make a better agent.' The author argues that Claude Code is effective not because of complex orchestration, but because of a small number of simple yet high-quality tools—like a search tool and a file editing tool—combined with a high degree of trust in the model.
- Five design principles for agent loop architecture are presented: ① A few high-quality tools > a vast tool catalog, ② thin orchestration (let the model decide directly), ③ explicit task state management improves reliability, ④ controlled context injection matters more than persistent memory, ⑤ Context Compaction is not just token savings—it's an actual product feature.
- The implementation is structured as a 9-stage learning series, with each stage designed to isolate and experiment with one mechanism—loop, tool dispatch, state management, sub-agents/skills/context compaction, etc. This allows you to see firsthand which architectural decisions actually make a difference.
- The choice of Swift as the implementation language goes beyond personal preference. Swift's structured concurrency (async/await plus child tasks whose lifetimes are scoped to their parent) maps naturally onto an agent loop, and the strong type system reduces the JSON string parsing errors common in other languages when defining tool schemas.
- The project structure consists of Sources, Tests, docs, and skills/example directories, and includes an .env.example file and Package.swift for building directly with Swift Package Manager. A GitHub Actions workflow is also included.
- The author concludes that 'most of the magic is in state management and control flow': explicit control over what the agent does, and when, matters more than the raw capability of the LLM itself.
- The full 9-part learning series is publicly available at ivanmagda.dev, so you can go beyond reading the code and also review the step-by-step design intentions and experimental results in written form.
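The point about typed tool schemas and tool dispatch can be illustrated with a small sketch. This is not the project's code: the `Tool` protocol, `ToolRegistry`, `SearchTool`, and the error strings are hypothetical names chosen for illustration, and the search tool works on an in-memory dictionary rather than the file system. The idea shown is that each tool declares a `Decodable` input type, so malformed arguments from the model fail at decode time rather than inside the tool.

```swift
import Foundation

// Hypothetical protocol: every tool names itself and declares a typed input.
protocol Tool {
    associatedtype Input: Decodable
    static var name: String { get }
    func run(_ input: Input) throws -> String
}

// Typed arguments for the search tool; decoded from the model's JSON.
struct SearchInput: Decodable {
    let pattern: String
}

// Illustrative search tool over in-memory files (path -> contents).
struct SearchTool: Tool {
    static let name = "search"
    let files: [String: String]

    func run(_ input: SearchInput) throws -> String {
        var hits: [String] = []
        // Sort by path so the output order is deterministic.
        for (path, contents) in files.sorted(by: { $0.key < $1.key }) {
            let lines = contents.components(separatedBy: "\n")
            for (index, line) in lines.enumerated() where line.contains(input.pattern) {
                hits.append("\(path):\(index + 1): \(line)")
            }
        }
        return hits.isEmpty ? "no matches" : hits.joined(separator: "\n")
    }
}

// Thin dispatcher: maps a tool name to a closure that decodes and runs it.
struct ToolRegistry {
    private var handlers: [String: (Data) throws -> String] = [:]

    mutating func register<T: Tool>(_ tool: T) {
        handlers[T.name] = { json in
            let input = try JSONDecoder().decode(T.Input.self, from: json)
            return try tool.run(input)
        }
    }

    // Errors are returned as text so the model can read them and retry.
    func dispatch(name: String, argumentsJSON: String) -> String {
        guard let handler = handlers[name] else {
            return "error: unknown tool \(name)"
        }
        do {
            return try handler(Data(argumentsJSON.utf8))
        } catch is DecodingError {
            return "error: invalid arguments for \(name)"
        } catch {
            return "error: \(name) failed: \(error)"
        }
    }
}
```

Returning errors as plain strings, instead of throwing out of the loop, is one way to keep orchestration thin: the model sees the failure in its context and decides what to do next.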
Evidence
- Context management in long sessions was reported as the hardest part. One commenter noted that well before hard context limits are hit, accumulated tool call history degrades quality, showing up as 'let me double-check' loops or ambiguous tool selection. Mitigations that helped: ① summarizing completed subtask outputs and replacing them with compact working memory blocks, ② aggressively dropping intermediate file-read results after extracting the needed information, and ③ including a clear definition of the 'done' state in the initial system prompt.
- One comment raised legal concerns about naming the CLI component 'claude,' warning that it could conflict with Anthropic's trademark and result in a takedown request.
- A developer shared a link to 'Operator' (github.com/bensyverson/Operator), a Swift library they built to handle the core parts of the agent loop, noting it could save time for others doing similar work.
- Comments praised the step-by-step build approach for making failure modes explicit. Building up through loop → tool dispatch → state management → sub-agents/skills/compaction served as a reminder that 'most of the magic is in state management and control flow.'
- One commenter asked whether, once Apple Intelligence uses Gemini as its core, this project could be dropped in to create a fully local AI agent. No direct answer was given, but the question reflects interest in local model integration within the Swift ecosystem.
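The first two mitigations from the comments (replacing a completed subtask's output with a compact block, and dropping intermediate tool results) can be sketched as a pure function over the message history. The `Message` enum, the `subtask` tag on tool results, and the `compact` function are assumptions made for this illustration; neither the project nor the commenters specify a data model, and the summary text is assumed to be supplied by the caller (for example, produced by the model itself).

```swift
// Hypothetical message model; tool results are tagged with a subtask id.
enum Message: Equatable {
    case user(String)
    case assistant(String)
    case toolResult(tool: String, subtask: Int, output: String)
    case summary(subtask: Int, text: String)
}

/// Drops every tool result that belongs to a completed subtask and puts a
/// single summary block where the first dropped result used to be.
/// All other messages keep their original order.
func compact(_ history: [Message], completedSubtask: Int, summary: String) -> [Message] {
    var result: [Message] = []
    var summaryInserted = false
    for message in history {
        if case .toolResult(_, let subtask, _) = message, subtask == completedSubtask {
            if !summaryInserted {
                result.append(.summary(subtask: completedSubtask, text: summary))
                summaryInserted = true
            }
            continue // the raw tool output is no longer needed in context
        }
        result.append(message)
    }
    return result
}
```

Keeping compaction as a pure function makes it easy to test in isolation and to call at explicit points in the loop, such as right after a subtask is marked complete.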
How to Apply
- If you want to build your own coding agent like Claude Code, follow this repo's 9-stage series from the beginning and focus on implementing just one mechanism per stage (loop → tool dispatch → context management). This lets you isolate the cause of failures much faster than trying to implement everything at once.
- If you're experiencing quality degradation in long sessions, instead of keeping the entire tool call history in context, try replacing completed subtasks with summary blocks and immediately dropping file-read results once the information has been extracted. According to commenter experience, this significantly reduces quality issues that arise before hitting hard context limits.
- When designing a new coding agent, consider improving the quality of existing tools before adding more. As this project's hypothesis suggests, a single accurate search tool and a single accurate file editing tool can outperform 20 sloppy tools, something you can verify directly through experimentation.
- If you're implementing an agent in Swift, check out the 'Operator' library (github.com/bensyverson/Operator) mentioned in the comments first. It can reduce agent loop boilerplate code.
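As a starting point for the first stage (the loop on its own), here is a minimal sketch with an explicit 'done' state and a turn limit. All names (`ModelReply`, `Model`, `runAgent`, `LoopLimitReached`) are hypothetical and not taken from the repo or from any SDK. The loop is kept synchronous for brevity; a real client talking to an LLM API would most likely use async/await.

```swift
// Hypothetical reply shape: the model either requests a tool or declares done.
enum ModelReply {
    case toolCall(name: String, arguments: String)
    case done(String)
}

// Hypothetical abstraction over the LLM call, so the loop can be tested
// with a scripted stand-in instead of a network client.
protocol Model {
    func next(transcript: [String]) throws -> ModelReply
}

// Thrown when the model never reaches the done state within the turn budget.
struct LoopLimitReached: Error {}

/// Thin loop: ask the model, run the tool it picked, append the result, repeat.
/// The model decides what happens next; the loop only enforces the turn limit.
func runAgent<M: Model>(
    model: M,
    task: String,
    maxTurns: Int,
    runTool: (String, String) throws -> String
) throws -> String {
    var transcript = ["user: \(task)"]
    for _ in 0..<maxTurns {
        switch try model.next(transcript: transcript) {
        case .toolCall(let name, let arguments):
            let output = try runTool(name, arguments)
            transcript.append("tool \(name): \(output)")
        case .done(let answer):
            return answer
        }
    }
    throw LoopLimitReached()
}
```

With the loop isolated like this, tool dispatch and context management can be added in later stages by changing only the `runTool` closure and the way `transcript` is maintained.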