Launch HN: Freestyle – Sandboxes for Coding Agents
TL;DR Highlight
Sandbox infrastructure built so that AI coding agents can run tens of thousands of VMs concurrently. Core features are VM startup in under 700 ms, live forking (cloning) of running VMs, and Pause/Resume.
Who Should Read
Engineers building services in which AI generates and executes code directly (in the style of Devin, Cursor Agent, Lovable, and Bolt), and backend developers running AI code review bots or test automation in CI/CD.
Core Mechanics
- Freestyle is a VM Sandbox service dedicated to AI coding agents. A VM is ready in under 700 ms, measured from the API request.
- The most differentiating feature is 'Live Forking,' which allows you to clone a running VM entirely without stopping it. For example, you can fork a single VM into three and assign 'API endpoint implementation,' 'frontend UI implementation,' and 'test suite writing' to the AI in parallel.
- Pause & Resume lets you pause a VM so that it costs nothing while idle, then resume it exactly where it left off when the next execution request arrives. In the code example, setting `idleTimeoutSeconds: 60` makes the VM pause automatically after 60 seconds of inactivity.
- Unlike competing services that only fork the filesystem, Freestyle explicitly states that it forks the entire VM memory (RAM state). This enables agents to explore multiple directions from an intermediate execution state (branch-and-explore).
- It also includes built-in Git repository management, allowing agents to store generated code in Freestyle's own Git repo, synchronize bidirectionally with GitHub, and configure Webhooks with fine-grained control by branch, path, and event type.
- In a code review bot use case, `bun run lint` and `bun test` are executed in the VM, after which the AI reviews the diff and automatically posts 'REQUEST_CHANGES' or 'APPROVE' to the GitHub PR depending on test failures.
- The infrastructure runs on Freestyle's own bare-metal servers to reduce cloud virtualization overhead, and it supports low-level network features such as eBPF and XDP. Freestyle also states that the Sandbox is isolated outside the main VPC for security.
- The JS Sandbox API was already available, and this launch adds a full VM-based Sandbox. It natively supports Node.js/Bun runtimes and automatic execution of development servers (`bun run dev`).
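The summary above does not say how Freestyle implements full-memory forking. As a rough illustration of why such a fork can be cheap, the toy model below shares memory pages between a parent and its fork and copies a page only when one side writes to it (copy-on-write). `CowMemory`, its page layout, and its method names are all hypothetical; a real hypervisor would do this at the page-table level, not in application code.

```typescript
// Toy copy-on-write memory: pages are shared between forks until one side writes.
// This models the general technique only; it is NOT Freestyle's implementation.
type Page = Uint8Array;

class CowMemory {
  // Page table: entries may point at buffers shared with other forks.
  private pages: Page[];
  // Page numbers this instance owns exclusively and may mutate in place.
  private owned: Set<number>;

  constructor(pages: Page[], owned: Set<number> = new Set<number>()) {
    this.pages = pages;
    this.owned = owned;
  }

  static create(pageCount: number, pageSize: number): CowMemory {
    const pages = Array.from({ length: pageCount }, () => new Uint8Array(pageSize));
    return new CowMemory(pages, new Set<number>(pages.map((_, i) => i)));
  }

  private pageAt(page: number): Page {
    const found = this.pages[page];
    if (found === undefined) throw new RangeError(`page ${page} out of range`);
    return found;
  }

  // Forking copies only the page table (references), never the page contents.
  fork(): CowMemory {
    // After a fork every page is shared, so neither side owns any page exclusively.
    this.owned = new Set<number>();
    return new CowMemory([...this.pages]);
  }

  read(page: number, offset: number): number {
    return this.pageAt(page)[offset] ?? 0;
  }

  write(page: number, offset: number, value: number): void {
    let target = this.pageAt(page);
    if (!this.owned.has(page)) {
      // First write to a shared page: make a private copy before mutating.
      target = target.slice();
      this.pages[page] = target;
      this.owned.add(page);
    }
    target[offset] = value;
  }
}
```

For example, after `const child = parent.fork()`, a `child.write(0, 0, 9)` leaves `parent.read(0, 0)` unchanged. In this toy the fork cost grows with the number of pages (the table is copied) but not with the bytes they hold.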
Evidence
- Memory forking drew the most interest. One commenter noted that forking the entire VM memory at runtime differs from competitors that copy only the filesystem, and hoped that a Copy-on-Write implementation would make the fork O(1), so that cost would not grow with machine size.
- A team operating thousands of Sandboxes on standard VMs across Azure, GCP, and AWS said it was unclear what Freestyle offers beyond standard VMs. A key question was whether agent code must be modified to be fork-aware, or whether forking is transparent to the agent.
- Several comments requested comparisons with competitors. E2B, Daytona, Modal, Blaxel, Vercel, Cloudflare, and Fly Sprites were mentioned, and there were many requests for a price and performance comparison matrix.
- There was criticism that the 50 concurrent VM limit is low. A team that built a similar service in-house shared that maintaining a warm pool of Firecracker VMs allows for immediate Sandbox provisioning without boot time.
- The lack of Windows support was mentioned. The commenter noted that Sandbox platforms, including Freestyle, are currently Linux-only, which leaves a gap for automating enterprise software workflows (ERP, etc.) that require Windows.
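The warm-pool approach mentioned in the comments can be sketched as follows. This is a minimal, synchronous illustration, not the commenter's code: `WarmPool` and the `boot` factory are hypothetical names, and a production version would boot VMs asynchronously and refill in the background.

```typescript
// Generic warm pool: keeps pre-booted resources ready so acquire() can skip boot time.
// `boot` is a hypothetical factory supplied by the caller (e.g. one that starts a microVM).
class WarmPool<T> {
  private ready: T[] = [];
  private boot: () => T;
  private targetSize: number;

  constructor(boot: () => T, targetSize: number) {
    this.boot = boot;
    this.targetSize = targetSize;
    this.refill();
  }

  // Boots resources until the pool is back at its target size.
  refill(): void {
    while (this.ready.length < this.targetSize) {
      this.ready.push(this.boot());
    }
  }

  // Hands out a pre-booted resource when one is available, otherwise boots on demand.
  acquire(): { resource: T; warm: boolean } {
    if (this.ready.length > 0) {
      return { resource: this.ready.pop() as T, warm: true };
    }
    return { resource: this.boot(), warm: false };
  }

  // Number of pre-booted resources currently waiting in the pool.
  get available(): number {
    return this.ready.length;
  }
}
```

The trade-off is that idle pooled VMs still consume capacity, so the target size has to be tuned against expected request bursts.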
How to Apply
- When building a service that generates apps automatically with AI (in the style of Lovable, Bolt, and v0), create a template repo with `freestyle.git.repos.create()` and set up `VmDevServer` so that the development server starts as soon as the AI generates code. The whole environment can be configured through API calls.
- When you want to split a single task into parallel subtasks (in the style of Devin and Cursor Agent), clone the running VM with `vm.fork({ count: 3 })` and hand each fork a different subtask concurrently with `Promise.all`. This can significantly reduce overall task time.
- When adding an AI code review bot to a GitHub PR, run `vm.exec('bun run lint')` and `vm.exec('bun test')` in the VM, generate an AI review from the results, and post 'REQUEST_CHANGES' if the tests fail or 'APPROVE' otherwise. The result is an automated review pipeline that can be integrated into CI.
- When operating an AI coding assistant that converses with users, set `persistence: { type: 'persistent' }` and `idleTimeoutSeconds: 60`. The VM then costs nothing during idle periods between messages and resumes in its previous state when the next message arrives, which keeps session state while minimizing cost.
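The approve-or-request-changes decision in the review bot can be isolated into a small pure function. The sketch below decides from exit codes and not by searching the output text, which is a design choice of this sketch and not something the source describes. The `CommandResult` shape, including its `exitCode` field, is an assumption: the article's example only shows `stdout` on the result of `vm.exec`. Treating a lint failure as blocking is also an assumption.

```typescript
// Hypothetical shape of a finished command. `exitCode` is an assumption:
// the article's example only shows `stdout` on the value returned by `vm.exec`.
interface CommandResult {
  name: string;
  exitCode: number;
  stdout: string;
}

type ReviewEvent = "APPROVE" | "REQUEST_CHANGES";

interface ReviewDecision {
  event: ReviewEvent;
  summary: string;
}

// Requests changes when any check exited non-zero, otherwise approves.
function decideReview(results: CommandResult[]): ReviewDecision {
  const failed = results.filter((result) => result.exitCode !== 0);
  if (failed.length === 0) {
    return { event: "APPROVE", summary: "All checks passed." };
  }
  const names = failed.map((result) => result.name).join(", ");
  return { event: "REQUEST_CHANGES", summary: `Failing checks: ${names}` };
}
```

The returned `event` uses the same two values the article's example passes to the GitHub review call, and `summary` could be prepended to the AI-written review body.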
Code Example
// Parallel agent forking example (Devin, Cursor Agent style)
// `ai` stands for your own agent runner; it is not defined in this snippet.
import { freestyle } from "freestyle-sandboxes";
import { VmBun } from "@freestyle-sh/with-bun";

const { vm } = await freestyle.vms.create({
  git: {
    repos: [
      { repo: "https://github.com/user/repo.git" },
    ],
  },
});

// Clone the running VM into 3 copies
const { forks } = await vm.fork({ count: 3 });

// Assign different tasks to each fork in parallel
await Promise.all([
  ai(forks[0], "Build the API endpoints"),
  ai(forks[1], "Build the frontend UI"),
  ai(forks[2], "Write the test suite"),
]);
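
// AI code review bot example
// `github` is assumed to be an Octokit-style REST client; `owner`, `repo`, and
// `pull_number` are placeholders that identify the pull request under review.
const { stdout: lint } = await vm.exec("bun run lint");
const { stdout: test } = await vm.exec("bun test");
const review = await ai(vm, "Review the diff for bugs");
await github.pulls.createReview({
  owner,
  repo,
  pull_number,
  body: review,
  // Heuristic: request changes when the test output contains "FAIL"
  event: test.includes("FAIL") ? "REQUEST_CHANGES" : "APPROVE",
});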

// Persistent VM + automatic Pause example
// `getNextMessage` and `respond` stand for your own chat transport.
const { vm: persistentVm } = await freestyle.vms.create({
  persistence: { type: "persistent" },
  idleTimeoutSeconds: 60, // Automatically pauses after 60 seconds of inactivity, cost 0
});

while (true) {
  const userMessage = await getNextMessage();
  const result = await ai(persistentVm, userMessage);
  await respond(result);
}
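The idle-timeout behavior used in the persistent VM example can be modeled as a two-state machine. This is only a sketch of the described behavior (pause after a period of inactivity, resume on the next request), with time passed in explicitly so it can be tested without real timers. `IdlePauser` and its methods are hypothetical names, not part of the Freestyle SDK.

```typescript
type VmState = "running" | "paused";

// Models "pause after N idle seconds, resume on the next request".
// Timestamps are passed in as milliseconds so no real clock or timer is needed.
class IdlePauser {
  private state: VmState = "running";
  private lastActivityMs: number;
  private idleTimeoutMs: number;

  constructor(idleTimeoutSeconds: number, nowMs: number) {
    this.idleTimeoutMs = idleTimeoutSeconds * 1000;
    this.lastActivityMs = nowMs;
  }

  // Called periodically; pauses once the VM has been idle for the full timeout.
  tick(nowMs: number): VmState {
    if (this.state === "running" && nowMs - this.lastActivityMs >= this.idleTimeoutMs) {
      this.state = "paused";
    }
    return this.state;
  }

  // Called when a request arrives; resumes a paused VM and resets the idle clock.
  onRequest(nowMs: number): VmState {
    this.state = "running";
    this.lastActivityMs = nowMs;
    return this.state;
  }
}
```

With a 60-second timeout, a VM whose last activity was at time 0 is still running at 59 seconds and paused at 60 seconds; a request at 61 seconds resumes it and restarts the idle window.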
Related Papers
Formal Verification Gates for AI Coding Loops
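Argues that structural constraints such as a type system are far more effective than prompt instructions for making AI-generated code uphold security invariants, and shows how to implement them.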
Learnings from 100K lines of Rust with AI (2025)
An account of reimplementing Azure RSL (a distributed consensus library) in Rust, using AI coding agents to write 100K lines in four weeks. It shares a practical workflow that combines Code Contracts and Spec-Driven Development with AI.
A Methodology for Selecting and Composing Runtime Architecture Patterns for Production LLM Agents
A practical guide that names the ways LLM agents fail and lays out, in five steps, which architecture pattern to use and when.
Show HN: Forge – Guardrails take an 8B model from 53% to 99% on agentic tasks
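Release of Forge, a Python framework that wraps a small local LLM (8B) in guardrails (structural safety nets) and raises the success rate on multi-step agent tasks from 53% to 99%. It has drawn attention because it hardens the execution environment without touching the model itself.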
Mini Shai-Hulud Strikes Again: 314 npm Packages Compromised
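On May 19, 2026, a single npm account was hijacked and 637 malicious versions were published within 22 minutes. Packages with millions of monthly downloads, including echarts-for-react and size-sensor, were infected in a sophisticated supply-chain attack that steals AWS credentials and SSH keys and even hijacks AI coding agents.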
Code as Agent Harness
A 102-page survey that reframes code in LLM agents not as mere output but as the execution infrastructure for reasoning, action, and environment modeling.