Show HN: Browser Harness – Gives LLM freedom to complete any browser task
TL;DR Highlight
Browser Harness builds self-healing browser automation by letting LLMs write missing functions directly into a Python script, enabling control of a real browser with a single prompt to Claude Code or Codex.
Who Should Read
Developers aiming to implement browser automation (scraping, RPA, repetitive web tasks) using LLMs, or those exploring agent-based approaches as an alternative to existing Playwright/Puppeteer frameworks.
Core Mechanics
- Browser Harness connects directly to Chrome’s CDP (Chrome DevTools Protocol) via a single websocket, bypassing intermediate frameworks like Playwright for a streamlined architecture (approximately 592 lines of Python code).
- The ‘Self-healing’ concept works like this: if an LLM needs a function (e.g., upload_file()) that’s missing in helpers.py, it writes and adds that function to helpers.py and continues the task—effectively creating tools on demand.
- Setup is simple: pasting the prompt from the README into Claude Code or Codex causes the agent to install the harness, read SKILL.md and helpers.py, and begin controlling a real browser.
- Domain-specific tasks are contained in the domain-skills/ directory, while common browser interaction functions are organized in interaction-skills/, providing a reference for the agent.
- A free remote browser feature is available: obtaining an API key from cloud.browser-use.com grants access to three concurrent browsers, including proxy and captcha pool support for stealth automation or sub-agent deployment.
- The agent can even handle API key acquisition: docs.browser-use.com/llms.txt contains a setup flow and challenge context for LLMs, allowing agents to complete the registration process autonomously.
- The project has drawn significant community interest, with 6.4k GitHub stars and 567 forks at the time of writing, and is actively developed, with 17 open issues and 68 open PRs.
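The single-websocket CDP connection described above exchanges JSON messages, where each command carries an `id` that the matching reply echoes back. The sketch below illustrates only that framing. The `CDPSession` class and its method names are hypothetical and are not taken from the Browser Harness source, and the websocket transport is left out so the snippet runs without a browser. `Page.navigate` is a real CDP method; the reply payload is invented for illustration.

```python
import itertools
import json


class CDPSession:
    """Builds CDP command frames and matches replies by id (no network I/O)."""

    def __init__(self):
        self._ids = itertools.count(1)  # CDP commands need a unique integer id
        self.pending = {}               # id -> method name awaiting a reply

    def command(self, method, **params):
        """Return the JSON text that would be sent over the websocket."""
        msg_id = next(self._ids)
        self.pending[msg_id] = method
        return json.dumps({"id": msg_id, "method": method, "params": params})

    def handle(self, raw):
        """Classify an incoming frame as a command result or a browser event."""
        msg = json.loads(raw)
        if "id" not in msg:
            # Events carry a method name but no id.
            return ("event", msg.get("method"), msg.get("params", {}))
        method = self.pending.pop(msg["id"])
        if "error" in msg:
            raise RuntimeError(f"{method} failed: {msg['error'].get('message')}")
        return ("result", method, msg.get("result", {}))


session = CDPSession()
frame = session.command("Page.navigate", url="https://example.com")

# Illustrative reply; a real browser would send this back over the websocket.
reply = json.dumps({"id": 1, "result": {"frameId": "F1"}})
kind, method, result = session.handle(reply)
```

In a real harness, `frame` would be written to the websocket and `handle` would be fed each incoming message; keeping the framing separate from the transport makes it easy to test without Chrome running.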
Evidence
- Community feedback flagged the README’s ‘curl URL | sh’ setup prompt as a risky practice akin to blindly executing commands. Concerns arose about the structural risk of an agent following instructions from an untrusted repository.
- A security researcher reported a remote code execution (RCE) vulnerability (GHSA-r2x7-6hq9-qp7v) to the browser-use project approximately 40 days ago but received no response, raising concerns about the security response process.
- Discussions questioned whether this project represents a truly new paradigm, with some arguing it is simply another form of ‘agentic coding’ that shares the same ‘harness + LLM + tool’ structure as JSON schema tools, MCP, or HTTP APIs.
- Questions surfaced about how this project differs from Sawyer Hood’s dev-browser (github.com/SawyerHood/dev-browser); without a comparison table, it is hard to tell which is better in specific cases.
- Concerns were raised about the inherent vulnerability to prompt injection attacks, since the LLM writes code in real time. One scenario presented was a malicious webpage instructing the agent to transfer funds.
How to Apply
- If you’re using Claude Code or Codex and want to automate repetitive web tasks (form submissions, data collection, clicking buttons after login), paste the setup prompt from the README into the agent to configure an automation agent connected to a real Chrome browser.
- If you’re maintaining Playwright scripts and facing high costs from frequent code changes after DOM updates, Browser Harness’s self-healing approach may reduce maintenance overhead, since the LLM adds or modifies functions as needed.
- For tasks that require scraping multiple sites at once, or proxy and captcha handling, a free API key from cloud.browser-use.com lets you run three remote browsers concurrently, reducing the load on local resources.
- When applying this tool to internal automation, put security first: webpages the LLM visits may contain prompt injection attacks, so be cautious when using it with financial or personal information, and start with sandboxed environments or read-only accounts.
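One way to act on the prompt injection caution above is to restrict which hosts the agent may navigate to. The sketch below is an illustration only: `ALLOWED_HOSTS` and `is_allowed` are hypothetical names, not part of Browser Harness, and a host allowlist is one possible guard, not a complete defense.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; adjust to the sites the automation actually needs.
ALLOWED_HOSTS = {"github.com", "browser-use.com"}


def is_allowed(url, allowed=ALLOWED_HOSTS):
    """Return True only for https URLs on an allowed host or its subdomains."""
    parts = urlsplit(url)
    if parts.scheme != "https" or parts.hostname is None:
        return False
    host = parts.hostname.lower()
    # Match the exact host or a true subdomain; "github.com.evil.example"
    # ends with neither "github.com" nor ".github.com", so it is rejected.
    return any(host == d or host.endswith("." + d) for d in allowed)


checks = {
    url: is_allowed(url)
    for url in (
        "https://github.com/browser-use/browser-harness",
        "https://docs.browser-use.com/llms.txt",
        "https://github.com.evil.example/login",
        "http://github.com/",
    )
}
```

A wrapper around whatever navigation helper the harness exposes could call `is_allowed` first and refuse anything else, which limits how far an injected instruction can redirect the browser.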
Code Example
# Prompt to paste into Claude Code or Codex
Set up https://github.com/browser-use/browser-harness for me.
Read `install.md` first to install and connect this repo to my real browser.
Then read `SKILL.md` for normal usage.
Always read `helpers.py` because that is where the functions are.
When you open a setup or verification tab, activate it so I can see the active browser tab.
After it is installed, open this repository in my browser and,
if I am logged in to GitHub, ask me whether you should star it for me as a quick demo
that the interaction works — only click the star if I say yes.
If I am not logged in, just go to browser-use.com.
# Self-healing example (from README)
● agent: wants to upload a file
│
● helpers.py → upload_file() missing
│
● agent edits the harness and writes it
│   helpers.py 192 → 199 lines
│   + upload_file()
✓ file uploaded
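The self-healing step in the diagram above (notice a helper is missing, then write it into helpers.py) can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: `defined_functions` and `ensure_helper` are hypothetical names, the `upload_file` body is a placeholder instead of real browser code, and a temporary file stands in for the real helpers.py.

```python
import ast
import tempfile
from pathlib import Path


def defined_functions(path):
    """Return the names of top-level functions defined in a Python file."""
    tree = ast.parse(path.read_text())
    return {
        node.name
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


def ensure_helper(path, name, source):
    """Append `source` to the file if `name` is not defined yet.

    Returns True when the helper was added, False when it already existed.
    """
    if name in defined_functions(path):
        return False
    ast.parse(source)  # raises SyntaxError before invalid code is written
    with path.open("a") as fh:
        fh.write("\n\n" + source.strip() + "\n")
    return True


# Placeholder body: in the real flow the LLM would generate working code.
NEW_HELPER = '''
def upload_file(selector, file_path):
    """Hypothetical helper written on demand by the agent."""
    raise NotImplementedError
'''

with tempfile.TemporaryDirectory() as tmp:
    helpers = Path(tmp) / "helpers.py"
    helpers.write_text("def click(x, y):\n    pass\n")

    added_first = ensure_helper(helpers, "upload_file", NEW_HELPER)
    added_again = ensure_helper(helpers, "upload_file", NEW_HELPER)
    names_after = defined_functions(helpers)
```

Parsing the generated source before appending it keeps a syntactically broken helper from corrupting the file, though it does nothing to verify that the code is safe or correct.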