Specsmaxxing – On overcoming AI psychosis, and why I write specs in YAML

TL;DR Highlight

Structuring acceptance criteria in YAML with the acai.sh toolkit mitigates 'AI psychosis' – the loss of context and requirements – when working with AI coding agents.

Who Should Read

Developers using AI coding agents (Cursor, Claude, etc.) in production who struggle with agents forgetting requirements or generating incorrect code due to session resets or context window limits. Particularly useful for solo developers or small teams seeking to bridge the gap between specification management and AI code quality.

Core Mechanics

Working with AI agents frequently results in lost requirements and 'derailment' due to context window limitations, session terminations, or machine switching. The author terms this 'AI psychosis'.
While markdown documents like README.md are helpful, the author experimented with managing structured acceptance criteria in YAML.
A key insight is that 'specs must exist somewhere'. If they are not written down, they live in developers' heads or in conversations, yet teams and businesses still judge the delivered work against them. Writing them down immediately is therefore the cheaper option.
During experimentation, a sub-agent spontaneously began adding requirement numbers (AUTH-1, AUTH-2, AUTH-3, etc.) to code comments, inspiring a systematic approach to linking YAML-based specs with code.
The author created the open-source toolkit acai.sh, with a workflow consisting of four stages: 'Specify → Ship → Review → Iterate'. The feature.yaml file lists acceptance criteria, which the agent references to generate code.
The author identifies 'building an AI harness to build a product' as a form of 'AI psychosis'. Recognizing this trap, they abandoned complex multi-agent architectures in favor of a simpler, feature.yaml-centric approach.
GitHub Spec Kit, OpenSpec, Kiro, and Traycer.ai were considered as points of comparison, and their differences were outlined. acai.sh's differentiator is that it tracks requirements by structured acceptance-criteria IDs and links those IDs to code comments.
The future roadmap is 'Specsmaxxing → Testmaxxing → Reactive Software Factory', aiming for automatic conversion of spec diffs into code diffs.
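
The link between spec IDs and code comments described above can be checked mechanically. The following is a minimal sketch, not acai.sh's actual implementation: it assumes feature.yaml has already been parsed into a plain object (the `authSpec` shape and the helper names `findReferencedIds` and `coverageReport` are hypothetical), and it treats anything after `//` on a line as a comment, which is a simplification.

```javascript
// Hypothetical stand-in for a parsed feature.yaml (shape is an assumption).
const authSpec = {
  feature: "authentication",
  requirements: [{ id: "AUTH-1" }, { id: "AUTH-2" }, { id: "AUTH-3" }],
};

// Sample source text in which only AUTH-1 and AUTH-2 are tagged.
const sampleSource = `
const authHeader = req.headers["authorization"]; // AUTH-1
const isAuthorized = verifyBearerToken(authHeader); // AUTH-2
`;

// Collect IDs shaped like PREFIX-NUMBER that appear after a `//` marker.
// Simplification: a `//` inside a string literal is also treated as a comment.
function findReferencedIds(code) {
  const ids = new Set();
  for (const line of code.split("\n")) {
    const commentStart = line.indexOf("//");
    if (commentStart === -1) continue;
    const comment = line.slice(commentStart);
    for (const match of comment.matchAll(/\b[A-Z]+-\d+\b/g)) {
      ids.add(match[0]);
    }
  }
  return ids;
}

// Split the spec's requirement IDs into those referenced in code and those not.
function coverageReport(spec, code) {
  const referenced = findReferencedIds(code);
  const covered = [];
  const missing = [];
  for (const { id } of spec.requirements) {
    if (referenced.has(id)) {
      covered.push(id);
    } else {
      missing.push(id);
    }
  }
  return { covered, missing };
}

const coverage = coverageReport(authSpec, sampleSource);
console.log(coverage);
```

With the sample input, AUTH-1 and AUTH-2 are reported as covered and AUTH-3 as missing, which is the kind of gap a reviewer would want surfaced.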

Evidence

"The author directly summarized the core concept in a comment: 'Specs must live somewhere. They live in your head or in conversation, and teams and businesses always judge based on specs. So just write them down. feature.yaml is just a list of acceptance criteria.'"

How to Apply

If you experience requirements loss when AI agent sessions disconnect or contexts reset, maintaining feature.yaml files with numbered acceptance criteria (AUTH-1, AUTH-2, etc.) allows the agent to consistently reference requirements across sessions.
To track which requirements an agent implemented in generated code, explicitly instruct the agent in your prompt to include requirement IDs (e.g., AUTH-1) in code comments, maintaining a link between code and specs. In the author's experiment, a sub-agent started doing this on its own.
When tempted to build complex multi-agent pipelines, heed the author's lesson: first ask yourself whether you are 'building an AI harness to build a product', and consider starting with a simple, structured file like feature.yaml.
If your team requires AI code review or handoff, install the acai.sh open-source toolkit (https://acai.sh) and integrate the Specify → Ship → Review → Iterate workflow into your team's processes.
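
One way to apply the first two points is to generate a short preamble from the spec and paste it at the start of every new agent session. This is a sketch under stated assumptions: the spec is represented as an already-parsed object, `buildSpecPreamble` is a hypothetical helper (not part of acai.sh), and the instruction wording is only an example.

```javascript
// Hypothetical stand-in for a parsed feature.yaml (shape is an assumption).
const sessionSpec = {
  feature: "authentication",
  requirements: [
    { id: "AUTH-1", description: "Accepts `Authorization: Bearer <token>` header" },
    { id: "AUTH-2", description: "Tokens are user-scoped" },
    { id: "AUTH-3", description: "Rejects with 401 Unauthorized" },
  ],
};

// Render the acceptance criteria plus a tagging instruction as plain text.
function buildSpecPreamble(spec) {
  const lines = [
    `Feature: ${spec.feature}`,
    "Acceptance criteria:",
    ...spec.requirements.map((r) => `- ${r.id}: ${r.description}`),
    "",
    "When you implement a criterion, add its ID in a code comment on the relevant line.",
  ];
  return lines.join("\n");
}

const preamble = buildSpecPreamble(sessionSpec);
console.log(preamble);
```

Because the preamble is derived from the file rather than from chat history, it stays the same after a session reset or a switch to another machine.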

Code Example

# feature.yaml example (acceptance criteria list)
feature: authentication
requirements:
  - id: AUTH-1
    # Quoted because an unquoted ": " inside the value is invalid YAML
    description: "Accepts `Authorization: Bearer <token>` header"
  - id: AUTH-2
    description: Tokens are user-scoped, providing access to any of the user's resources
  - id: AUTH-3
    description: Rejects with 401 Unauthorized
    depends_on: AUTH-1

// Example of requirement IDs linked to code comments
const authHeader = req.headers["authorization"]; // AUTH-1
const isAuthorized = verifyBearerToken(authHeader); // AUTH-2
if (!isAuthorized) return res.status(401).json({ error: "Unauthorized" }); // AUTH-3
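
Since the example spec uses a depends_on field, a small consistency check can catch IDs that are duplicated or that point at a requirement that does not exist. The sketch below is illustrative only: it assumes the YAML has been parsed into an object and that depends_on holds a single ID string as in the example above; `validateSpec` is a hypothetical helper, not an acai.sh API.

```javascript
// Return a list of human-readable problems found in a parsed spec object.
function validateSpec(spec) {
  const errors = [];
  const seen = new Set();

  // First pass: every requirement ID should be unique.
  for (const req of spec.requirements) {
    if (seen.has(req.id)) errors.push(`duplicate id: ${req.id}`);
    seen.add(req.id);
  }

  // Second pass: every depends_on should name an ID defined in the same file.
  for (const req of spec.requirements) {
    if (req.depends_on !== undefined && !seen.has(req.depends_on)) {
      errors.push(`${req.id} depends on unknown id: ${req.depends_on}`);
    }
  }
  return errors;
}

// Mirrors the example spec: AUTH-3 depends on AUTH-1, which exists.
const validSpec = {
  feature: "authentication",
  requirements: [
    { id: "AUTH-1" },
    { id: "AUTH-2" },
    { id: "AUTH-3", depends_on: "AUTH-1" },
  ],
};

// Deliberately broken variant: AUTH-9 is not defined anywhere.
const brokenSpec = {
  feature: "authentication",
  requirements: [{ id: "AUTH-1" }, { id: "AUTH-3", depends_on: "AUTH-9" }],
};

const validErrors = validateSpec(validSpec);
const brokenErrors = validateSpec(brokenSpec);
console.log(validErrors, brokenErrors);
```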

Terminology

Acceptance Criteria: A list of conditions that must be met for a software feature to be considered 'complete'. Written in a testable format, such as 'Pressing the login button issues a token'.

Context Window: The maximum amount of text an LLM can remember and process at once. Exceeding this limit causes the LLM to forget earlier content.

Vibe Coding: An improvisational coding style where code is created through conversation with AI without clear specifications. It's fast but prone to inconsistency.

BDD (Behavior-Driven Development): A development methodology that describes software behavior in human-readable language and connects it to automated tests. Cucumber is a popular tool.

V-model: A software development process that symmetrically connects each development stage to a corresponding verification stage, from requirements definition to testing.

AST (Abstract Syntax Tree): A tree-like representation of code or markup that makes it easier for programs to process. Parsing YAML into a tree lets tooling work with the spec's structure directly instead of treating it as raw text.