Open Source, AI-Powered

A scheduling engine for
multi-agent pipelines

Angy is a fleet manager, scheduler and IDE for Claude Code agents. It spawns each specialist as a separate claude CLI process, orchestrates them through a multi-phase pipeline, and ships verified code — with dependency graphs, adversarial counterparts, and git worktree isolation.

macOS note: After downloading, macOS may block the app because it is not notarized. Run this in Terminal to fix it:

xattr -cr ~/Downloads/angy_*.dmg
Angy overview screenshot

Designed around orchestration, not conversation

Most AI coding tools give you one agent and one chat. Angy wraps the Claude Code CLI — each agent is a separate claude process with streaming JSON I/O, managed by a coded state machine that dispatches multi-phase pipelines with adversarial verification at every stage.

Pipeline

A coded state machine that drives Plan → Incremental Build → Integration Review → Final Testing. No LLM decides what happens next — the pipeline is deterministic.

Adversarial Counterparts

A persistent counterpart agent independently verifies plans and reviews code. It challenges the architect, audits the builder, and blocks approval until every claim checks out.

Epic Dependencies

Chain epics with dependsOn for prerequisite gates, or runAfter for sequential continuation with branch inheritance. The scheduler respects the full dependency graph.

Git Worktrees

Each epic runs in an isolated git worktree, bypassing repo locks so multiple epics can build on the same repository in parallel without branch conflicts.


Why we built Angy

Angy was born from a simple frustration: existing AI coding tools could generate impressive-looking code, but the results rarely worked end-to-end without significant manual intervention. We wanted a tool that could build an entire product autonomously — from specification to a running application — with the most accurate results possible.

Single-pass code generation breaks down when subsystems need to talk to each other. A beautifully architected state machine that is never connected to the HTTP layer, a schema mismatch between frontend and backend, a Docker config that references the wrong hostname — these are the bugs that only surface when the full stack runs together. Angy's multi-agent pipeline with adversarial verification and iterative build-test cycles catches these integration failures before delivery.

See the benchmarks

We tested Angy against Cursor and Claude Code on a deliberately hard full-stack specification — 12 database tables, a 7-state machine, WebSocket pipelines, and 25+ API endpoints. Angy v2 was the only implementation to pass the docker compose up benchmark without a single manual code fix, with ~9 of 10 pages functional after seeding.

View Benchmarks on GitHub

The Pipeline

The core of Angy is the PipelineRunner — a TypeScript state machine that drives multi-agent builds incrementally, with verification gates at every step. Instead of building everything at once and testing at the end, it splits work into sequential increments and verifies each one before starting the next.

Phase 1
Plan
Architect designs the solution. Counterpart adversarially verifies the plan. Challenge loop until approved.
Phase 2
Incremental Build
Splitter breaks the plan into 4–8 sequential todos. Each todo: Builder implements → Tester verifies → fix loop if needed.
Phase 3
Finalize
Counterpart code review + integration test. Failures generate fix-todos that feed back into Phase 2. Up to 5 cycles.
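The three-phase flow above can be sketched as a small deterministic transition function. This is an illustrative sketch, not Angy's actual PipelineRunner API — the type names and the pass/fail verdict shape are assumptions; only the phase order and the 5-cycle cap come from the text.

```typescript
// Minimal sketch of a deterministic pipeline state machine.
// Names here (Phase, PipelineState, next) are illustrative, not Angy's real API.
type Phase = "plan" | "build" | "finalize" | "done" | "failed";

interface PipelineState {
  phase: Phase;
  fixCycles: number; // finalize → build fix loops consumed so far
}

const MAX_FIX_CYCLES = 5;

// Pure transition function: code decides the next phase, never an LLM.
// Verdicts come from structured JSON extracted from agent output.
function next(state: PipelineState, verdict: "pass" | "fail"): PipelineState {
  switch (state.phase) {
    case "plan":
      // Counterpart approved the plan → start building;
      // a rejection loops back into another plan round.
      return verdict === "pass" ? { ...state, phase: "build" } : state;
    case "build":
      return verdict === "pass" ? { ...state, phase: "finalize" } : state;
    case "finalize":
      if (verdict === "pass") return { ...state, phase: "done" };
      // Review failures become fix-todos fed back into the build phase,
      // capped at MAX_FIX_CYCLES before the pipeline gives up.
      return state.fixCycles < MAX_FIX_CYCLES
        ? { phase: "build", fixCycles: state.fixCycles + 1 }
        : { ...state, phase: "failed" };
    default:
      return state;
  }
}
```

Because the transition function is pure, every run of the same epic through the same verdicts takes the same path — which is what makes the pipeline deterministic.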

Why incremental?

When you build 50 files at once, a wrong field name in the data model cascades into broken APIs and a broken frontend. With incremental builds, errors are caught before anyone builds on top of them. The blast radius is one increment deep.

No LLM routing

The pipeline is a coded TypeScript state machine — no LLM decides what step comes next. Each agent is a claude CLI process spawned via Tauri's shell plugin with --input-format stream-json / --output-format stream-json. Structured JSON schema extraction drives verdicts, increment plans, and test results.
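As a rough sketch of that wiring: the stream-json flags are the real claude CLI flags named above, but the helper functions and event shape below are hypothetical — the actual subprocess is spawned through Tauri's shell plugin, not shown here.

```typescript
// Illustrative sketch: assembling claude CLI arguments and parsing one
// line of stream-json output. buildAgentArgs/parseEvent are hypothetical
// helpers, not Angy's API; only the two format flags come from the text.
function buildAgentArgs(): string[] {
  return [
    "--input-format", "stream-json",
    "--output-format", "stream-json",
  ];
}

interface StreamEvent {
  type: string; // e.g. assistant message, tool use, final result
  [key: string]: unknown;
}

// With stream-json output, each stdout line is one self-contained JSON
// event; structured fields in these events drive verdicts and test results.
function parseEvent(line: string): StreamEvent | null {
  const trimmed = line.trim();
  if (!trimmed) return null;
  try {
    return JSON.parse(trimmed) as StreamEvent;
  } catch {
    return null; // tolerate partial lines while the stream is mid-flight
  }
}
```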

Complexity-driven depth

Pipeline depth adapts to epic complexity. Trivial epics skip the architect and counterpart entirely. Large and epic-tier work gets 3 architect turns, a persistent counterpart, and an optional design system phase. Medium sits in between.
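One way to picture that adaptation is a per-tier config table. The trivial and large/epic rows follow the text; the medium row and the exact field names are assumptions for illustration.

```typescript
// Sketch of complexity-driven pipeline depth. Tier names follow the text;
// the config shape and the "medium" values are illustrative assumptions.
type Complexity = "trivial" | "medium" | "large" | "epic";

interface DepthConfig {
  architectTurns: number;    // 0 means the architect is skipped entirely
  counterpart: boolean;      // spawn a persistent adversarial reviewer?
  designSystemPhase: boolean; // optional extra phase for big UI work
}

const DEPTH: Record<Complexity, DepthConfig> = {
  trivial: { architectTurns: 0, counterpart: false, designSystemPhase: false },
  medium:  { architectTurns: 1, counterpart: true,  designSystemPhase: false }, // assumed
  large:   { architectTurns: 3, counterpart: true,  designSystemPhase: true },
  epic:    { architectTurns: 3, counterpart: true,  designSystemPhase: true },
};
```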


Adversarial Verification

The counterpart: your built-in skeptic

The counterpart is a persistent agent that acts as an adversarial reviewer throughout the entire pipeline. It independently verifies claims, runs code, and blocks approval until every issue is resolved.

  • Plan review — Challenges the architect's plan against acceptance criteria. Can correct the plan or reject it entirely, forcing a rewrite.
  • Code review — Reads actual files, runs the code, and audits the builder's output. Issues are classified as CRITICAL, MAJOR, or NIT.
  • Fix generation — On REQUEST_CHANGES, the counterpart's issues are turned into fix-todos that feed back into Phase 2 for a fresh builder to resolve.
  • Session persistence — One counterpart session spans from plan review through all code review cycles, preserving full context across the pipeline.
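The review outcomes above can be sketched as a small verdict policy. The CRITICAL/MAJOR/NIT severities and the REQUEST_CHANGES verdict come from the text; the functions and the rule that NITs alone don't block approval are assumptions.

```typescript
// Sketch of counterpart review verdict handling. Severity names and the
// REQUEST_CHANGES verdict follow the text; the policy below is assumed.
type Severity = "CRITICAL" | "MAJOR" | "NIT";

interface ReviewIssue {
  severity: Severity;
  file: string;
  summary: string;
}

type Verdict = "APPROVE" | "REQUEST_CHANGES";

// Assumed policy: any open CRITICAL or MAJOR issue blocks approval;
// NITs alone do not force another fix cycle.
function verdictFor(issues: ReviewIssue[]): Verdict {
  return issues.some((i) => i.severity !== "NIT") ? "REQUEST_CHANGES" : "APPROVE";
}

// On REQUEST_CHANGES, blocking issues become fix-todos that feed back
// into Phase 2 for a fresh builder to resolve.
function toFixTodos(issues: ReviewIssue[]): string[] {
  return issues
    .filter((i) => i.severity !== "NIT")
    .map((i) => `[${i.severity}] ${i.file}: ${i.summary}`);
}
```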
Agent fleet screenshot

Specialist roles in the pipeline

Each agent is a Claude subprocess with a defined role, scoped tool access, and clear deliverables. They collaborate on a shared epic branch without stepping on each other.

Architect

Read-only analysis. Designs the solution, decomposes it into increments, and defines file ownership. Has access to Read, Glob, and Grep.

Builder

Implements code per the plan. Fresh builder per increment — scoped context, no drift. Has Bash, Read, Edit, Write, Glob, Grep.

Counterpart

Adversarial reviewer with full tool access. Independently verifies claims, finds flaws, and can fix bugs directly. Persistent across the pipeline.

Tester

Builds the project, runs tests, performs smoke tests and e2e verification. Validates each increment and runs final integration tests.


Dependency Graph

Chain epics into dependency graphs

Epics aren't isolated units — they form directed graphs of prerequisites and sequential continuations. The scheduler traverses the full dependency graph before dispatching any epic.

  • dependsOn — Multiple prerequisite epic IDs. The epic is blocked until every dependency reaches Done. DFS traversal detects and prevents circular dependencies.
  • runAfter — Single predecessor for sequential chaining. Unblocks when predecessor reaches Review or Done. The successor inherits the predecessor's branch and worktree.
  • Branch inheritance — With runAfter, the successor reuses the same epic branch and worktree directory, building directly on top of the predecessor's code.
  • Cleanup deferral — Worktrees are kept alive when a runAfter successor exists. Cleanup only happens when the last epic in the chain completes.
Kanban board with epic dependencies

Parallel Isolation

Git worktrees for true parallel execution

Instead of checking out branches in the main repository (which locks it for one epic at a time), worktree mode creates isolated working directories so multiple epics can build on the same repo simultaneously.

  • ✦  Each epic gets a dedicated directory at .angy-worktrees/<epic-slug>
  • ✦  Worktree epics bypass repo locks — no contention between parallel pipelines
  • ✦  Checkout-mode epics lock the repo; only one can run per repository at a time
  • ✦  Orphan worktrees are automatically cleaned up on startup via reconcileWorktrees()
  • ✦  Worktree paths are listed in .git/info/exclude so they never show up as untracked files
  • ✦  runAfter successors inherit the worktree path, continuing in the same directory
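The directory convention above can be sketched in a couple of lines. The .angy-worktrees/<epic-slug> path comes from the text; the epic/ branch prefix and both helper functions are assumptions for illustration.

```typescript
// Sketch of worktree path/command derivation. The .angy-worktrees layout
// follows the text; the epic/ branch prefix and helpers are assumptions.
function worktreePath(repoRoot: string, epicSlug: string): string {
  return `${repoRoot}/.angy-worktrees/${epicSlug}`;
}

// git worktree add creates an isolated working directory on its own
// branch, so parallel epics never fight over the main checkout, e.g.:
//   git worktree add -b epic/user-auth .angy-worktrees/user-auth main
function worktreeAddCommand(epicSlug: string, baseBranch: string): string {
  return `git worktree add -b epic/${epicSlug} .angy-worktrees/${epicSlug} ${baseBranch}`;
}
```

A runAfter successor would simply reuse the predecessor's path and branch instead of deriving fresh ones.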
Code view with worktree isolation

Autonomous Engine

Autonomous Scheduler

A background engine that continuously scores, prioritizes, and dispatches epics across projects — ticking every 30 seconds. It enforces concurrency limits, respects dependency graphs, manages repo locks, and tracks API cost budgets.

  • Priority scoring — Configurable weights: priority hint (40%), dependency depth (20%), age (15%), complexity (15%), rejection penalty (10%)
  • Concurrency — Global limit (default 3 parallel epics) plus per-project limits. Worktree epics bypass repo locks
  • Dependencies — DFS traversal of dependsOn graphs and runAfter chains. Circular deps detected and blocked
  • Cost budgets — Daily API cost limits enforced per tick from the cost_log table. Scheduling pauses when exceeded
  • Crash recovery — Resumes from the last pipeline checkpoint via PipelineStateStore
  • Human review gate — Approve to squash-merge, or reject with feedback to trigger the Fix pipeline
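The weighted score above can be written out directly. The five weights come from the text; everything else — the input shape, the 0–1 normalization, and treating the rejection term as a subtraction — is an assumption of this sketch.

```typescript
// Sketch of the scheduler's priority score using the weights from the
// text. Inputs are assumed pre-normalized to 0–1; normalization itself
// is not specified here, and subtracting rejections is an assumption.
interface ScoreInputs {
  priorityHint: number;    // user-assigned priority, 0–1
  dependencyDepth: number; // how much work is blocked behind this epic, 0–1
  age: number;             // time spent waiting in the queue, 0–1
  complexity: number;      // estimated size, 0–1
  rejections: number;      // prior rejection count, normalized 0–1
}

function score(s: ScoreInputs): number {
  return (
    0.40 * s.priorityHint +
    0.20 * s.dependencyDepth +
    0.15 * s.age +
    0.15 * s.complexity -
    0.10 * s.rejections // penalty: repeatedly rejected epics sink
  );
}
```

Every 30-second tick, the scheduler would score all eligible epics, sort descending, and dispatch from the top until the concurrency limits or cost budget are hit.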
Scheduler analytics and cost tracking

Four pipeline types

Not every epic needs a full build. Angy routes each epic through the right pipeline based on its type and history.

⚙️
Create
Full pipeline — architect → incremental build → counterpart review → final test. For new features and greenfield work.
🔧
Fix
Targeted bug fixing with rejection context. Diagnose → builder fix → tester verify → counterpart review. Triggered automatically on rejected epics.
🔍
Investigate
Read-only codebase analysis. A single architect call that produces findings, evidence, conclusions, and open questions. No code changes.
📋
Plan
Read-only architectural planning. Produces analysis, files to modify/create, and implementation steps. No code changes — just a structured plan.

Multi-Project

Kanban board across all your projects

Track every epic as it flows through the pipeline — Idea, Todo, In Progress, Review, Done. Multiple repositories, independent branches, unified dashboard.

  • ✦  Drag epics between stages or let the scheduler drive them
  • ✦  Per-project epic queues and scheduler configuration
  • ✦  Color-coded agent status and live pipeline phase indicators
  • ✦  Filter by project, agent type, priority, or dependency status
  • ✦  Reject with feedback to send epics back through the Fix pipeline
Kanban board screenshot

Live Output

Watch the fleet work in real time

Every agent session has its own chat panel, Monaco code editor, and Xterm.js terminal. Stream agent output as it happens or come back when the Review column fills up.

  • ✦  Real-time token streaming with thinking blocks and tool call visualization
  • ✦  Diff view for every file modified by every agent
  • ✦  Pipeline agents grouped in the sidebar with phase badges
  • ✦  Tool and file graph — visualize which tools and files each agent touched
  • ✦  Full session replay per epic — see every agent decision
  • ✦  File checkpointing — rewind changes to any prior checkpoint
Agent tool and file graph visualization

Start orchestrating today

Angy is open source and free to self-host. Clone the repo, configure your Claude Code CLI key, and run your first epic in minutes.

View on GitHub