New v3.9 · parallel reviewer agents
For engineering leaders

Parallel AI agents
that ship reviewed pull requests.

aiorch decomposes a single task across multiple coding agents, makes them review each other's work in rounds, resolves conflicts, and delivers a single GitHub PR ready for human review. Runs entirely on your infrastructure. Your model keys. Your code. Zero markup.

Self-hosted, no outbound data
BYOK — Claude, OpenAI, Kimi, Codex, Ollama
Install in 5 minutes via Docker
session · feat/payment-refactor 4 agents · round 2/3 · $0.71 spent
01 · Task
"Refactor payment module with retry logic and tests"
Decomposed → 4 subtasks
02 · Parallel agents
payment-core claude-sonnet
retry-backoff gpt-5.5
tests + coverage claude-sonnet
api-docs kimi-k2
03 · Review rounds
reviewer critique · revise
round 1 · 1 revision
round 2 · approved
merge · conflicts resolved
04 · Delivery
Pull Request #847
38/38 tests passed
events.jsonl local
— Model-agnostic · Bring your own key
Claude OpenAI Codex CLI Kimi Ollama · local GitHub
5min
From docker pull to your first orchestrated session.
N
Parallel agents per task — set by your budget, not ours.
0%
Token markup. You pay providers directly at list price.
1
Reviewed pull request per session — ready to merge.
Why aiorch

Most coding tools give you code.
We give you a reviewed PR.

The gap between "AI wrote this" and "a human should merge this" is where senior engineering time disappears. aiorch closes it — by making agents review each other before you ever see the diff.

Single-agent tools

Generates a diff. Hands it to you.

You still need to read it, run it, reason about edge cases, reject half of it, ask for revisions, and open the PR yourself. The review burden stays with the human.

aiorch

Generates, reviews, revises, merges — then opens the PR.

A dedicated reviewer agent critiques each agent's output, sends work back for revision, and only approves once the code passes. You review reviewed code, not draft code.

The code was already reviewed before you saw it. That is the product.

Cross-provider routing

Structurally impossible for
Claude Code or Codex.

Claude Code can only orchestrate Claude models. Codex can only orchestrate OpenAI models. By design, neither can give you cross-brand routing. This isn't a feature gap they'll close — it's a commercial conflict. Anthropic will never natively support GPT-5. OpenAI will never natively support Claude. We will.

One session, any model
Hard tasks
Opus / GPT-5.5
Medium tasks
Sonnet / GPT-5.4
Boilerplate
Haiku / Ollama local

All in a single session. Automatic. Per-agent.
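The routing table above can be read as a simple tier-to-model lookup. The sketch below is illustrative only — the tier names, model lists, and `route` helper are assumptions, not aiorch's actual configuration API:

```python
# Hypothetical sketch of per-agent difficulty routing. Tier names and
# model identifiers mirror the table above; they are not aiorch's schema.
ROUTES = {
    "hard": ["claude-opus-4", "gpt-5.5"],
    "medium": ["claude-sonnet-4", "gpt-5.4"],
    "boilerplate": ["claude-haiku-4", "ollama/qwen-coder"],
}

def route(difficulty: str, prefer: int = 0) -> str:
    """Pick a model for one agent based on its subtask's difficulty tier.

    `prefer` rotates through the candidates so parallel agents in the
    same tier can be spread across providers.
    """
    candidates = ROUTES[difficulty]
    return candidates[prefer % len(candidates)]
```

Because routing is per-agent, one session can mix all three tiers: the planner tags each subtask, and each agent gets the cheapest model that clears its tier.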

Team-level output, individual cost
Claude Code Teams: 3–4× more tokens for the same work vs individual sessions
aiorch: multi-agent collaboration at individual session cost

aiorch delivers team-style multi-agent orchestration — parallel coding, adversarial review, automatic merge — at the inference cost of individual sessions. No inflated team pricing.

Architecture

Why autonomous works here.
Three review layers, not one.

Earlier autonomous coding tools failed because one agent's mistakes went unchecked until the user saw a broken PR. aiorch replaces human-in-the-loop intervention with adversarial agents at three independent layers — the same problem, a different solution.

01 Reviewer agent in-loop

A separate agent evaluates each coder's output against the original task specification, requests revisions, and only approves once the work passes. Coders never merge their own code. Most issues are caught and corrected before they leave the agent's branch.

02 Independent external auditor

A second-pass reviewer with no shared context audits the merged result, identifies issues the in-loop reviewer missed, and performs final cleanup. Independent context is what makes the audit adversarial rather than confirmatory.

03 Integration verification

Tests run across all merged branches. Conflicts are detected and resolved automatically. The PR only opens if every check passes. If anything fails, the session does not produce a PR — it surfaces the failure for review.

Three checkpoints between the prompt and the merge button. The user sees only the final result.
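The in-loop review cycle at layer 01 amounts to a bounded critique-and-revise loop. This is a minimal sketch under stated assumptions — `critique` and `revise` are invented stand-ins for the reviewer and coder agents, not real aiorch calls:

```python
from dataclasses import dataclass

@dataclass
class ReviewResult:
    approved: bool
    notes: str = ""

def review_loop(draft: str, critique, revise, max_rounds: int = 3):
    """Run reviewer rounds until approval or the round cap is hit.

    `critique(work) -> ReviewResult` and `revise(work, notes) -> str`
    stand in for the reviewer and coder agents. Coders never approve
    their own work: only the reviewer's verdict ends the loop.
    """
    work = draft
    for round_no in range(1, max_rounds + 1):
        result = critique(work)
        if result.approved:
            return work, round_no
        work = revise(work, result.notes)
    raise RuntimeError("review cap reached without approval")
```

The round cap is what keeps the loop autonomous but bounded: a session that cannot converge surfaces a failure instead of burning tokens indefinitely.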

Pipeline orchestration

Multi-phase work,
not just single tasks.

Real engineering work isn't one prompt. It's a sequence of dependent phases — refactor the schema, migrate the callers, update the tests, deprecate the old API. aiorch chains sessions into pipelines that execute sequentially, with each phase handing off cleanly to the next.

pipeline · dynamic-orchestration-redesign 11 phases · 61 agents · all merged
Phase 1 orchestrator-executor sequential merged 7 agents
Phase 2 test-client sequential merged 5 agents
Phase 3 flip-default sequential merged 4 agents
Phase 4 delete-old sequential merged 7 agents
Phase 5 turn-overlap-pause sequential merged 6 agents
Phase 6 stt-time sequential merged 5 agents
Phase 7 caller-profiles sequential merged 5 agents
Phase 8 ssml-rate sequential merged 6 agents
Phase 9 sentiment sequential merged 5 agents
Phase 10 backchannels sequential merged 6 agents
Phase 11 speculative-stretch sequential merged 5 agents

The pipeline above is a real architectural refactor delivered by aiorch — 11 sequential phases, 61 agents, all merged without manual intervention. Pipelines are how aiorch handles work that would normally span multiple sprints.

Live session

One task. Four agents.
A reviewed pull request.

The session view below shows a real orchestration run. Four agents working in parallel, a reviewer requesting revisions, and the final PR — all streamed token-by-token in your browser.

aiorch — localhost:1230/session/sess_8f2a1c LIVE
[session] payment-core refactor · 4 agents · round 2/3
12:26:32 [planner] decomposed task into 4 agents (3 dev + 1 docs)
12:26:33 [session] smart-routing: a01 → claude-sonnet · a02 → gpt-5.5 · a03 → claude-sonnet · a04 → kimi
12:26:41 [a01] tool: write_file(src/services/payment.ts) +142 / -38
12:27:02 [a02] tool: write_file(src/retry/backoff.ts) +86 / -0
12:27:18 [a03] tool: run_tests(pytest tests/test_retry.py)
12:27:22 ✓ 14 passed in 4.2s · coverage 94.2% (was 78.1%)
12:28:04 [reviewer] round 1/3 · a02 revision requested — edge-case handling
12:28:47 [a02] revised · null-safety + jitter bounds
12:29:12 [reviewer] round 2/3 · all agents approved
12:29:18 [session] merging branches → integration/payment-refactor
12:29:21 ◆ conflict resolved: src/services/payment.ts:142
12:29:34 ✓ integration tests · 38/38 passed
12:29:41 ⬆ PR opened · github.com/acme/platform/pull/847
12:29:41 ✓ session complete · 4 agents · 3m 09s · $0.84 (your keys)
Inside aiorch

An operator console,
not a chat box.

Every running session exposes the full agent fleet, live logs, model spend, and a complete event timeline. Built for engineers who merge the PR — not for demos.

localhost:1230/ aiorch.
Operator control
Sessions & pipelines
Live
Active
2 / 17
Pipelines
1 / 3
Agents
8
Today
$1.84
sess_8f2a1c Reviewing Refactor payment module with exponential backoff retry logic…
sess_a04b29 Merged Migrate REST v2 endpoints to GraphQL schema…
sess_d3e104 Failed Add structured audit logging with Loki exporter…

Dashboard — everything at a glance

/ · landing view
localhost:1230/session/sess_8f2a1c aiorch.
← Session Reviewing · round 2/3
payment-core refactor
Agent Model State Spend
a01 payment-core claude-sonnet-4 Reviewing $0.24
a02 retry-backoff gpt-5.5 Approved $0.18
a03 tests claude-sonnet-4 Running $0.21
a04 api-docs kimi-k2 Approved $0.08

Session detail — fleet + live logs

/session/<id>
Local-first architecture

Your code never leaves your infrastructure.

aiorch is a Docker image you run on your own machine, VM, or build server. The orchestrator, agents, git worktrees, and review loop all execute locally. The only traffic that ever leaves the container is the model API call — to the provider whose key you configured.

Zero telemetry on your code

No aiorch server ever sees your prompts, diffs, code, or metrics. The only outbound call beyond your configured model APIs is a one-time license check.

You control the egress

Configure which provider endpoints the container can reach. Point at self-hosted inference, OpenAI-compatible gateways, or local Ollama — the orchestrator doesn't care.

Runs anywhere Docker runs

Laptop, bastion host, CI runner, on-prem cluster. Same image, same behavior. No SaaS dependency to audit, no data-processing agreement to sign.

Orchestrator
local
Agents × N
local
Reviewer
local
Git worktrees
local
events.jsonl
local · append-only
Cost ledger
local
your source code
stays inside the container
model API call only
configured provider
claude · openai · kimi · codex · ollama
Bring your own key

Your provider contracts.
Your list price.

We don't resell tokens. We don't inflate margins on inference. Plug in the enterprise agreements you already have — or route agents to your own self-hosted models. Every session ends with an itemized cost report, agent by agent.

$0
platform margin on tokens
100%
spend visibility per session
Anthropic Cloud
claude-opus-4 · claude-sonnet-4
claude-haiku-4
OpenAI Cloud
gpt-5.5 · gpt-5.4 · gpt-5
o4-mini · o3 reasoning
Moonshot Cloud
kimi-for-coding
Codex CLI Cloud
codex-default
OpenAI-backed agent
Ollama Self-host
qwen-coder · deepseek
any tool-capable model
OpenAI-compat Self-host
vLLM · TGI · LM Studio
bring your own gateway
How it works

Five stages, one command, one PR.

aiorch runs a deterministic orchestration pipeline. Each stage is logged and replayable — so you can walk back through any session end-to-end.

01

Decompose

Your task is parsed into isolated subtasks with explicit scope and acceptance criteria.

02

Fan out

Each subtask gets its own agent, git worktree, and model. Agents run in parallel, in isolation.

03

Review

A dedicated reviewer agent critiques each output, demanding revisions until the work meets spec.

04

Merge

Branches are merged into an integration branch. Conflicts are resolved programmatically or escalated.

05

Deliver

A single GitHub PR lands with summary, diff stats, agent breakdown, and complete event log.
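The five stages form a straight-line, deterministic pipeline. A skeletal sketch, with placeholder stage functions rather than the real orchestrator API:

```python
# Skeletal view of the five-stage pipeline. Stage names come from the
# text above; the callables and `run_session` helper are placeholders.
def run_session(task: str, stages: dict) -> str:
    """Thread a task through the five stages in their fixed order.

    `stages` maps a stage name to a callable taking and returning
    session state. Because the order is deterministic and each stage
    is logged, a finished session can be replayed end-to-end.
    """
    order = ["decompose", "fan_out", "review", "merge", "deliver"]
    state = task
    for name in order:
        state = stages[name](state)
    return state
```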

Engineer impact

What one engineer actually saves.

Based on internal benchmarks across 100+ sessions at TravelCalls.ai — refactors, test backfill, and feature scaffolding compared to manual workflows. Your numbers will vary by codebase shape and task mix.

Per routine task (before → with aiorch)
  Senior-engineer time: ~4 hrs → ~20 min
  Review cycles to merge: 2–4 rounds → 1 round
  Inference cost: $0.50–$2 typical

Typical annual impact per engineer: 5–10× faster on routine tasks

Based on internal benchmarks across 100+ aiorch sessions on a production Rust + React codebase. Inference billed at provider list price; aiorch license purchased separately.

What engineers see first

01

Senior time back on senior work

Routine refactors, test backfill, and scaffolding stop consuming your hours. You go back to architecture, design, and review.

02

PRs ready by morning

Kick off a session before you log off. By standup, the PR is open, reviewed, and waiting on your approval — not your authorship.

03

Visible, bounded inference spend

Every session has a hard budget cap and an itemized cost ledger. Real provider costs, no markup, no surprise vendor invoices.

Audit logs

Every agent action, on disk.

Every session writes a structured event log to your local file system. Know exactly which agent wrote which line, which model was called, and what it cost — replayable from your own machine.

data/debug.log · append-only JSONL local · structured
02:14:06.112 orchestrator task decomposed (4 subtasks)
02:14:48.201 agent-1 PaymentService.ts · patch applied · 214 loc $0.19
02:15:33.408 reviewer agent-2 revision · missing timeout bounds $0.04
02:16:02.115 agent-2 ExponentialBackoff.ts · revised · 88 loc $0.12
02:16:51.040 agent-3 38 tests authored · coverage 94.2% $0.31
02:17:29.778 reviewer round 2 · all agents approved $0.06
02:17:52.002 orchestrator PR #847 opened · integration/payment-refactor $0.12

Structured debug log

Every agent spawn, review round, merge, and completion is appended to an append-only JSONL log on disk. One line per event, grep-friendly. One-click diagnostic export bundles the full session into a single archive.

Real token costs, per agent

Cost is captured from the provider's API response where available, or calculated from published pricing for CLI-based providers. Roll up by agent, by session, or by pipeline.
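Because the log is one JSON object per line, a per-agent cost rollup is a few lines of Python. The field names here (`agent`, `cost_usd`) are illustrative assumptions, not aiorch's actual schema:

```python
import json
from collections import defaultdict

def cost_by_agent(lines):
    """Sum per-event costs grouped by agent from JSONL log lines.

    Field names `agent` and `cost_usd` are assumed for illustration;
    events without a cost (e.g. plan steps) count as zero.
    """
    totals = defaultdict(float)
    for line in lines:
        event = json.loads(line)
        totals[event["agent"]] += event.get("cost_usd", 0.0)
    return dict(totals)
```

The same grouping key swapped for a session or pipeline id gives the per-session and per-pipeline rollups.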

Activity timeline in the UI

The session detail page renders the full agent timeline: who started when, what they touched, when the reviewer signed off. Same data as the JSONL — just rendered.

One-click diagnostic export

Export a complete bundle for any session or pipeline — events, prompts, diffs, model responses, costs — as a single archive. Drop it in a bug report or keep it for post-mortem.

Comparison

Where aiorch fits in the landscape.

Tools overlap, but they solve different problems. aiorch isn't competing for your IDE — it's competing for the review-and-merge loop.

Delivers a pull request
  IDE agents: varies (some do, some IDE-only)
  Cloud autonomous: yes
  aiorch: yes (reviewed by default)

Multi-agent review loop
  IDE agents: partial (single-pass or add-on)
  Cloud autonomous: partial (single reviewer)
  aiorch: native (multi-round enforced)

Runs on your infrastructure
  IDE agents: partial (agent may run cloud-side)
  Cloud autonomous: cloud only (or VPC at enterprise tier)
  aiorch: docker, local (fully self-hosted)

BYO model / provider keys
  IDE agents: limited (own ecosystem only)
  Cloud autonomous: bundled inference
  aiorch: any provider (+ local Ollama)

Token markup
  IDE agents: seat + credits (varies by vendor)
  Cloud autonomous: compute-based (bundled pricing)
  aiorch: zero (provider list price)

Cross-provider routing
  IDE agents: single vendor
  Cloud autonomous: single vendor
  aiorch: any mix (per agent, per session)
Built in production

aiorch was built while shipping a Rust + React production AI platform.

We needed an orchestration tool that could deliver multi-phase refactors without supervision while we focused on customer-facing work. Nothing on the market did that the way we needed, so we built aiorch. Every feature on this page exists because we used it ourselves first.

— internal use case, not a paid endorsement

Pricing

Per seat. Your tokens.

aiorch is a Docker container that runs on your infrastructure. You pay per seat for the orchestrator. All inference is billed by your model provider at their list price, directly to you.

Free trial
Run aiorch for 14 days with full features. No card, no commitment.
Free14 days
all features · no card required
Install now
  • Full feature access
  • BYOK · all providers
  • Unlimited sessions & agents
  • Converts to paid after 14 days

Named-user licenses. Each seat is bound to one email and can be installed on unlimited devices by that user. No device limits, no concurrent session restrictions. See what gets logged →

Roadmap

What's coming, what isn't.

We list everything below — including what's not yet built. aiorch is local-first by design; if a feature would need a hosted service, we say so.

In progress

Signed event log

Hash-chained, sha-256-signed events.jsonl for tamper-evident replay and post-mortem. Today the log is plain JSONL on your file system.

target · next minor
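One plausible shape for the planned tamper-evident chain: each event's sha-256 covers the previous digest plus the raw line, so editing any line invalidates every later hash. This is an assumption about the design, not the shipped format:

```python
import hashlib

def chain_hashes(lines, seed: str = "0" * 64):
    """Hash-chain JSONL lines: digest[i] = sha256(digest[i-1] + line[i]).

    Any edit to an earlier line changes its digest and therefore every
    digest after it, making tampering detectable on replay.
    """
    prev = seed
    digests = []
    for line in lines:
        prev = hashlib.sha256((prev + line).encode()).hexdigest()
        digests.append(prev)
    return digests

def verify(lines, digests, seed: str = "0" * 64) -> bool:
    """Recompute the chain and compare against the stored digests."""
    return chain_hashes(lines, seed) == digests
```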
Planned

Policy & model allowlists

Declarative YAML to constrain which models, repos, and branches a session may touch — plus per-session spend caps enforced at the orchestrator. Useful even for a solo user; not yet shipped.

target · Q3
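A hypothetical shape for such a policy file — every field name below is a guess at a possible schema, since nothing has shipped:

```yaml
# Illustrative sketch only: no aiorch policy schema exists yet.
session:
  budget_usd: 5.00          # hard spend cap, enforced by the orchestrator
models:
  allow: [claude-sonnet-4, gpt-5.5, ollama/qwen-coder]
repos:
  allow: [acme/platform]
branches:
  deny: [main, release/*]   # sessions may never push here directly
```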
Considering

Multi-user / shared history

SSO, role-based access, and a shared session history would require a hosted backend. aiorch is single-user by design today; we'd only build this if there's clear demand for a team edition.

no commitment

Wake up to a PR already reviewed.

Install in 5 minutes. Run it against a real branch tonight. Free for 14 days. No card, no telemetry on your code, no tokens to pre-buy.

Start free trial