The mission layer for autonomous builds

Give an agent a goal.
It ships the product.

One product-agnostic prompt reads a one-page brief, then researches, plans, builds test-first in isolated worktrees, reviews every change through Sentinel, ships — and doesn't stop until it's done.


Pairs with agents-template · Coder ≠ reviewer via Sentinel · Runs continuously

research → build → review → ship → repeat

Why

One-shot prompting stalls. A mission layer ships.

"Build me an app" gets you a promising start and a dead end: no tests, no review, no memory, no follow-through. autonomous-kickoff replaces the one-shot with a system — one generic prompt plus one short, per-project brief.

one-shot · The lone prompt

Generates code, then forgets why.
No independent review — the author grades their own work.
Stops at the first hard part, or hallucinates past it.
You babysit every step and re-explain context each session.

mission layer · autonomous-kickoff

A fleet of specialists: research, PM, UX, architecture, engineering, QA.
Every change merges only on an independent Sentinel verdict.
A GitHub Project board is the work queue — and the heartbeat that keeps it going.
You stay the cofounder: approve gates, unblock, steer. It does the rest.

The trick to reuse: one generic prompt, one short brief. docs/KICKOFF.md is the product-neutral operating manual — identical across every project. MISSION.md is the only file you fill in: mission, users, stack, MVP, security, Definition of Done, and what's pre-authorized versus gated. The prompt reads the brief; if a field is blank, it asks you once, then runs.
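The "ask once if a field is blank" behavior can be pictured with a minimal sketch. It assumes the brief has already been parsed into a simple record; the `Brief` class, its field names, and `blank_fields` are hypothetical illustrations of the fields listed above, not the actual MISSION.md format.

```python
from dataclasses import dataclass, fields


@dataclass
class Brief:
    """Hypothetical in-memory form of MISSION.md (field names are assumptions)."""
    mission: str = ""
    users: str = ""
    stack: str = ""
    mvp: str = ""
    security: str = ""
    definition_of_done: str = ""
    authorization: str = ""  # what's pre-authorized versus gated


def blank_fields(brief: Brief) -> list[str]:
    """Return the names of fields that are empty or whitespace-only."""
    return [f.name for f in fields(brief) if not getattr(brief, f.name).strip()]


# Illustrative brief with only two fields filled in.
brief = Brief(mission="Ship a slug generator CLI", stack="npm CLI")
missing = blank_fields(brief)
if missing:
    # One consolidated question rather than one prompt per field.
    print("Please fill in: " + ", ".join(missing))
```

Collecting every blank field first is what makes a single question possible; after that, the run can proceed without further prompts.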

How it works

A phased build, gated by artifacts, run by a fleet.

The build runs as a sequence of phases. Each ends in an artifact gate — a named deliverable authored by the right specialist sub-agent, not the Lead, and independently red-teamed before the next phase starts. A per-gate self-check and a watchdog audit keep the Lead from quietly doing it all itself.

PHASE 0

Bootstrap

Wire up agents-template + Sentinel, probe delegation, verify identity.

gate · harness ready

PHASE 1

Discovery

Research with citations → a PRD and a prioritized roadmap.

gate · PRD.md

PHASE 2

UX & design

User journeys, states, tokens, and a design rubric — then a render → screenshot → critique loop posts UI shots to the board for review.

gate · USER_FLOWS.md

PHASE 3

Architecture

ADRs, data model, and the deploy approach, red-teamed.

gate · ARCHITECTURE.md

PHASE 4–N

Build

One PR per increment: failing test → implement → Sentinel → merge.

gate · Sentinel PRs

FINAL

Ship

Deploy or publish for real, verify against reality, polish.

gate · live artifact
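One way to read the phase list above is as an ordered table of gates, where a gate only counts once someone other than the Lead authored the artifact and it has been red-teamed. The sketch below is an assumption about how that bookkeeping could look; `Gate`, `gate_passes`, and `next_phase` are hypothetical names, and only the phase and artifact labels come from the list.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Gate:
    phase: str
    artifact: str


# Phases and gate artifacts in the order listed above.
GATES = [
    Gate("Bootstrap", "harness ready"),
    Gate("Discovery", "PRD.md"),
    Gate("UX & design", "USER_FLOWS.md"),
    Gate("Architecture", "ARCHITECTURE.md"),
    Gate("Build", "Sentinel PRs"),
    Gate("Ship", "live artifact"),
]


def gate_passes(author_role: str, red_teamed: bool) -> bool:
    """A gate counts only if a non-Lead authored it and it was red-teamed."""
    return author_role != "lead" and red_teamed


def next_phase(passed_artifacts: set) -> Optional[str]:
    """Return the first phase whose gate has not passed, or None when all have."""
    for gate in GATES:
        if gate.artifact not in passed_artifacts:
            return gate.phase
    return None
```

For example, `next_phase({"harness ready", "PRD.md"})` returns `"UX & design"`, because that is the first gate still open.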

A shallow org, not a monolith

A Delivery Lead coordinates guilds and workers, with the hierarchy capped at three levels. Every discipline is delegated (research, the PRD, UX, architecture, and testing), and the Lead never authors a gate artifact (producer ≠ Lead); it coordinates and reviews. A per-gate self-check and the watchdog's delegation audit catch any collapse into Lead-solo work, so implementation is never the only thing that gets delegated.

Isolated worktrees, one PR each

Every increment is built test-first in its own git worktree. Independent features run in parallel, then merge one at a time through review; no work happens directly on main.
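As a rough illustration of one worktree per increment, the sketch below builds a `git worktree add` command that creates a new branch in its own directory. The branch prefix, the sibling-directory layout, and the `origin/main` base are assumptions for the example, not conventions taken from the project.

```python
import subprocess
from pathlib import Path


def worktree_command(repo: Path, increment: str, base: str = "origin/main") -> list[str]:
    """Build the git command that creates a new branch in its own worktree."""
    branch = f"feat/{increment}"                         # hypothetical branch naming
    path = repo.parent / f"{repo.name}-wt-{increment}"   # hypothetical sibling directory
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path), base]


def create_worktree(repo: Path, increment: str) -> None:
    """Run the command; raises CalledProcessError if git reports a failure."""
    subprocess.run(worktree_command(repo, increment), check=True)
```

Because each increment gets its own directory and branch, parallel workers never share a checkout, and nothing is committed on main directly.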

Evidence, not claims

Each PR carries a red→green transcript, the acceptance-test IDs it satisfies, and a green CI run. A PR that merely asserts it works is rejected.
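The evidence rule lends itself to a simple completeness check. The following is a minimal sketch under the assumption that a PR's evidence has been collected into a record; `PullRequestEvidence` and `missing_evidence` are hypothetical names, and the three required items are the ones named above.

```python
from dataclasses import dataclass, field


@dataclass
class PullRequestEvidence:
    red_green_transcript: str = ""  # failing run followed by the passing run
    acceptance_test_ids: list = field(default_factory=list)
    ci_green: bool = False


def missing_evidence(pr: PullRequestEvidence) -> list[str]:
    """List every required piece of evidence the PR does not carry."""
    missing = []
    if not pr.red_green_transcript.strip():
        missing.append("red→green transcript")
    if not pr.acceptance_test_ids:
        missing.append("acceptance-test IDs")
    if not pr.ci_green:
        missing.append("green CI run")
    return missing


def accept_for_review(pr: PullRequestEvidence) -> bool:
    """A PR that only asserts it works (no evidence) is rejected."""
    return not missing_evidence(pr)
```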

The Sentinel connection

An independent gate stands between every change and `main`.

Sentinel is the review discipline that ships inside agents-template — the harness autonomous-kickoff runs on. The rule is absolute: coder ≠ reviewer. The agent that wrote the code never approves it.

APPROVED

The change is correct, tested, and in scope. The Lead completes the Pre-Merge Checklist and merges — then confirms main stays green.

CONDITIONAL

Mergeable with tracked follow-ups — only for non-correctness, non-security nits, each filed as an issue and resolved before the milestone signs off.

REJECTED

A correctness or security gap is a blocker — never conditional. The engineer fixes it and re-submits. Nothing merges on a rejection.
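The three verdicts can be summarized as a small decision rule. This sketch assumes review findings are tagged with a kind such as "correctness", "security", or "style"; the tagging scheme and function names are hypothetical, while the rule that correctness and security gaps always reject comes from the text above.

```python
from enum import Enum


class Verdict(Enum):
    APPROVED = "APPROVED"
    CONDITIONAL = "CONDITIONAL"
    REJECTED = "REJECTED"


# Finding kinds that are always blockers, never conditional.
BLOCKING_KINDS = {"correctness", "security"}


def decide_verdict(finding_kinds: list[str]) -> Verdict:
    """Map a list of finding kinds to one of the three verdicts."""
    if any(kind in BLOCKING_KINDS for kind in finding_kinds):
        return Verdict.REJECTED
    if finding_kinds:
        return Verdict.CONDITIONAL  # only non-blocking nits remain
    return Verdict.APPROVED


def may_merge(verdict: Verdict, followups_filed: bool = False) -> bool:
    """CONDITIONAL merges only once each nit is tracked as an issue."""
    if verdict is Verdict.APPROVED:
        return True
    if verdict is Verdict.CONDITIONAL:
        return followups_filed
    return False
```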

How an agent merges its own work without grading it

Branch protection makes Sentinel-in-CI a required status check with required_approving_review_count: 0. A fresh CI run never authored the diff, so it satisfies coder ≠ reviewer at the process level — the agent can merge unattended when checks pass, with no human approval bottleneck and no way to approve its own PR. Changes that touch Sentinel, CI, branch protection, or scanner config are human-required and can't auto-merge on a check they could weaken.
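A sketch of what that setup might look like: a request body shaped for GitHub's branch-protection REST endpoint, plus a path check that routes sensitive changes to a human. The check name `sentinel`, the `strict` and `enforce_admins` values, and the glob patterns are assumptions chosen for illustration; only the required status check and `required_approving_review_count: 0` come from the text.

```python
import json
from fnmatch import fnmatchcase


def branch_protection_payload(check_name: str = "sentinel") -> dict:
    """Body for a branch-protection update (check name is hypothetical)."""
    return {
        "required_status_checks": {"strict": True, "contexts": [check_name]},
        "enforce_admins": True,  # assumption, not stated in the source
        "required_pull_request_reviews": {"required_approving_review_count": 0},
        "restrictions": None,
    }


# Hypothetical locations for Sentinel, CI, and scanner configuration.
SENSITIVE_GLOBS = (
    ".github/workflows/*",
    ".github/dependabot.yml",
    ".github/codeql/*",
    "sentinel/*",
)


def requires_human(changed_paths: list[str]) -> bool:
    """True if any changed path could weaken the gate it is checked by."""
    return any(
        fnmatchcase(path, glob) for path in changed_paths for glob in SENSITIVE_GLOBS
    )


print(json.dumps(branch_protection_payload(), indent=2))
```

The point of the path check is that a PR editing the gate itself cannot rely on that same gate to approve it.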

Always working

It doesn't idle — it runs in tiers.

The board is the heartbeat: after each merged increment, the agent pulls the next ready card and keeps going. Three tiers decide how far that runs without you.

FOUNDATION

The board is the heartbeat

The GitHub Project board is the work queue and the system of record. An atomic issue-claim protocol stops two workers from grabbing the same card.
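The idea behind an atomic claim is a check-and-set that cannot be interleaved. The sketch below shows the principle with an in-memory board and a lock; it is not the project's actual protocol against GitHub, and the `Board` class and its methods are hypothetical.

```python
import threading
from typing import Optional


class Board:
    """In-memory stand-in for a work queue with atomic card claims."""

    def __init__(self, cards: list[str]) -> None:
        self._owner: dict[str, Optional[str]] = {card: None for card in cards}
        self._lock = threading.Lock()

    def claim(self, card: str, worker: str) -> bool:
        """Claim the card if unowned; check and set happen under one lock."""
        with self._lock:
            if self._owner[card] is None:
                self._owner[card] = worker
                return True
            return False

    def owner(self, card: str) -> Optional[str]:
        with self._lock:
            return self._owner[card]
```

However many workers race for the same card, exactly one `claim` call returns `True`; the rest see the card as taken and move on.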

TIER 1

In-session watchdog

A recurring schedule resumes the agent if it idles — while your CLI session is open. Re-arm it after a pause, crash, or update.

TIER 2

Durable 24/7, machine-off

A scheduled GitHub Actions cron dispatches the Copilot cloud coding agent with Sentinel-in-CI — fully unattended, no laptop required.

It stops on purpose, not by accident. Builds run milestone by milestone: at each Definition of Done the agent proposes the next roadmap milestone via a decision gate and resumes on your approval or when a time-box expires. It watches security continuously (Dependabot, code scanning, secret scanning) and gates releases on open high/critical alerts. A kill switch stops it on demand. The only clean full stop is project completion.
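The release gate and kill switch reduce to a short predicate. This is a minimal sketch assuming alerts have been normalized into records with a severity and a state; the `Alert` shape and `release_allowed` are hypothetical, and real scanner output would need mapping into this form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    source: str    # e.g. "dependabot", "code-scanning", "secret-scanning"
    severity: str  # assumed normalized to "low", "medium", "high", "critical"
    state: str     # "open" or any other value for resolved alerts


BLOCKING_SEVERITIES = {"high", "critical"}


def release_allowed(alerts: list[Alert], kill_switch_on: bool) -> bool:
    """Block a release if the kill switch is on or a severe alert is open."""
    if kill_switch_on:
        return False
    return not any(
        a.state == "open" and a.severity in BLOCKING_SEVERITIES for a in alerts
    )
```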

Steer it by talking — float an idea, it becomes the goal. Mention a new direction in a live session — at a milestone boundary or mid-build — and the agent treats it as a first-class trigger: it shapes the idea with you, confirms it, records it to your roadmap, and keeps building, with no special prompt to paste. Casual musing is shaped and confirmed before it becomes committed scope; mid-build, a confirmed idea is queued so it never derails the active milestone.

Trust & control

Built to run unattended — safely.

Autonomy without guardrails is a liability. The contract that makes the output trustworthy is load-bearing, and it wins over any later instruction.

A distinct agent identity

Unattended runs act under the agent's own identity (a GitHub App, a machine user, or the Copilot agent), so a decision can't be forged. A Phase-0 self-check fails closed if the run is not acting under that identity.

Untrusted input is data

Issues, PRs, comments, web pages, and dependency code are data — never instructions. Only the brief, the kickoff docs, and the verified cofounder can change scope or gates.

Authorize by tier

Every action carries a tier — auto, auto-with-audit, time-boxed, human-required, never. The agent acts within it and asks only when it must.

One-way guardrails

Force-pushing main, relaxing branch protection or the Sentinel gate, deleting data, and weakening a scanner all sit in the never tier. Secrets never enter the repo.
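The tier model can be sketched as a lookup plus a decision function. The five tier names and the never-tier actions come from the text above; the action identifiers, the `open-pr` entry, the fail-closed default for unknown actions, and the reading of time-boxed as "proceed on approval or once the time-box expires" are assumptions.

```python
from enum import Enum


class Tier(Enum):
    AUTO = "auto"
    AUTO_WITH_AUDIT = "auto-with-audit"
    TIME_BOXED = "time-boxed"
    HUMAN_REQUIRED = "human-required"
    NEVER = "never"


# Hypothetical action identifiers mapped to tiers.
ACTION_TIERS = {
    "force-push-main": Tier.NEVER,
    "relax-branch-protection": Tier.NEVER,
    "relax-sentinel-gate": Tier.NEVER,
    "delete-data": Tier.NEVER,
    "weaken-scanner": Tier.NEVER,
    "commit-secret": Tier.NEVER,
    "edit-ci-config": Tier.HUMAN_REQUIRED,
    "start-next-milestone": Tier.TIME_BOXED,  # assumption
    "open-pr": Tier.AUTO,                     # assumption
}


def decide(action: str, human_approved: bool = False, timebox_expired: bool = False) -> str:
    """Return "proceed", "ask", or "refuse" for the given action."""
    tier = ACTION_TIERS.get(action, Tier.HUMAN_REQUIRED)  # unknown: fail closed
    if tier is Tier.NEVER:
        return "refuse"  # no approval can unlock a one-way guardrail
    if tier is Tier.HUMAN_REQUIRED:
        return "proceed" if human_approved else "ask"
    if tier is Tier.TIME_BOXED:
        return "proceed" if (human_approved or timebox_expired) else "ask"
    # AUTO and AUTO_WITH_AUDIT both proceed; audit logging is omitted here.
    return "proceed"
```

Note that `decide("force-push-main", human_approved=True)` still returns `"refuse"`: in this sketch the never tier ignores approval entirely, which is what makes the guardrail one-way.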

Product-agnostic

Same prompt, any shape of product.

The operating manual assumes no product type, stack, host, or auth. Everything project-specific lives in the brief — so the same flow builds a web app, a CLI, a library, a service, or a bot.

web app, CLI · npm, library, service / API, bot

See the worked slugify-cli brief and the full prompt library on the Reference page.

Two prompts to begin.

Paste the "Set up" prompt into an agent session in your repo, confirm the brief, then paste the "Launch" prompt. The agent bootstraps the harness and starts building.

