Agentic Software Factory · The Long Read

Vader
Rule The Galaxy

An AI agent working in a loop has no memory of what it was trying to protect. Each turn it makes the smallest change that gets the job done, and over hundreds of turns your architecture quietly rots, because nothing outside the loop is holding the original intent in place. Vader is one idea against that rot: move the meaning out of the loop. A human writes down the handful of distinctions that must never break, a compiler turns each one into a check the agent cannot argue with, and a lock stops the agent from deleting the check to make its own life easier. Gate the model, free the code.

Boyan Balev 10 min read First principles The Trenches Field Manual

Field manual · Vader · Rule The Galaxy

The loop, in one picture assembles as you read

A human freezes the meaning once, on the left. Everything to its right runs on its own, one turn at a time: split the work, build the pieces in parallel, run the cheap automatic gate, and only if that passes pay for the expensive judges, keep what survives, then loop. The two gold rings are the only two places a human has to stand. The agent does the rest alone.

a human freezes the model

Read it left to right. Gold is a human gate. Blue or red, depending on your theme, is the agent working on its own. The point of the whole design is how few gold rings there are, and that the cheap deterministic gate runs before any expensive judge. A red gate spends zero judges.

Give an agent a goal and a loop, and it will get surprisingly far, then quietly wreck the thing you cared about. Not out of malice, and not because it is dumb. Because it optimizes locally. Every turn has exactly one job: make this test pass, answer this prompt, close this ticket. None of those jobs is "protect the distinction the architect cared about three hundred turns ago." So the first time collapsing that distinction makes the current job easier, the agent collapses it, marks the work done, and moves on. A point in time becomes a span of time. Money becomes a floating-point number. The module that was never allowed to touch the database imports the database. Each step looked reasonable. The sum is rot.

This is the core problem with long-running agent work, and a better prompt does not fix it. A prompt is a polite request, and a request is exactly what an agent under pressure learns to route around. You can write NEVER LET COMMON IMPORT ETL in capital letters at the top of every file, and on turn two hundred the agent will import etl anyway, because importing etl made the failing test pass and the capital letters were just text. The meaning lived inside the loop, where the loop could wear it away.

Vader's answer is small and, on its own terms, correct. Put the meaning outside the loop. Have a human write down the few distinctions that must never collapse. Turn each one into the strongest automatic check the tools allow. Run those checks every turn, and fail loudly with the name of any that breaks. An agent can talk its way around a paragraph. It cannot talk its way around a failed compile. That is the whole idea. The rest of this piece is the machinery that makes it hold.

A prompt asks the agent to respect the architecture. A gate makes respecting it the only way forward. Under pressure, only one of those survives.

01 · The force

Drift is the default

Before the fix, feel the pressure. A loop with nothing holding it in place does not keep a line. It drifts away from it, one small reasonable step at a time, and those steps pile up.

Picture architectural integrity as a percentage: out of all the distinctions you set out to protect, how many are still intact. Every turn, the agent touches the code, and every turn there is some chance the easiest move erodes one of them. One turn almost never breaks anything you would notice. The trouble is that the chances multiply. Ninety-four percent survival per turn is not ninety-four percent after forty turns. It is 0.94 raised to the fortieth power, which is tiny. Compounding runs hard in exactly the direction you do not want.

A gate changes the math. It does not make the agent smarter. It removes the option. The move that would erode a guarded distinction now trips a check, bounces, and never lands. Drag the slider to set how fragile your architecture is, meaning how many distinctions sit one shortcut away from collapse, and watch the two futures split apart.

Architectural integrity over a long run drag the slider

One curve is a prompt only loop, where the architecture is a request. The other is the same agent behind a gate, where a guarded distinction simply cannot be collapsed. Same model, same tickets. The difference is whether the meaning lives inside the loop or outside it.

fragility 6 fragile distinctions

The gated line is not flat because the agent stopped trying the bad move. It is flat because the bad move keeps bouncing off a check and never reaches the tree. Determinism beats discipline.

The force

Local optimization plus a long horizon equals drift. You cannot out-prompt it, because the prompt is the first thing pressure teaches the agent to ignore.

02 · The move

Move the meaning outside the loop

If meaning inside the loop erodes, put it somewhere the loop cannot reach. That one move is the entire idea. Everything else is plumbing to make it real.

A human writes a constitution.model: a small file that names the invariants which must hold no matter what. Not style preferences. Not a wish list. Just the few distinctions whose quiet collapse would break the system. Then a compiler, vader gen, turns each invariant into the strongest automatic check the language allows, and vader gate runs them all and fails on the name of any that breaks. The agent never meets a polite instruction it can reason its way past. It meets a red build with an invariant's id stamped on it.

The whole system fits in four words: gate the model, free the code. The model, the set of invariants, is sacred, slow, and human-owned. The code underneath is cheap, fast, and the agent's to rewrite however it likes, as long as the gate stays green. You stop reviewing diffs line by line and start owning the boundary. The agent gets total freedom inside a boundary it cannot move.

agent collapses a distinctionis answered bycompile it into the gate

agent deletes the check to passis answered bylock the model hash

fresh session forgets the intentis answered bythe recall packet

silent debt rides along to doneis answered bytriage gated persist

verifier is an optimistis answered byrefute first, cross owner, majority

Read that little ledger as the map for the rest of this piece. On the left, each row is a concrete way an agent quietly erodes a system over a long run. On the right is the specific mechanism Vader uses to shut that door. We will walk the important ones in order.

Gate the model, free the code. The boundary is human and immutable. Everything inside it is the agent's to redo a thousand times.

03 · The mechanism

Compile a distinction into a check

The real trick is turning a distinction in meaning into a distinction a machine can check. A compile error beats a paragraph asking the agent to please respect the architecture, because a compile error is not addressed to anyone. It simply is or it isn't.

An invariant is one of four kinds, plus a raw escape hatch for anything the four cannot express. The compiler looks at the shape of the check, not at any label, and emits the strongest enforcement that shape allows. This is also where the honesty lives, because the four kinds are not equal. Two of them become hard static checks that read all of your code. Two become tests, and a test only checks the path you thought to write. Click each kind to see exactly what it turns into, and how much a green gate on that kind is actually worth.

The compiler, by invariant kind click a kind

On the left, what the human writes in the model. On the right, what the compiler emits to enforce it. The meter is how much confidence a green gate actually earns you for that kind.

shape

two ideas must not be one. A point is not an interval.

static type

dependency

this layer must never import that one.

static scan

data

a law that must hold. merge of x and x is x.

property test

behavioral

a contract a component must honor at runtime.

contract test

rawCheck

a literal shell command. the escape hatch.

opaque

the human writes

the compiler emits

confidence a green gate earns

The strongest case, a branded type whose collapse is a real compile error, is genuinely airtight. The weakest, a behavioral contract, is only as good as the path the test exercises. Read the meter before you trust the green.

The mechanism

Shape and dependency checks read all of your code, so they have real teeth. Data and behavioral checks are tests, and a test only covers what you remembered to test. The more behavioral the invariant, the less a green gate actually proves. Read the meter before you trust the green.

04 · The cheat it kills

Lock the gate against the agent

There is an obvious way to beat a check that has nothing to do with satisfying it: delete it. An agent under pressure finds that move on its own. A factory that does not plan for it is theater.

So the enforcement is a protected artifact, not just the model. When vader gen compiles the model, it takes a fingerprint (a hash) of everything that does the enforcing and freezes it: the compiled model, every generated check, the law bodies you wrote by hand, the gate's config, even the specific strict compiler flags the checks lean on. The agent owns every line of application code. It cannot touch any of that enforcement surface. On every run, vader gate recomputes the fingerprint and compares. If any of it changed, the gate fails closed, before it even looks at your code. There is no "make the build green" that runs through weakening a check, because weakening a check is itself a red build.

The point is to make the two moves impossible to confuse. Doing the work turns the gate green. Tampering with what does the checking turns it a different, louder red. The only way to actually change what is enforced is a human gate: a run may notice a missing distinction and propose a change to the model, but the proposal is parked, never auto-applied. A person reviews it, and only then does the model recompile and the lock refreeze. Try both moves below: satisfy the invariant the honest way, or try to silence the check the way a cornered agent would.

Two ways to face a failing check pick one

The agent has a red gate. It can do the work, or it can try to make the red go away by editing the thing that produces it. Watch what the lock does to the second move.

The lock does not make the agent honest. It makes dishonesty louder than honesty, by turning the cheat into the single most obvious red on the board. The only door out of the boundary is a human one.

The most common way an agent beats a check is not by passing it. It is by deleting it. Vader's answer is to make the check the one file the agent's hands cannot reach.

05 · The pipeline

One turn, end to end

With the gate and the lock in place, the loop can be aggressive. Between the two human gates, the agent runs the whole cycle on its own, then runs it again, and again, because a solid boundary makes autonomy safe instead of reckless.

A run is a sequence of phases. Ground builds a trustworthy starting point and refuses to trust stale memory. Conceive is the first human gate: a person writes the spec and the model and approves them. Decompose cuts the work into disjoint slices that cannot collide, and vader gen compiles and locks the checks. Implement fans out: a critic red-teams the plan, the seam owner builds the shared interface first so the others build against something settled, and the sibling owners build in parallel, each in an isolated worktree so a broken slice never poisons the tree. Then the cheap deterministic gate runs on the merged result, and only if it is green does Verify pay for LLM judges. Persist closes the turn, but only once every open risk has been triaged.

That ordering is the point, and it is the one thing the old version of this diagram got backwards. The free automatic gate runs before the expensive judges, never after. A slice sitting behind a red gate spawns zero judges. You never pay a fleet of LLMs to review work the compiler already rejected.

P-1 ground

verify before trust. Stale memory is re-checked against reality.

P0 conceive

human gate. Spec and model written and frozen.

P1 decompose

disjoint slices. gen compiles and locks the checks.

P2 implement

critic, seam first, siblings in parallel worktrees.

P3 verify

deterministic gate first. A green gate pays for refute-first judges.

P4 persist

triage gated. No turn closes on an untriaged risk.

Notice how little of that needs a human. Two gates, and both are about the model: freeze it at the start, and approve any change to it later. Everything in between, the splitting, the building, the checking, the bookkeeping, is autonomous, and autonomous on purpose. The whole bet of the design is that if the boundary is solid you can let the inside run free, because the worst the inside can do is bounce off the gate.

One more structural step rides in the gate when your repo opts into it. If the project carries a fallow config, Vader wires a "new code only" health check into the gate, held to the same rule as everything else: it judges the code that changed this turn, not the whole repo, so it can only get stricter over time. A configured check that fails is a real red gate, never a quiet pass. No config, and the step simply is not there.

The pipeline

Two human gates, both about the model. The rest of the turn is the agent alone. A solid boundary is what makes handing off the inside safe instead of negligent.

06 · The fan-out

Build in parallel, verify where it hurts

A turn is not one agent doing one thing. It is a small fleet, and the shape of that fleet is computed, not improvised. A single pure function turns the recall packet into the plan, so every host builds the identical fleet.

The seam slice goes first, alone, because everything else builds against the interface it defines. Each owner is also handed its reverse-dependency set, the list of things that import its slice, so it does not quietly break a caller it cannot see. Then the sibling slices run in parallel, each owner in its own worktree. The interesting part is verification. A judge here is an LLM checking another LLM's work, so a single cheerful thumbs-up is worth little. Vader scales the panel by evidence: a slice that touches a seam, or a class that has bounced before, or a class marked never-trust-cheaply (seams, security, migrations), gets three refute-first judges and passes only on a majority. A calm, well-behaved class gets one. Spend the scrutiny where history says it is needed, not everywhere equally.

On a large repo there is one more scoping move. Each slice carries a touched flag: did its files actually change since the last run? A driver may skip untouched slices to keep a turn focused on what moved, but the plan still lists every slice, so a skip is always visible in the record, never a silent drop. Seams are the exception. A seam runs whenever anything runs, because a change elsewhere this turn can invalidate it.

planTick, on a real partition toggle the history

The same nine slices, planned two ways. With a calm ledger, only the seams earn extra scrutiny. With a bruised one, the class that keeps bouncing, the etl fetchers, gets a three voter panel each. The voter count is gold when history pushed it above one.

seams first, then siblings in parallel

Nothing here is hand tuned per run. The plan is a pure function of the recall packet, so the Claude Code path, the Pi path, and a plain sequential fallback all build the same shape. Only the wall clock differs.

The fan-out

Seam first so siblings fork from a settled interface, parallel after, and verification weighted by the bounce ledger. The decomposition is computed once and shared, so harnesses cannot disagree on what runs.

07 · The memory

Earn autonomy, lose it on contact

An agent with no memory re-derives everything every session, including its own mistakes. Vader keeps a small, append-only memory so each turn starts from what is already known, and so trust is something a class of work earns through evidence rather than something you simply assert.

Three pieces. The recall packet catches a fresh session up in one call: the next item, what has gone stale, the open risks, the recent bounce trends, any parked model change. The ledger records every run and every bounce, append-only, the factory's flight recorder. The ratchet reads that ledger and works out how much human gating each class of work still needs. A class that has shipped clean, run after run, ratchets up toward autonomy. The moment it produces a defect, it drops straight back to the bottom. And some classes, seams, anything touching security or a migration, never ratchet at all, no matter how long the clean streak.

The ratchet for one class of work run some turns

Trust is asymmetric on purpose. Clean runs raise autonomy one notch at a time, slowly. A single dirty run drops it to the floor, instantly. The system is hard to trust and easy to distrust, which is the correct direction for a machine writing your code.

The honest caveat: the ledger records defects that were caught, not defects that exist. A quietly fragile class that never bounced, because nobody tested the right thing, sits at full autonomy it did not earn. The ratchet is a good instinct on a small sample, not a measurement. More on that in a moment.

Hard to trust, easy to distrust. A clean streak is built one notch at a time. A single defect spends all of it at once.

A second ratchet, for structural weight

The autonomy ratchet is about trust. There is a second ratchet about size, and it is the newest piece of the machine. Vader boils structural debt down to a single honest integer, counting only the things that unambiguously make a codebase heavier to carry: runtime dependencies, top-level directories, and escape-hatch checks that dodge the compiler. Nothing subjective, nothing that needs a judgment call.

Then persist enforces one rule. A turn cannot close with more debt than the turn before it, unless a human-approved model change deliberately raised the ceiling. A turn that holds the line is free. A turn that drops a dependency or deletes an escape hatch is free and rewarded. So structural weight can only stay flat or fall, turn after turn. And it pointedly does not count ordinary healthy growth, more code, more files, a wider public surface, because that is just what building looks like. It counts only the kinds of weight that are almost always decay.

The memory

Two ratchets pulling in opposite directions. Trust in a class is earned slowly and lost instantly. Structural debt can only stay flat or shrink. Both are computed from the ledger, never asserted, so neither can be talked up by a good-sounding run.

08 · The control surface

How an agent actually drives it

Vader does not run the agent. The agent runs Vader. The engine is a plain command-line tool the agent calls out to, the knowledge is a skill it loads into context, and the parallelism is the agent's own built-in ability to spawn helpers. Four thin layers over one shared spine.

the skill
Knowledge. The loop, the phases, the gates, how to author the model, the owner and verifier prompts. The agent loads it into context. It is the manual, not code that executes.
the command
The driver. A slash command that scopes one turn: run the cycle once, from ground to persist, then pace the next one. It is the start button, and it points at the skill.
the cli
The spine. A zero dependency tool: init, gen, gate, recall, triage, persist, ratchet. All the judgment that must be reproducible lives here, once, so the answer is identical on every host.
the fan-out
The hands. The agent's own subagent primitive does the parallel build and verify, fed by the shared plan. A thin adapter binds that plan to each host, or runs it sequentially where there is none.

Here is one turn, concretely, on a host that can spawn helpers. The command loads the skill, then the agent calls recall and reads back a JSON packet. It runs the pure planner to get the seam, the siblings, and the voter counts. It spawns helpers to build the slices, merges them, and calls gate, the deterministic arbiter; a failed invariant id sends a slice straight back. On a green gate it spawns the judges, then calls triage and persist, which appends the ledger and closes the turn. Then it schedules the next one. The agent is the conductor in the middle. The command-line tool is what it consults whenever the answer must not depend on the model's mood.

invariant kinds, plus one escape hatch

human gates in the whole loop

commands in one zero dep CLI

pure planner shared by every host

09 · The honest part

Where it helps, where it does not

A tool sold as more than it is gets abandoned after the first green gate that shipped a bug. So here is the fair read, with the marketing removed.

Vader is strongest exactly where software is most structural, and weakest exactly where software is hardest. It genuinely helps on long-running autonomous work, where drift across many turns is the real risk and a few critical distinctions, money precision, a security boundary, a dependency direction, can be named and compiled. It helps most in typed languages, where the compiled checks have real teeth. And it rewards teams that will actually sit down and write a rich constitution. The value you get is directly proportional to that investment.

Everywhere else it is limited, and it is worth being blunt about how. The gate is only ever as good as the constitution, and writing a good constitution is the hard, expert, judgment-heavy work that Vader does not remove; it front-loads it. The distinctions that bite in practice are usually the ones nobody thought to name, and Vader does nothing for a distinction you never named. The judges are LLMs grading LLM work, and a majority vote among judges that all think alike is weaker than the number three suggests. The trust ratchet calibrates on a tiny sample and measures the defects it caught, not the ones that exist. Even the config lock has a soft edge: it fingerprints the flags in the repo's own config file, so a strict flag inherited from a shared base config sits outside the fence. And the whole dance costs real money: a rich plan with several three-judge classes is twenty or more agent calls a turn, expensive precisely when the constitution is rich enough to be worth it.

Hollow satisfaction. A behavioral invariant passed by a test that never exercises the real path. The gate goes green and the invariant is violated in production. The most dangerous mode, because it looks like success.
Over-trust in green. A green gate means the named invariants hold on the checked paths and the toolchain passed. It does not mean correct. The framing invites reading it as the latter.
Constitution rot. The lock that stops the agent from decaying the model also stops it from fixing a model that is wrong, until a human intervenes.
Weak author ceiling. A thin constitution still produces a gate that looks rigorous, which is worse than obviously having none.

The honest part

Two things ship under one name: a deterministic invariant gate, which is strong but narrow, and a style philosophy, which is broad but soft and lives in prompts like any other. Keep them separate, so the rigor of the first does not lend borrowed authority to the second.

10 · The verdict

A durability harness, not a miracle

The central idea is genuinely good, and it answers a real, specific failure of long-running agent loops: compile the distinctions you care about into a gate the agent cannot silence, and lock that gate against the agent. It does not make agents correct. It makes a named set of architectural decisions durable across many turns, which is a real but bounded win. The surrounding trust and voting machinery is reasonable but unproven on small samples, and it measures what got caught rather than what is true.

Sold honestly, as a durability harness for the invariants you can actually express, Vader is a sharp tool with a clear niche and clean engineering. Sold as a factory that ships correct software, it overpromises. That is the difference between a tool people still trust after a year and one they drop after the first green gate that shipped a bug. The honest pitch is the smaller one, and it is the one worth standing behind.

So start where the win is real. Pick the two or three distinctions whose collapse would genuinely hurt. Write them as invariants. Compile them. Lock them. Then let the agent run, and spend your attention on the boundary instead of the diff. That is the whole discipline, and it is enough.

It does not make the agent good. It makes a handful of decisions you cannot afford to lose impossible to lose quietly. On a long enough run, that is most of the battle.

VaderRule The Galaxy