Methodology note
How DW runs and passes are counted
DW is an autonomous, spec-driven development pipeline I designed and built. This note defines its run and pass counts so the numbers mean something.
What counts as a run
A run is one specification submitted to the pipeline and executed end to end to a terminal state — planning, change, and verification — without a human taking over mid-way. Each run is logged, which is why the count is exact rather than estimated. Aborted or half-completed attempts are logged too; they simply don’t count as passes.
What counts as a pass
A pass is a run whose output satisfies the spec’s acceptance checks — it builds, the tests and evals it was asked to meet are green, and the result matches what the spec described — with no human edits to the produced change. A run that needs a person to finish or fix it is recorded as a non-pass, even if the final code eventually shipped.
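As a minimal sketch of the two definitions above, the Python below classifies logged runs and computes a pass rate. The record fields and function names are hypothetical, not DW's actual log schema. It also assumes every logged attempt, including aborted ones, counts in the denominator, which this note does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRecord:
    """One logged attempt. Field names are hypothetical, not DW's real schema."""
    spec_id: str
    reached_terminal_state: bool  # False for aborted or half-completed attempts
    builds: bool                  # the produced change builds
    checks_green: bool            # the tests and evals the spec asked for are green
    matches_spec: bool            # the result matches what the spec described
    human_edited: bool            # a person finished or fixed the produced change


def is_pass(run: RunRecord) -> bool:
    """A pass meets every acceptance check and carries no human edits."""
    return (
        run.reached_terminal_state
        and run.builds
        and run.checks_green
        and run.matches_spec
        and not run.human_edited
    )


def pass_rate(runs: list[RunRecord]) -> float:
    """Passes divided by all logged attempts (assumed denominator); 0.0 if empty."""
    if not runs:
        return 0.0
    return sum(is_pass(r) for r in runs) / len(runs)
```

Under these assumptions, a log holding one clean run, one human-rescued run, one aborted attempt, and one run with a red check gives a pass rate of 0.25: only the clean run counts, and the rescued run stays a non-pass even though its code may have shipped.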
Why the honesty matters
A pass rate is only worth quoting if a “pass” can’t quietly include work a human rescued. Counting human-touched runs as non-passes is the same discipline I bring to client agents: a green has to be earned and has to point at its evidence, or it isn’t a green. This figure is measured on my own system, not a client’s.