
Technical perspectives on AI systems

Writing about agent orchestration, trust layers, cost engineering, and lessons from building production AI systems.

KERNEL_PULSE

$ Deep-dives on autonomous AI agents, infrastructure, and cost engineering. No fluff, just systems.

June 23, 2026 · 11 min read · architecture

The Arc Since the Six Were Caught

After fixing six agent failure modes, the system graduated from outcome checking to path checking. A field report on trajectory evals and generalization.

trajectory evaluation, false green agent failures, autonomous agent reliability, agent eval depth

June 17, 2026 · 10 min read · architecture

The Calibration Ledger: 58 Runs, 93% Pass, n Is Small

58 runs, ~93% autonomous pass, and why that number is honest evidence — not a reliability proof. A field report on agent evals at solo scale.

agent calibration ledger, false green rate, autonomous agent evals, fail-closed gate

June 12, 2026 · 12 min read · architecture

False Greens: Three Structural Observations

How fail-closed design, headless execution seams, and converting incidents to fixtures stop AI agents lying about success. A field report from a solo dev-loop.

false green agent eval, fail-closed gate, headless agent execution, agent trust policy

June 8, 2026 · 10 min read · architecture

One-Sided Contracts Break Agent Pipelines

A strict contract enforced on only one side produces false rejections as reliably as a loose one produces false acceptances. A field report from my fleet.

agent contract drift, false green rejection, fail-closed gate, agent output schema

June 3, 2026 · 9 min read · architecture

QA Is Only as Honest as Its Coverage

A PR merged over a red CI job while dev-loop QA said green. Here's the root cause, the fix, and why partial greens are the most dangerous lies in agent systems.

false green CI, fail-closed merge gate, agent eval coverage, contract regression testing

May 30, 2026 · 10 min read · architecture

Permissive Parsing Is a False-Green Factory

How a silent None from an AI agent registered as a pass — and the fail-closed contract that fixed it. A field report from a production agent fleet.

permissive parsing, false green, fail-closed gate, agent evals

May 26, 2026 · 16 min read · architecture

Six False-Greens in a Self-Auditing Agent Pipeline

Six ways my autonomous dev-loop reported success when nothing had landed, and the fail-closed gates that caught each one before it corrupted the ledger.

false green detection, fail-closed gate, agent evals, autonomous dev loop