principles
operating rules · drawn from the work

The operating rules I actually work by, drawn from the last two years of building systems that had to survive contact with real users. Not a manifesto. I revise these when I’m wrong.

01

Building

Start ugly. Get it working first.

The simplest version that actually runs beats the elegant version that's still in my head. I ship a working ugly thing, then make it beautiful once I know what beautiful should look like. Most of the time, what I thought was the final design stops making sense once real data hits it.

Know the inputs. Know the outputs. The middle can be a black box.

Before I write anything, I pin down exactly what goes in and exactly what comes out. If those are defined, the middle can be figured out as I go, and sometimes the middle gets replaced entirely without anyone noticing. If they're not defined, the middle ends up being what I argue about with the client three weeks later.
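A minimal sketch of that contract in Python, with hypothetical names: the input and output shapes are pinned down first, and the middle is just a function with a known signature that can be swapped without callers noticing.

```python
from dataclasses import dataclass
from typing import Callable

# Pinned down before any implementation work: what goes in, what comes out.
@dataclass(frozen=True)
class LeadInput:
    email: str
    message: str

@dataclass(frozen=True)
class LeadScore:
    score: int     # 0-100
    route_to: str  # queue name

# The middle is a black box with a fixed signature; v1 can be replaced
# by v2 later and nothing downstream has to change.
ScoreFn = Callable[[LeadInput], LeadScore]

def score_v1(lead: LeadInput) -> LeadScore:
    # Ugly-but-working first pass: keyword heuristics.
    hot = any(w in lead.message.lower() for w in ("budget", "timeline", "demo"))
    return LeadScore(score=80 if hot else 20,
                     route_to="sales" if hot else "nurture")

def handle(lead: LeadInput, score: ScoreFn = score_v1) -> LeadScore:
    return score(lead)
```

The argument between me and the client, if it happens, is about `LeadInput` and `LeadScore`, never about `score_v1`.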

Don't over-plan. Start.

I've never once executed a plan exactly the way I wrote it. The plan is useful for setting direction, not for tracking progress. Strategic planning beats exhaustive planning. The quicker I get something real in front of me, the quicker I find out what I was wrong about.

02

Automations

Break workflows into pieces. Webhooks between them.

Monolithic workflows are where debugging goes to die. I split every automation into discrete functional units connected by webhooks. Each piece does one thing, each piece can be tested in isolation, and when something breaks at 2am I know exactly which piece to look at. It takes longer to design. It saves weeks in maintenance.
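The shape of that, sketched in Python with made-up unit names. Each unit takes a JSON payload and emits one, so in production each can sit behind its own webhook URL; here the hop is simulated by serializing and deserializing between units.

```python
import json

def enrich(payload: dict) -> dict:
    # Unit 1: one job only. Testable on its own.
    payload["domain"] = payload["email"].split("@")[1]
    return payload

def route(payload: dict) -> dict:
    # Unit 2: one job only. Testable on its own.
    payload["queue"] = "priority" if payload["domain"] == "bigcorp.com" else "standard"
    return payload

def deliver(unit, payload: dict) -> dict:
    """Stand-in for a webhook hop: serialize, send, deserialize."""
    body = json.dumps(payload)     # what would go over the wire
    return unit(json.loads(body))  # what the next unit receives

def pipeline(payload: dict) -> dict:
    for unit in (enrich, route):   # one webhook hop per unit
        payload = deliver(unit, payload)
    return payload
```

When the 2am alert fires, the failing payload points at exactly one unit.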

Define scope properly, upfront.

The number one way automation projects go bad is scope creep, and the number one way scope creep happens is ambiguity in the first conversation. I over-specify deliverables on purpose. It feels slow in week one. It saves the whole project by week six.

03

AI and agents

LLMs are non-deterministic. Use them only when you need to.

The more LLMs in a system, the less control I have over its behavior. Every LLM call is a gamble on output quality, latency, and cost. I default to deterministic logic and reach for an LLM only when the task genuinely requires reasoning that can't be codified. A regex beats a prompt whenever a regex is possible.
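What deterministic-first looks like in practice, as a sketch with hypothetical patterns and a stubbed model call. Anything a regex can classify never touches the LLM; only the genuinely ambiguous messages pay the tax in latency, cost, and variance.

```python
import re

# Deterministic first: patterns handle the predictable cases.
ORDER_ID = re.compile(r"\b(ord|order)[-#\s]*(\d{6,})\b", re.IGNORECASE)
REFUND = re.compile(r"\b(refund|money back)\b", re.IGNORECASE)

def classify(message: str) -> str:
    if ORDER_ID.search(message):
        return "order_lookup"
    if REFUND.search(message):
        return "refund"
    # Only the leftovers reach a model.
    return call_llm_classifier(message)

def call_llm_classifier(message: str) -> str:
    # Stub standing in for a real (non-deterministic, billable) model call.
    return "needs_human"
```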

Voice agents live or die in the first three weeks after deployment.

Real user behavior never matches test scripts. People interrupt. They mumble. They get angry. They ask things nobody wrote a handler for. The first three weeks of production are where the agent actually gets good, and only if someone is watching call transcripts daily, catching failure modes, and retuning. A voice agent deployed and walked away from is a voice agent that will embarrass someone.

Cost is a first-class constraint.

Every LLM call is a dollar figure. A system that works but costs $40 per active user per month is not a system that works. I budget for inference the same way I budget for infrastructure, and I design around cost from day one rather than discovering it at the pricing conversation.
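The budgeting itself is just arithmetic, which is exactly why it should happen on day one. A sketch with made-up pricing and traffic numbers:

```python
# Back-of-envelope inference budget. Rates and traffic are assumptions,
# not real model pricing.
PRICE_PER_1K_INPUT = 0.003   # USD per 1K input tokens
PRICE_PER_1K_OUTPUT = 0.015  # USD per 1K output tokens

def monthly_cost_per_user(calls_per_day: int, in_tokens: int, out_tokens: int) -> float:
    per_call = (in_tokens / 1000) * PRICE_PER_1K_INPUT \
             + (out_tokens / 1000) * PRICE_PER_1K_OUTPUT
    return round(per_call * calls_per_day * 30, 2)
```

At a hypothetical 20 calls a day with 2,000 tokens in and 500 out, that's about $8 per user per month, and the pricing conversation happens before the build, not after.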

04

Shipping

Micro-commits. Descriptive messages.

When something breaks in production, the only thing between me and a bad night is the git log. Tiny commits, honest messages, each one reviewable on its own. "fix stuff" is a message I will regret. "fix: retry logic on webhook timeout returning undefined" is the message that saves me an hour at 3am. Every commit is written for the person debugging it six months from now, which is usually me.
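One way to make the discipline stick, sketched as a hypothetical commit-message check (the prefix convention and banned list are my assumptions, not a standard I claim to follow):

```python
import re

# Require a type prefix and enough words to mean something at 3am.
PATTERN = re.compile(r"^(feat|fix|chore|refactor|docs|test)(\([\w-]+\))?: .{10,}")
BANNED = {"fix stuff", "wip", "updates", "misc"}

def acceptable(message: str) -> bool:
    if message.strip().lower() in BANNED:
        return False
    return bool(PATTERN.match(message))
```

Wired into a pre-commit hook, this rejects the messages future-me will curse at before they reach the log.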

Production is the only real test.

Staging environments lie. Test data lies. My laptop lies. The only environment that tells me the truth about whether something works is production, under real load, with real users doing real things I didn't anticipate. This is why I ship small, ship often, and monitor constantly. Everything else is rehearsal.

Living document. Revised when I’m wrong. Last updated April 2026.