AI & AGENTS.
Engineering notes on AI agents, automation and the systems behind them — what works in production, what breaks, and what we would build differently.

01
Shipping AI agents to production: what actually breaks
Demos are easy; production is where agents meet ambiguity, rate limits and angry edge cases. A field checklist from the last year of deployments.
JUN 2, 2026 · 1 MIN READ

02
Evals before vibes: measuring agent quality
You cannot improve what you do not measure, and "it feels smarter" is not a metric. How we build eval suites that catch regressions before users do.
MAY 18, 2026 · 1 MIN READ

03
Choosing an LLM stack: OpenAI, Claude or self-host
Provider choice is an engineering decision, not a brand preference. The trade-offs that actually matter when picking a model stack.
MAR 24, 2026 · 1 MIN READ