Retrieval, guardrails, and human-in-the-loop patterns — plus three architectures I've shipped that stay grounded even when the model is unsure.
Hallucination is the wrong framing. The real question is: what does your agent do when it's uncertain? A model that is confidently wrong 5% of the time isn't a bug to patch — it's a design problem to engineer around.
Well-built production AI agents fail gracefully. They cite sources, expose confidence, and escalate to humans at the right threshold, as a core pattern rather than an afterthought. Here are the three layers that make that possible, and three architectures I've shipped using them.
Layer 1: retrieval, so the model isn't guessing
Most "hallucination" is the model answering from memory when it should be answering from your data. Retrieval-augmented generation (RAG) fixes the root cause: pull the relevant facts first, then ask the model to answer only from them. The quality of an AI feature is usually decided by retrieval quality, not by the model, which makes retrieval the single highest-leverage decision an AI architect makes.
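To make "pull the relevant facts first, then ask the model to answer only from them" concrete, here is a minimal sketch in Python. It assumes a toy in-memory corpus and a keyword-overlap score standing in for real embedding search. The names `retrieve`, `build_prompt`, and `DOCS` are hypothetical, and the actual model call is left out.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query: str, passage: str) -> int:
    """Count how often the query's distinct tokens occur in the passage.

    A toy stand-in for embedding similarity (assumption, not a real ranker).
    """
    query_tokens = set(tokenize(query))
    passage_counts = Counter(tokenize(passage))
    return sum(passage_counts[token] for token in query_tokens)


def retrieve(query: str, passages: list[str], k: int = 2) -> list[str]:
    """Return up to k passages with a non-zero score, best first."""
    scored = [(score(query, p), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for s, p in scored[:k] if s > 0]


def build_prompt(query: str, context: list[str]) -> str:
    """Build a prompt that restricts the model to the retrieved context."""
    numbered = "\n".join(f"[{i}] {p}" for i, p in enumerate(context, start=1))
    return (
        "Answer using only the numbered context below. "
        "If the context does not contain the answer, say \"I don't know\".\n\n"
        f"Context:\n{numbered}\n\nQuestion: {query}"
    )


# Hypothetical corpus, for illustration only.
DOCS = [
    "Refunds are available within 30 days of purchase.",
    "Support is open Monday to Friday, 9am to 5pm.",
    "Enterprise plans include single sign-on.",
]

if __name__ == "__main__":
    question = "When are refunds available?"
    print(build_prompt(question, retrieve(question, DOCS, k=1)))
```

If `retrieve` returns an empty list, the caller knows before any model call that there is nothing to ground an answer in, which is a natural place to refuse.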
Layer 2: guardrails, so wrong answers don't escape
Retrieval reduces errors; it doesn't eliminate them. Guardrails catch what slips through: output validation, "answer only from context" instructions, citation requirements, and a refusal path when confidence is low. A system that can say "I don't know" is more trustworthy than one that always answers.
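A guardrail of this kind can be sketched as a small validation step between the model and the user. The sketch assumes the model was asked to cite sources as `[1]`, `[2]`, and so on, and that some confidence score between 0 and 1 is available. `Draft`, `apply_guardrails`, and the 0.7 threshold are hypothetical choices, not a standard API.

```python
import re
from dataclasses import dataclass

REFUSAL = "I don't know based on the available sources."


@dataclass
class Draft:
    text: str          # raw model output, expected to cite sources like [1]
    confidence: float  # assumed score in [0.0, 1.0] supplied by the caller


def cited_ids(text: str) -> set[int]:
    """Extract the numeric source ids cited in the text, e.g. [2] -> 2."""
    return {int(n) for n in re.findall(r"\[(\d+)\]", text)}


def apply_guardrails(draft: Draft, num_sources: int,
                     min_confidence: float = 0.7) -> str:
    """Return the draft only if it passes every check, else a refusal."""
    # Refusal path: low confidence never reaches the user.
    if draft.confidence < min_confidence:
        return REFUSAL
    ids = cited_ids(draft.text)
    # Citation requirement: an uncited answer is treated as ungrounded.
    if not ids:
        return REFUSAL
    # Output validation: every cited id must point at a real source.
    if any(i < 1 or i > num_sources for i in ids):
        return REFUSAL
    return draft.text


if __name__ == "__main__":
    print(apply_guardrails(Draft("Refunds last 30 days [1].", 0.9), num_sources=2))
    print(apply_guardrails(Draft("Refunds last 30 days.", 0.9), num_sources=2))
```

Each check fails closed: when in doubt, the user sees the refusal rather than an unverified answer.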
Layer 3: human-in-the-loop, so stakes match oversight
The higher the stakes, the more a human belongs in the loop. The design decision is the threshold: when does the agent act alone, and when does it ask? Get that line right and you get speed where it's safe and caution where it counts.
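One way to express that threshold in code is a routing function that demands more confidence as the stakes rise. The stake levels and the numbers in `REQUIRED_CONFIDENCE` are illustrative assumptions; in practice they would be tuned per product and per action.

```python
from enum import Enum


class Stakes(Enum):
    LOW = "low"        # e.g. drafting an internal summary
    MEDIUM = "medium"  # e.g. replying to a customer
    HIGH = "high"      # e.g. anything legal or financial


# Hypothetical thresholds: higher stakes demand more confidence to act alone.
REQUIRED_CONFIDENCE = {
    Stakes.LOW: 0.6,
    Stakes.MEDIUM: 0.8,
    Stakes.HIGH: 0.95,
}


def route(confidence: float, stakes: Stakes) -> str:
    """Return "act" if the agent may proceed alone, otherwise "escalate"."""
    if confidence >= REQUIRED_CONFIDENCE[stakes]:
        return "act"
    return "escalate"


if __name__ == "__main__":
    # The same confidence leads to different decisions at different stakes.
    print(route(0.85, Stakes.LOW))
    print(route(0.85, Stakes.HIGH))
```

Keeping the thresholds in one table makes the line between acting and asking explicit and easy to review, instead of scattering it across prompts.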
Architecture 1: citation-grounded RAG (legal research)
Every answer links to its source passages. If the system can't cite, it doesn't answer. Lawyers trust it because they verify in one click, and the citations make wrong answers obvious instead of dangerous.
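The "can't cite, doesn't answer" rule could be enforced roughly as follows. This is a sketch under assumptions, not the shipped system: it supposes the model returns each claim with a verbatim supporting quote, and it checks that the quote really appears in a retrieved passage. `Passage`, `Claim`, `ground`, and the example.com URLs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Passage:
    source_url: str  # where a reader can verify the text in one click
    text: str


@dataclass(frozen=True)
class Claim:
    statement: str  # what the model asserts
    quote: str      # verbatim span the model says supports it


@dataclass(frozen=True)
class CitedClaim:
    statement: str
    quote: str
    source_url: str


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so matching ignores formatting."""
    return " ".join(text.lower().split())


def find_source(quote: str, passages: list[Passage]) -> Optional[Passage]:
    """Return the first passage that contains the quote, if any."""
    needle = normalize(quote)
    if not needle:
        return None
    for passage in passages:
        if needle in normalize(passage.text):
            return passage
    return None


def ground(claims: list[Claim],
           passages: list[Passage]) -> Optional[list[CitedClaim]]:
    """Attach a source link to every claim, or return None (no answer)."""
    cited = []
    for claim in claims:
        source = find_source(claim.quote, passages)
        if source is None:
            return None  # one uncitable claim blocks the whole answer
        cited.append(CitedClaim(claim.statement, claim.quote, source.source_url))
    return cited or None  # an empty answer also counts as "can't cite"


# Hypothetical retrieved passages, for illustration only.
PASSAGES = [
    Passage("https://example.com/case-a",
            "The court held that the clause was unenforceable."),
    Passage("https://example.com/case-b",
            "Notice must be given within ten days."),
]

if __name__ == "__main__":
    claims = [Claim("The clause cannot be enforced.",
                    "the clause was unenforceable")]
    print(ground(claims, PASSAGES))
```

Requiring a verbatim quote is a deliberately strict choice here: it rejects some valid paraphrases, but every link the reader clicks lands on text that actually contains the cited words.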

