
What I learned shipping agentic systems in production

18 April 2026 · 2 min read


Notes from eighteen months building agentic AI for a multi-brand operating business — what worked, what didn't, and where the wall actually is.

Most agentic AI write-ups are demos. This isn't. The systems below have been live, in production, inside an operating business with real customers and real revenue, for over a year.

What we shipped

The useful systems were not the cinematic ones. They were narrow, unglamorous workflows with a clear owner and a measurable hand-off.

A content production pipeline that turns a brief, source material, and product context into a first draft for review.
A pricing and margin workflow that spots anomalies across a multi-brand catalogue before they become expensive habits.
A sales-research agent that builds, enriches, deduplicates, and scores target lists before a human decides what to trust.
Internal copilots that answer operational questions from approved documents and system data, with clear source links and escalation paths.
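To make the pricing and margin item concrete, here is a minimal sketch of one way such a check could look. It is not the production workflow: the record fields (`sku`, `brand`, `price`, `cost`), the function name, and the 15-point tolerance are all assumptions chosen for illustration. It compares each item's margin with the median margin for its brand and flags the outliers for a human to review.

```python
from statistics import median

def flag_margin_anomalies(items, tolerance=0.15):
    """Return SKUs whose margin is far from the median margin of their brand.

    `items` is a list of dicts with hypothetical keys: sku, brand, price, cost.
    Assumes every price is greater than zero.
    """
    by_brand = {}
    for item in items:
        margin = (item["price"] - item["cost"]) / item["price"]
        by_brand.setdefault(item["brand"], []).append((item["sku"], margin))

    flagged = []
    for rows in by_brand.values():
        typical = median(m for _, m in rows)
        for sku, margin in rows:
            # Flag for review only; nothing here changes a price.
            if abs(margin - typical) > tolerance:
                flagged.append(sku)
    return flagged
```

The output is a review list, not an action. Deciding what to do about a flagged price stays with a person.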

[Figure: The systems that stuck were narrow workflows with a clear owner and a measurable hand-off.]

The pattern was consistent: let the model draft, classify, retrieve, compare, and explain. Keep humans in charge of approval, customer promises, price changes, and anything that touches money or reputation.
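That split can be expressed as a simple gate. The sketch below is illustrative only: the action names, the `ProposedAction` type, and the `dispatch` function are hypothetical, and a real system would persist pending items rather than return a string.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical action kinds; the split mirrors the pattern described above.
MODEL_ALLOWED = {"draft", "classify", "retrieve", "compare", "explain"}
HUMAN_ONLY = {"approve", "customer_promise", "price_change"}

@dataclass
class ProposedAction:
    kind: str
    payload: str

def dispatch(action: ProposedAction, approved_by: Optional[str] = None) -> str:
    """Run model-safe actions directly; hold everything else for a named approver."""
    if action.kind in MODEL_ALLOWED:
        return "executed"
    if action.kind in HUMAN_ONLY:
        return "executed" if approved_by else "pending_approval"
    # Unknown kinds are rejected rather than silently allowed.
    raise ValueError(f"unknown action kind: {action.kind}")
```

Rejecting unknown action kinds by default means a new capability has to be placed on one side of the line deliberately.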

What worked

The systems that stuck had small surfaces. One team, one workflow, one definition of success. They did not ask everyone in the business to change how they worked on day one.

[Figure: Sources, scored queues and confidence ratings moved adoption more than model choice.]

The second thing that worked was treating retrieval and tools as product features, not plumbing. If the model could not show where an answer came from, or which system it had checked, the user did not trust it. The best interface was often not a chat box. Sometimes it was a scored queue, a draft response, a suggested price review, or a "check this before it goes out" panel.
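One way to make "show where the answer came from" a product rule rather than a hope is to refuse to display an answer that carries no sources. This is a sketch under assumptions: the `Source` and `Answer` types and the `render` function are hypothetical names, not part of any specific library.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Source:
    title: str
    system: str  # the document store or system that was checked

@dataclass(frozen=True)
class Answer:
    text: str
    sources: Tuple[Source, ...] = ()

def render(answer: Answer) -> str:
    """Show the answer with its sources, or escalate if there are none."""
    if not answer.sources:
        return "No supporting source found. Escalating to a person."
    cited = "; ".join(f"{s.title} ({s.system})" for s in answer.sources)
    return f"{answer.text}\nSources: {cited}"
```

The same structure works behind a draft panel or a scored queue; the chat box is only one possible surface for it.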

The third thing was confidence scoring. A busy team does not need AI to sound clever. It needs to know which items can be skimmed, which need careful review, and which should be routed to a person immediately. That changed adoption more than model choice.
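The three-way split described above (skim, review carefully, route to a person) maps naturally onto thresholds over a score. A minimal sketch, assuming a confidence value between 0 and 1; the cut-offs of 0.9 and 0.6 are placeholders to tune per workflow, not values from our systems.

```python
def route(confidence: float, skim_at: float = 0.9, review_at: float = 0.6) -> str:
    """Map a 0..1 confidence score to one of three review lanes."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= skim_at:
        return "skim"
    if confidence >= review_at:
        return "careful_review"
    return "route_to_person"
```

The thresholds matter less than the fact that every item arrives with a lane attached, so the team knows where to spend attention.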

What didn't

The broad assistant idea did not work well. "Ask anything about the business" sounds attractive, but in practice it creates fuzzy expectations and too many unsafe edge cases. The more useful move was to pick one painful workflow and make the model excellent inside that fence.

[Figure: Agentic AI exposes the state of the business underneath it.]

Autonomy also hit a ceiling quickly. For customer-facing and commercial work, the right posture was draft, recommend, and explain. Fully autonomous action looked efficient in a demo and fragile in production. The first time a system sends the wrong answer to a customer, the time saved across the previous week stops mattering.

Data quality was the other constraint. The model could handle messy language. It could not magically fix missing ownership, stale product records, undocumented exceptions, or three teams using the same field differently. Agentic AI exposes the state of the business underneath it.
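Some of these gaps can at least be surfaced before a record reaches the model. The sketch below checks two of them, missing ownership and stale records. The field names (`owner`, `updated_on`) and the 180-day cut-off are assumptions for illustration; undocumented exceptions and inconsistent field usage need people, not a script.

```python
from datetime import date, timedelta

def record_issues(record: dict, today: date, max_age_days: int = 180) -> list:
    """List data-quality problems for one record (hypothetical field names)."""
    issues = []
    if not record.get("owner"):
        issues.append("missing owner")
    updated = record.get("updated_on")
    # Treat a record with no update date as stale rather than trusting it.
    if updated is None or today - updated > timedelta(days=max_age_days):
        issues.append("stale or undated record")
    return issues
```

A check like this does not fix anything. It makes the state of the data visible so someone can own it.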

The wall

The hard part isn't the agent. The hard part is the data, the observability, and the change management around the agent. If you don't have those three, the agent is a demo, not a system.

[Figure: The wall isn't the agent; it's the data, observability and change management around it.]

More on each of those — and how to build them without a six-month detour — in upcoming posts.
