case study · Industrial parts distributor

An agentic AI for customer support, grounded in 30 years of correspondence

30 yrs of correspondence indexed

September 2025

ai-consulting · fractional-cio

Flat illustration of a small robot agent inside a hexagonal frame, with wrench, magnifying glass, and envelope tool icons orbiting it. A database, document folder, and chat-bubble flow in from the left via arrows; an analytics dashboard at the bottom feeds back into the hexagon. Warm tan background.

A production agentic system for an industrial parts distributor that drafts replies to customer enquiries — pulling from three decades of historical correspondence and live ERP stock and pricing data.

The headline

A long-established industrial parts distributor was fielding hundreds of customer enquiries a week: "do you still have this part for this machine, what's the price, when can you ship it?" Every reply required digging through three decades of historical correspondence, checking the ERP for stock and price, and drafting a quote. I built an agentic system that does all of that automatically, then drafts a reply for a human to review and send.

Context

Hundreds of customer enquiries arriving each week, the majority of them legitimate "is this part still available" requests
Long-tail product catalogue — many parts are 20+ years old, original documentation patchy
Knowledge concentrated in long-tenure staff; succession risk if they leave
ERP holds current stock and pricing but offers no easy way to combine them with historical context
Reply time was the bottleneck on conversion — slow quotes lost deals to faster competitors

The opportunity

The historical correspondence wasn't a problem — it was an asset that wasn't being used. Decades of "yes we have it, here's the price, this fits these machines" sitting in email archives. The question was whether a model could read it, understand it, and combine it with live ERP data to draft a coherent reply.

Approach

1. Data foundation

Ingested and embedded the historical correspondence corpus — tens of thousands of emails spanning three decades
Built ERP connectors for live stock, pricing, lead-time, and customer history
Designed the retrieval layer: hybrid semantic plus keyword, with recency weighting so newer answers outrank decade-old ones when they conflict
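
A minimal sketch of how hybrid scoring with recency weighting could work. It is illustrative only, not the production retrieval code: the `Hit` fields, the 60/40 semantic/keyword blend, the five-year half-life, and the 30% recency share are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Hit:
    doc_id: str
    semantic: float  # similarity from the embedding index, assumed in [0, 1]
    keyword: float   # keyword match score, assumed normalised to [0, 1]
    sent_on: date    # date the email was sent


def recency_weight(sent_on: date, today: date, half_life_years: float = 5.0) -> float:
    """Exponential decay: an email one half-life old counts half as much."""
    age_years = max((today - sent_on).days, 0) / 365.25
    return 0.5 ** (age_years / half_life_years)


def hybrid_score(hit: Hit, today: date, alpha: float = 0.6,
                 recency_share: float = 0.3) -> float:
    """Blend semantic and keyword relevance, then damp (not zero) old emails."""
    relevance = alpha * hit.semantic + (1 - alpha) * hit.keyword
    damping = (1 - recency_share) + recency_share * recency_weight(hit.sent_on, today)
    return relevance * damping


def rank(hits: list, today: date) -> list:
    """Highest score first."""
    return sorted(hits, key=lambda h: hybrid_score(h, today), reverse=True)


today = date(2025, 9, 1)
hits = [
    Hit("quote-2005", semantic=0.8, keyword=0.8, sent_on=date(2005, 9, 1)),
    Hit("quote-2024", semantic=0.8, keyword=0.8, sent_on=date(2024, 9, 1)),
    Hit("unrelated-2025", semantic=0.1, keyword=0.0, sent_on=date(2025, 8, 1)),
]
ranked = [h.doc_id for h in rank(hits, today)]
```

Because recency only scales part of the score, a newer answer wins when two emails are equally relevant, but an old email that is clearly on topic still beats a recent one that is not.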

2. Agent design

Multi-step agent: parse enquiry → retrieve historical context → check ERP → draft reply
Tool use: each step is a discrete tool the agent can call, observable in logs
Guardrails: the agent never sends — it always drafts for human review
Confidence scoring on each draft so reviewers can triage which to read carefully and which to send through
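
The shape of that pipeline can be sketched roughly as follows. This is a simplified illustration rather than the production agent: the steps run in a fixed order here, and the tool functions (`parse`, `retrieve`, `check_erp`, `compose`), the part number, and the confidence values are hypothetical stand-ins.

```python
import logging
from dataclasses import dataclass, field
from typing import Callable, List

logger = logging.getLogger("agent")


@dataclass
class Draft:
    body: str
    confidence: float
    sources: List[str] = field(default_factory=list)
    status: str = "draft"  # this code has no send path; sending is a human action


def run_tool(name: str, tool: Callable, *args):
    """Call one tool and log the call and its result so each step is observable."""
    logger.info("tool_call name=%s args=%r", name, args)
    result = tool(*args)
    logger.info("tool_result name=%s result=%r", name, result)
    return result


def handle_enquiry(text: str, parse: Callable, retrieve: Callable,
                   check_erp: Callable, compose: Callable) -> Draft:
    """parse enquiry -> retrieve historical context -> check ERP -> draft reply."""
    parsed = run_tool("parse_enquiry", parse, text)
    history = run_tool("retrieve_history", retrieve, parsed)
    erp = run_tool("check_erp", check_erp, parsed)
    return run_tool("draft_reply", compose, parsed, history, erp)


# Hypothetical stub tools, standing in for the model, the index and the ERP.
def parse(text):
    return {"part_no": "AB-123"}


def retrieve(parsed):
    return ["email-2019-0412"]


def check_erp(parsed):
    return {"in_stock": 4, "unit_price": 18.50}


def compose(parsed, history, erp):
    body = (f"Part {parsed['part_no']}: {erp['in_stock']} in stock "
            f"at {erp['unit_price']:.2f} each.")
    # Assumed heuristic: lower confidence when no historical context was found.
    return Draft(body=body, confidence=0.9 if history else 0.4, sources=history)


draft = handle_enquiry("Do you still stock AB-123?", parse, retrieve, check_erp, compose)
```

The guardrail is structural in this sketch: the only output type is a `Draft`, so no code path can send.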

3. Integration

Drafts land in the existing inbox workflow as a regular draft, ready to send
Reviewer can edit before sending, with edits fed back as a training signal
Audit trail on every draft showing which sources the agent drew from — so a reviewer can verify the reasoning, not just the output
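
One way to hold the audit trail and the reviewer's edits in a single record is sketched below. The `AuditRecord` fields, the source id format, and the JSON Lines output are assumptions made for illustration, not the system's actual schema.

```python
import json
from dataclasses import asdict, dataclass, replace
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class AuditRecord:
    draft_id: str
    sources: List[str]   # ids of the emails and ERP records the draft drew from
    draft_body: str
    created_at: str
    sent_body: Optional[str] = None  # filled in once a reviewer has sent the reply


def record_draft(draft_id: str, sources: List[str], draft_body: str) -> AuditRecord:
    now = datetime.now(timezone.utc).isoformat()
    return AuditRecord(draft_id, list(sources), draft_body, now)


def record_review(record: AuditRecord, sent_body: str) -> AuditRecord:
    """Return a new record with the reviewer's final text; the original is unchanged."""
    return replace(record, sent_body=sent_body)


def training_pair(record: AuditRecord) -> Optional[dict]:
    """An edited draft yields a (draft, final) pair; unreviewed or unedited ones do not."""
    if record.sent_body is None or record.sent_body == record.draft_body:
        return None
    return {"draft": record.draft_body, "final": record.sent_body,
            "sources": record.sources}


def to_jsonl(record: AuditRecord) -> str:
    """One JSON object per line, suitable for an append-only log file."""
    return json.dumps(asdict(record), sort_keys=True)


rec = record_draft("d-1", ["email-2019-0412", "erp:item:AB-123"], "We have 4 in stock.")
reviewed = record_review(rec, "We have 4 in stock, shipping Monday.")
```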

4. Rollout

Piloted inside a single business unit before expanding across the group
Light-touch training for the customer service team — what the agent is good at, what to watch for, how to flag a bad draft
The feedback loop matters more than the launch: reviewers' edits become the dataset that makes the next month's drafts better

Outcome

The bulk of routine enquiries now arrive in the inbox with a draft already attached, materially faster than the previous manual process
100% of replies still reviewed by a human before send — by design, not by accident
Long-tail product knowledge that previously lived in two people's heads is now retrievable by anyone on the team
Most drafts go out lightly edited; a minority are heavily edited; a small minority are rejected outright and rewritten from scratch. That distribution is the signal we tune against.
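
As an illustration of how that distribution could be measured, the sketch below compares each draft with the text that was sent, using word-level similarity from Python's standard `difflib`. The bucket names and the 0.85 and 0.4 thresholds are assumptions for the example, not the values used in production.

```python
from collections import Counter
from difflib import SequenceMatcher


def classify_edit(draft: str, sent: str, light: float = 0.85,
                  heavy: float = 0.4) -> str:
    """Bucket a reviewed draft by how much of its wording survived into the sent reply."""
    ratio = SequenceMatcher(None, draft.split(), sent.split()).ratio()
    if ratio >= light:
        return "light"
    if ratio >= heavy:
        return "heavy"
    return "rejected"


def edit_distribution(pairs) -> Counter:
    """Count buckets over an iterable of (draft, sent) pairs."""
    return Counter(classify_edit(draft, sent) for draft, sent in pairs)


pairs = [
    ("a b c d e f g h i j", "a b c d e f g h i j"),  # sent unchanged
    ("a b c d e f g h i j", "a b c d e f x x x x"),  # tail rewritten
    ("a a a a", "z z z z"),                          # rewritten from scratch
]
distribution = edit_distribution(pairs)
```

Tracking these counts week by week gives a simple trend line: if the "heavy" and "rejected" shares shrink over time, the feedback loop is working.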

What we learned

"Draft, don't send" is the right ceiling for this class of work in this kind of business. Autonomy without review would have eroded customer trust the first time the model got it wrong, and it would have got it wrong.
The historical data was messier than it looked from the outside. Half the work was deciding which kinds of emails were worth indexing at all — quote correspondence yes, internal banter no.
Confidence scoring is the most useful single thing we added. It lets a busy reviewer trust the green ones and read the yellow ones carefully, which is the only way the system pays back on a real day's workload.
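
The triage idea can be sketched in a few lines. The 0.8 cut-off and the function names are assumptions for illustration; in practice the threshold would be tuned against how often reviewers actually edit drafts in each band.

```python
def triage(confidence: float, green_at: float = 0.8) -> str:
    """Map a confidence score in [0, 1] to the colour a reviewer sees."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return "green" if confidence >= green_at else "yellow"


def review_queue(drafts):
    """Order (draft_id, confidence) pairs so the least certain drafts come first."""
    ordered = sorted(drafts, key=lambda item: item[1])
    return [(draft_id, triage(conf)) for draft_id, conf in ordered]


queue = review_queue([("a", 0.95), ("b", 0.5), ("c", 0.8)])
```

Every draft is still reviewed; the colour only tells the reviewer where to spend the most attention.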

Stack

Model: Anthropic Claude as the primary reasoning model, with model selection tuned by enquiry type
Retrieval: Hybrid semantic plus keyword search, recency-weighted, over the historical correspondence corpus
Orchestration: Custom Python orchestration, with every tool call observable in the audit log
ERP integration: Microsoft Dynamics 365 Business Central via the native REST API
Hosting: Self-hosted on the group's existing cloud account; model calls routed through the provider's enterprise tier
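
For the ERP side, a stock-and-price lookup might be built along these lines. This is a sketch under assumptions: it uses the URL shape and `items` fields of the standard Business Central API v2.0 as I understand them, the tenant, environment, and company values are placeholders, and authentication and the HTTP call itself are left out. The production connector may well use different endpoints or custom API pages.

```python
from urllib.parse import quote

BASE = "https://api.businesscentral.dynamics.com/v2.0"


def item_lookup_url(tenant_id: str, environment: str, company_id: str,
                    part_no: str) -> str:
    """Build an OData query for one item by its number (no request is made here)."""
    escaped = part_no.replace("'", "''")  # OData escapes a quote by doubling it
    odata_filter = quote(f"number eq '{escaped}'")
    return (
        f"{BASE}/{tenant_id}/{environment}/api/v2.0/companies({company_id})/items"
        f"?$filter={odata_filter}&$select=number,displayName,unitPrice,inventory"
    )


# Placeholder identifiers, for illustration only.
url = item_lookup_url("example-tenant", "production", "example-company-id", "AB-123")
```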

This is exactly the kind of work the AI consulting practice exists to deliver: production agentic systems that integrate with an existing ERP rather than run in parallel to it. The Business Central integration draws on the same depth of experience used in the multi-brand ERP consolidation.
