Pete Hanlon, CTO of Moneypenny. Moneypenny handles outsourced phone calls, live chat and digital comms for thousands of companies globally.
Every company deploying an AI voice agent has made the same bet, whether they realize it or not. They’ve let the model design their customer conversation.
The goals are clear: collect the caller’s name, understand their enquiry and route them to the right team. The agent achieves all three. Metrics look healthy. But ask anyone in the organization to explain how the conversation actually unfolds—why the AI asks what it asks, in what order, with what tone and what happens when the caller says something unexpected—and nobody can tell you. Because nobody decided. The model did.
The Shortcut Most Companies Take
This isn’t a bug; it’s a shortcut: give a large language model (LLM) a set of goals and let it figure out how to get there. This is fast to build, easy to demo and avoids the genuinely hard work of designing the conversation. The LLM decides the order of questions, the phrasing and the recovery when something goes wrong. It improvises your customer experience in real time.
The result is an organizational gap that almost nobody talks about. Product teams define what the agent should achieve and engineering teams build the systems, but the actual design of the conversation—the structure, tone, sequencing and recovery logic that sits between the goal and the language model—belongs to neither. The model has filled the vacuum by default.
Every other customer-facing discipline has dedicated ownership. Brand owns visual identity, product owns the interface and marketing owns messaging. But the conversation, the thing the customer experiences most directly, has been left to improvisation.
Why Locking It Down Doesn’t Work Either
The obvious response is to lock it all down—define every question, branch and response. But that creates the opposite problem. You end up with a rigid script that sounds like a classic interactive voice response (IVR) menu with better grammar: “Press 1 for sales, press 2 for support…”
The model adds no real value because you’ve constrained it to the point where it can’t do what it’s actually good at, which is making conversations feel natural, handling the unexpected and responding like a human being rather than a flowchart.
The Conversation Control Layer
The real challenge is finding the sweet spot between those two extremes. Too much flexibility and you can’t guarantee the agent has done its job. Too little and you’ve built expensive automation that callers hate. What’s needed is what I’d call the conversation control layer.
The companies getting this right are the ones who’ve actually built one, working out which parts of the conversation must be controlled and which can be left to the model. Years of working with customer conversations have given me a clear view of where structure matters and where flexibility creates value, and that experience shapes how I think about balancing control and natural conversation.
What Should Be Controlled (And What Shouldn’t)
Here’s how I believe you should think about that boundary. The things that need to be deterministic are the things that need to be auditable. Did the agent collect the information it was supposed to collect? Did it follow the correct routing logic? Did it stay within the guardrails? These are not questions you want answered by probability. They need to be governed by code that is testable, predictable and auditable in real time.
The model’s job is everything else: taking a caller’s stumbling explanation and understanding what they actually need, responding in a tone that matches the moment and handling the unexpected phrasing, the half-finished sentence and the caller who changes their mind mid-thought. That’s where language models are genuinely brilliant.
But here’s what most teams miss. The structure doesn’t just constrain the model; it improves it. An LLM that knows where it is in a conversation (what’s already been covered and what still needs to happen) produces sharper, more relevant responses than one that’s improvising toward an open goal. Grounding the model in a defined flow means it can focus entirely on language instead of splitting effort between deciding what to do and how to say it.
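To make the boundary concrete, here is a minimal sketch of what a conversation control layer might look like. All names here are illustrative assumptions, not any vendor's actual implementation: deterministic code owns the sequencing, required steps and audit log, while the language model (stubbed as a hypothetical `phrase` function) is asked only to put the current goal into words.

```python
# A sketch of a conversation control layer: code decides WHAT happens next
# and proves it happened; the model decides only HOW it is said.
# `phrase` is a stand-in for a real LLM call.

REQUIRED_STEPS = ["caller_name", "enquiry", "routing_team"]

def phrase(goal: str, context: dict) -> str:
    """Stub for an LLM call: turn a structured goal into natural language."""
    return f"[agent asks about {goal}; known so far: {sorted(context)}]"

class ConversationController:
    def __init__(self):
        self.collected = {}   # deterministic state: what has been captured
        self.transcript = []  # audit log of every turn

    def next_goal(self):
        # Deterministic sequencing: the code, not the model, decides order.
        for step in REQUIRED_STEPS:
            if step not in self.collected:
                return step
        return None

    def turn(self, caller_input=None):
        goal = self.next_goal()
        if goal is None:
            return None  # every required step is complete
        utterance = phrase(goal, self.collected)  # model handles language only
        self.transcript.append((goal, utterance))
        if caller_input is not None:
            self.collected[goal] = caller_input  # real systems would extract via the LLM
        return utterance

    def audit(self):
        # Testable, real-time completion check: no probability involved.
        return all(step in self.collected for step in REQUIRED_STEPS)

ctrl = ConversationController()
ctrl.turn("Jane Smith")        # collects caller_name
ctrl.turn("billing question")  # collects enquiry
ctrl.turn("billing team")      # collects routing_team
assert ctrl.audit()            # provably completed every required step
```

The design point is that `audit()` is ordinary code you can unit test, while everything the caller actually hears flows through `phrase`, so the model stays free to handle tone and unexpected wording within a flow it cannot derail.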
Without a control layer, you get agents that sound natural but can’t be audited or trusted. We’ve seen agents confirm appointments before collecting contact details, offer refunds that don’t exist and cheerfully route callers to teams that closed hours ago. Nobody can diagnose why it went wrong because nobody designed the flow. You can’t quality-assure a conversation that’s different every time.
Getting It Right
The companies that are successful with voice AI aren’t the ones with the best models or even the most powerful ones. They’re the ones that have stopped outsourcing the conversation to the model and started designing it deliberately. They’ve found the balance between control and freedom, they design the parts that matter for compliance and auditability, and they free the model to do what no script ever could.
This means drawing a clear boundary between what the system controls and what the model generates. The structure (sequencing, routing, guardrails and business logic) of my company’s AI, for example, is deterministic and designed. When the LLM operates within that structure, it can bring natural language flexibility without improvising along the way.
It’s critical to be able to prove, in real time, that the agent has done what it was supposed to do without the caller ever feeling like they’re talking to a system. When you design the conversation deliberately, you don’t lose flexibility. You gain control, consistency and trust.
Ask your AI vendor one question. Where is the boundary between the model and the code? If they can’t show you what’s deterministic and what’s generated—if they can’t prove the agent completed every required step in a conversation—you haven’t automated your customer experience. You’ve outsourced it.
Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.