Why Generic AI Agents Don’t Work In Regulated Industries

Arun Ramakrishnan is co-founder and CTO of LogicFlo AI.

Not long ago, when people interacted with AI systems, they were experiencing them as chatbots. You asked a question, got an answer, maybe asked a follow-up and that was the end of the interaction.

Today, we’re seeing more and more AI agents. Everybody’s building them. Coding agents. Research agents. Writing agents. Agents that can browse, summarize, plan, retrieve information, call tools and then continue looping through tasks on their own. Most conversations are still centered around the model itself. Which model are you using? GPT? Claude? Open-source? What benchmark does it hit?

But the better question is: How are your agents making decisions?

There’s a phrase people use sometimes when talking about large language models: “stochastic parrot.” The idea is that a model simply predicts likely next outputs based on patterns it has seen before. That’s what makes these models powerful, but it’s also what makes them dangerous in high-stakes environments. The model itself is not deterministic. It’s generating answers based on probabilities.

Which is fine if you’re asking it to brainstorm headlines or help write code.

It becomes a very different conversation when the system is operating inside highly regulated industries, such as pharma, life sciences, finance or government, where every action may need to be audited later.

The Harness Is The Real Product

Usually, when people say “agent,” what they really mean is the AI model in the middle. Think of the model as the agent’s brain. The harnesses are the limbs. They do the work the brain tells them to do. They make the agent trustworthy and auditable inside a high-stakes workflow.

Broadly, I break the harnesses down into five components:

1. Input And Output Flow: This is how information enters the system and how outputs leave it. In regulated environments, this becomes important very quickly because agents can’t just freely pull from or write into systems without constraints. There are strict rules around what data can be touched, reused, surfaced or transformed downstream.

2. Tools: These are the actions the system is allowed to perform. Can it retrieve documents? Search internal systems? Write into files? Reuse approved content? This harness defines the boundaries around those actions.

3. Memory: This harness manages the agent’s ability to store and retrieve past interactions, summarized context or user preferences. The moment summarized information gets stored, it effectively becomes another regulated repository with retention, access control and audit requirements attached to it.

4. Decision Loop: This is the operational engine of the agent itself. A chatbot answers once and stops. Agents continuously loop through tasks. They plan, take an action, observe the result and then decide what to do next.

5. Safety And Guardrails: These are the constraints that prevent undesired actions. They define what systems the agent can access, what information it can retain, what tools it can call and whether it stays inside operational and regulatory boundaries.
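The five components above can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference implementation: the names Harness, retrieve_document, blocked_terms and scripted_plan are hypothetical, and the "model" is a scripted stand-in so the example runs without any external service.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


# Hypothetical tool; a real harness would call a governed document system.
def retrieve_document(doc_id: str) -> str:
    return f"contents of {doc_id}"


@dataclass
class Harness:
    # Tools: the allowlisted actions the agent may perform.
    tools: Dict[str, Callable[[str], str]]
    # Guardrail on input flow: terms that must never reach a tool (assumed rule).
    blocked_terms: Tuple[str, ...] = ("patient_id",)
    # Guardrail on the decision loop: hard bound on the number of steps.
    max_steps: int = 5
    # Memory: observations kept between steps.
    memory: List[str] = field(default_factory=list)

    def call_tool(self, name: str, arg: str) -> str:
        if name not in self.tools:
            raise PermissionError(f"tool not allowed: {name}")
        if any(term in arg for term in self.blocked_terms):
            raise PermissionError(f"input blocked by guardrail: {arg}")
        return self.tools[name](arg)

    def run(self, task: str,
            plan: Callable[[str, List[str]], Tuple[str, str]]) -> str:
        """Decision loop: plan, act, observe, then decide what to do next."""
        for _ in range(self.max_steps):
            action, arg = plan(task, self.memory)
            if action == "finish":
                return arg  # output flow: the only way a result leaves the loop
            observation = self.call_tool(action, arg)
            self.memory.append(observation)
        raise RuntimeError("step limit reached without finishing")


# Stand-in for the model: a scripted planner instead of an LLM call.
def scripted_plan(task: str, memory: List[str]) -> Tuple[str, str]:
    if not memory:
        return ("retrieve_document", "label-v3")
    return ("finish", f"summary based on: {memory[-1]}")


harness = Harness(tools={"retrieve_document": retrieve_document})
result = harness.run("summarize the label", scripted_plan)
print(result)
```

The point of the sketch is that the planner never touches a system directly. Every action passes through call_tool, so the allowlist and the guardrails apply no matter what the model proposes.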

All of this becomes much more concrete once you look at how regulated industries actually operate. In life sciences, for example, the path matters just as much as the result. You need lineage and auditability: who initiated the workflow, what tools were accessed, what information the agent touched and whether every step can be reconstructed later if somebody asks questions.
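One way to picture that kind of lineage is an append-only trail in which every step records who started the workflow, which tool ran and what data it touched. The sketch below is an assumption-laden illustration: the AuditTrail class, its field names and the sample identifiers are hypothetical, and chaining entries by hash is one possible tamper-evidence technique, not something any specific regulation prescribes.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Dict, List


def _digest(body: Dict) -> str:
    # Stable serialization so the same entry always hashes the same way.
    return hashlib.sha256(
        json.dumps(body, sort_keys=True).encode("utf-8")
    ).hexdigest()


@dataclass
class AuditTrail:
    workflow_id: str
    initiated_by: str
    entries: List[Dict] = field(default_factory=list)

    def record(self, step: str, tool: str, data_touched: List[str]) -> Dict:
        """Append one step, linked to the previous entry by its hash."""
        body = {
            "workflow_id": self.workflow_id,
            "initiated_by": self.initiated_by,
            "seq": len(self.entries),
            "step": step,
            "tool": tool,
            "data_touched": data_touched,
            "prev_hash": self.entries[-1]["hash"] if self.entries else "",
        }
        entry = {**body, "hash": _digest(body)}
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        prev_hash = ""
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
                return False
            prev_hash = entry["hash"]
        return True


# Hypothetical workflow with two recorded steps.
trail = AuditTrail(workflow_id="wf-001", initiated_by="reviewer@example.com")
trail.record("retrieve", "retrieve_document", ["label-v3"])
trail.record("draft", "compose_text", ["label-v3", "claim-12"])
intact = trail.verify()
print(intact)
```

If somebody asks questions later, the entries can be replayed in order, and verify gives a quick check that nothing was altered after the fact.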

In software engineering environments, by contrast, people mostly care about the output. Did the code work? Did the task finish? Very few people are tracing every intermediate step the agent took to arrive there.

Beyond ‘Generate Text’

One thing that I think gets lost in a lot of AI conversations is that regulated industries already have deeply structured ways of working. Pharma is a good example of this.

Inside pharmaceutical organizations, there are entire review processes built around medical, legal and regulatory review. Content gets approved ahead of time and then stored as reusable material.

The reason for that is pretty practical. If a team already knows a specific statement or chart has been approved, they can reuse it later instead of restarting the review process from scratch every single time. The content becomes modular, like puzzle pieces that can be reused.

Outside regulated industries, an AI system might just generate a paragraph from scratch, and nobody really cares how it got there as long as the output sounds good enough. In pharma, every claim needs support, every statement may need traceability and every output eventually flows into another layer of review.
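The puzzle-piece idea can be sketched as assembling a draft only from modules that have already cleared review, while carrying each module's supporting references along with the text. Everything in this sketch is hypothetical: the ApprovedModule type, the library contents, the placeholder wording and the reference identifiers are invented for illustration and do not describe any real content system.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ApprovedModule:
    module_id: str
    text: str
    references: Tuple[str, ...]  # sources that support the statement
    approved: bool               # True once review has signed off


# Hypothetical content library with placeholder text.
LIBRARY: Dict[str, ApprovedModule] = {
    "claim-12": ApprovedModule(
        "claim-12", "Placeholder efficacy statement.", ("ref-A",), True),
    "claim-19": ApprovedModule(
        "claim-19", "Placeholder safety statement.", ("ref-B", "ref-C"), True),
    "claim-30": ApprovedModule(
        "claim-30", "Placeholder statement still in review.", ("ref-D",), False),
}


def assemble(module_ids: List[str], library: Dict[str, ApprovedModule]) -> Dict:
    """Build a draft from approved modules only, keeping per-claim lineage."""
    used = []
    for module_id in module_ids:
        module = library.get(module_id)
        if module is None or not module.approved:
            raise ValueError(f"module not approved for reuse: {module_id}")
        used.append(module)
    return {
        "text": " ".join(m.text for m in used),
        "lineage": [
            {"module_id": m.module_id, "references": list(m.references)}
            for m in used
        ],
    }


draft = assemble(["claim-12", "claim-19"], LIBRARY)
print(draft["text"])
```

A draft built this way arrives at review with its support attached, and a request for unapproved material fails loudly instead of being quietly generated from scratch.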

A lot of generic AI tooling is optimized for speed and broad usability, but regulated industries optimize for something completely different. We’re not looking to generate content faster; we want to generate content that can move through review faster.

That’s why a lot of generic agent frameworks don’t map cleanly into regulated industries. The workflow itself is different. The output has to be supportable, reviewable and grounded in information that can actually survive scrutiny later.

Containing The Parrot

You’re never going to make a model deterministic. LLMs are probabilistic systems. That’s literally the reason they’re useful in the first place. They can adapt, generalize, handle ambiguity and connect things in ways that rigid systems usually can’t.

Problems happen when people take that probabilistic core and drop it directly into regulated workflows without enough structure around it. Because now you have a system making decisions, retrieving information, calling tools, storing memory, generating outputs, maybe even triggering downstream actions. Somewhere, later, somebody is going to ask: “Okay, but how exactly did this happen?” A fluent answer by itself doesn’t help much if nobody can verify where it came from.

And honestly, I think that’s where this whole space is headed. People are still obsessing over the model wars, but for regulated industries, the harness layer is probably going to matter more. You need to contain the stochastic parrot.

The companies that succeed here will be the ones that can reliably orchestrate these systems inside real-world constraints without losing the advantages that make the models powerful in the first place.

Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.