Dr. Michal Tzuchman-Katz, MD, CEO and Cofounder at Kahun Medical.
Generative AI’s success has revived age-old questions about the existential risk inherent in building intelligent machines, only this time with added urgency. Hundreds of scientists and business leaders have voiced their concerns about the technology, while others have instead started implementing generative AI at full speed.
The truth is, it’s still way too early to calculate long-term existential risk. In the near term, we should be much more concerned about the prospect of entire industries adopting AI that isn’t up to par because it can’t really “reason” in ways that serve industry-specific use cases.
The Mechanics Of Generative AI
Generative AI chatbots like OpenAI’s ChatGPT so impressively mimic human conversation that we can sometimes forget they don’t actually think the way we do. This technology is impressive in many ways, but at the end of the day, we’re talking about a large language model (LLM) that determines its output by establishing relationships between words and predicting the most statistically likely response.
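To make that concrete, here’s a deliberately tiny sketch of the core idea, for illustration only: pick whichever word most often followed the current one in the training text. Real LLMs use neural networks trained on vast corpora rather than simple word counts, but the output principle, statistical likelihood rather than understanding, is the same. The toy corpus and function names below are invented for this example.

```python
from collections import Counter, defaultdict

# Toy illustration (not how production LLMs work internally): a bigram
# model that picks the next word purely by observed frequency -- the
# "most statistically likely continuation" principle, scaled way down.
corpus = (
    "the patient has a fever the patient has a cough "
    "the doctor has a chart"
).split()

# Count how often each word follows each other word in the corpus.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent continuation of `word`, or None."""
    counts = follows[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # "patient" (seen twice, vs. "doctor" once)
print(predict_next("has"))  # "a" (every "has" is followed by "a" here)
```

Note what this model can and cannot do: it will always produce a plausible-looking continuation, but it has no notion of whether that continuation is true, which is exactly the accuracy problem described above.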
It’s misleading to say these models “fabricate,” for example, because that implies intent. In reality, when they seem to do something nefarious or reach an inaccurate conclusion, all that’s happening is that they’ve selected words that were statistically probable but factually wrong.
The LLM accuracy problem is a feature, not a bug. These models communicate so fluently only because they have the liberty to suggest output that is reasonably probable but not confirmed to be true. That mechanism is the most serious obstacle to adopting generative AI beyond administrative uses in industries that can’t tolerate errors or hallucinations.
That’s a key point, because it’s generative AI’s potential for professional support that sets it apart from previous innovations. For generative AI to live up to its hype in the near term, it will have to do things like produce reliable clinical reports for doctors and functional floor plans for architects, and that will require support from additional layers of AI that actually do reason.
Generative AI alone can’t reliably do that.
The Challenges And Limitations Of Generative AI
ChatGPT in a medical setting, for example, would resemble a random college student who’s studying something totally unrelated to medicine but is interning at a local hospital. This intern can take in and summarize extensive amounts of information—and often gets things right—but they miss the medical context. A clinical summary produced by such an intern would create more work for the doctor reading it, not less.
Think of the lawyer who was fined $5,000 for submitting a court brief littered with ChatGPT’s hallucinations. If a legal memo produced by ChatGPT still has to be reviewed by a lawyer word by word, and its references carefully validated to pass muster, are these tools really serving their purpose?
The problem isn’t only that LLMs make mistakes or lack relevant context. It’s that the professionals using them have no way to see why a mistake was made or how the model arrived at its output, beyond a general understanding of how LLMs work.
ChatGPT, for example, doesn’t trace back to the underlying sources used to determine each insight in its output—again, a feature, not a bug. Google’s Bard does cite sources, but it’s impossible for the end user to know how they factored into the model’s statistical probability-based response.
The Potential Of AI Assistants For Every Industry
The good news is these are fixable problems. There might not be a way to make generative AI models explain their output or get everything right, but we can bolster them with other transparent models that reason based on their industry’s gold standard of knowledge.
Imagine, for example, an AI legal assistant that consults all the industry-accepted legal documents and draws conclusions about legal questions by reasoning the way a lawyer does. It could produce ready-to-use legal memos that show a lawyer exactly how it arrived at those conclusions by pointing to the relevant sources, all while leveraging ChatGPT’s conversational fluency. Such an AI assistant wouldn’t hallucinate previous rulings or studies, because it would reason from legal precedent and practice rather than use statistics to fill in the blanks.
At that point, we’re talking about a document that was produced by an AI assistant that surpasses the legal expertise of the average human lawyer. Such a tool will still make mistakes. But as long as the lawyer using it can immediately see the logic that led to the error, they will still be able to build on that memo.
That’s the true potential of AI, and one toward which every industry should strive.
Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.