John Ottman opines on cloud data management, blockchain, cybersecurity & internet freedom. Chairman, Solix Technologies, Inc. & Minds, Inc.
Everyone remembers the first time they experienced generative AI (GenAI). According to OpenAI, ChatGPT reached 1 million users just five days after its launch in November 2022. What a spectacular technology solution to improve lives and gain productivity.
While GenAI is a game changer for just about anyone, what about GenAI for work? How do knowledge workers and employees use GenAI to blow through productivity goals and solve problems better, faster and cheaper?
A race is now on to enable powerful GenAI solutions for the enterprise. The goal is enterprise intelligence, which is achieved when GenAI applications are available to all employees to help them improve their job performance.
How much productivity gain is possible from enterprise intelligence? One example is code generation. SQL and Python code generation is such a powerful solution that Goldman Sachs has deployed GenAI developer solutions for all of its 12,000 programmers and expects a 20% efficiency gain, according to CIO Marco Argenti.
Retrieval-augmented generation (RAG) is another breakthrough GenAI solution. RAG solutions pair large language models (LLMs) with relevant enterprise data retrieved at query time, so that LLM responses are grounded in your business context. Imagine the productivity gained if every employee were equipped with GenAI tools to do their job better. This means writing any document about anything using data and context from your business, assisting with the medical prescription process, drafting a legal brief, predicting business outcomes and adding a bar chart in seconds, searching databases while you wait for just what you need to know, and solving customer problems faster with chatbots.
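To make the idea concrete, here is a minimal sketch of the retrieval step in RAG. It is an illustration under simplifying assumptions, not a production design: the sample documents, the function names and the word-overlap scoring are all hypothetical, and a real system would typically use vector embeddings and pass the assembled prompt to an LLM.

```python
import math
import re
from collections import Counter

# Hypothetical enterprise snippets standing in for a document store.
DOCS = [
    "Refunds are issued within 14 days of a returned order.",
    "The Austin office is closed on public holidays.",
    "Enterprise support tickets receive a response within 4 hours.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, k=2):
    """Return the k documents most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, documents, k=2):
    """Assemble a grounded prompt; sending it to an LLM is left out here."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    print(build_prompt("How fast are refunds issued?", DOCS, k=1))
```

The point of the sketch is that the model is not retrained: business data is looked up at question time and supplied as context, which is why data quality and access controls matter so much.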
However, two years after the launch of ChatGPT, few companies have made substantial progress in introducing GenAI solutions to the enterprise. Many companies don’t have the infrastructure in place or the skills needed to power a production enterprise AI program.
The challenge is how to safely and securely ground GenAI models with enterprise data. Sensitive, personally identifiable information (PII) such as healthcare, credit card and other legally protected classifications of data is located everywhere and stored across the organization in vast data silos. Not only is enterprise data difficult to track, but the sheer volume of data continues to grow exponentially.
Compliance reporting is another challenge as new AI safety and security laws are being issued. For many organizations, security, risk and compliance challenges have forced enterprise intelligence to wait until AI safety and security can be assured.
The challenges facing enterprise AI implementations are so significant that Gartner Inc. has predicted a 30% project failure rate. A May 2024 McKinsey survey found that 70% of organizations with GenAI experience reported that data posed the greatest challenge to achieving value, especially regarding risk management and responsible AI. The problem may be even worse. Data governance concerns over pipelining enterprise data into “black box” LLM solutions have forced numerous Fortune 1000 firms to ban their use entirely over fears of data breaches.
Despite these challenges, the rise of enterprise intelligence marches on. Data fabrics are one emerging strategy to support the compound requirements of enterprise AI. The journey for AI data starts at data collection with a data retention plan spanning years. Whether the source of data is an IoT device or an IBM mainframe, the collected data must first be classified and then featurized or otherwise prepared for use before it can be pipelined to a downstream data warehouse or AI application.
As data transits this complex data fabric, datasets often undergo multimodal transformations—possibly from files and tables in one format to index vectors in another. Still, data governance and compliance controls must be maintained throughout the data life cycle.
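As a rough sketch of what classifying data before it is pipelined downstream might look like, the example below tags and masks a few kinds of sensitive values in free text. The patterns and names are assumptions chosen for illustration; they are deliberately simple and would miss many real-world formats, so a production program would rely on a dedicated data classification service.

```python
import re

# Illustrative patterns only; real PII detection needs far broader coverage.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card_number": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}


def classify(text):
    """Return the sorted names of the PII categories found in the text."""
    return sorted(
        name for name, pattern in PII_PATTERNS.items() if pattern.search(text)
    )


def mask(text):
    """Replace each detected value with a placeholder such as [EMAIL]."""
    for name, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{name.upper()}]", text)
    return text


if __name__ == "__main__":
    record = "Contact jane@example.com, SSN 123-45-6789"
    print(classify(record))
    print(mask(record))
```

Tagging records this way at collection time lets the same labels travel with the data, so governance rules can still be enforced after the data has been transformed.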
Enterprise architects look to common data platforms as the infrastructure foundation for enterprise AI data fabrics. Common data platforms are cloud-native software architectures that support best-of-breed, open-source components based on W3C standards. This open systems approach can enable broad integration without vendor lock-in. Common data platforms are the backbone of AI data fabrics, and they deliver the essential services for data collection, metadata management, data governance and data discovery.
Establishing an enterprise AI program office is another top priority. Data engineering skills are critical to delivering fresh, trusted, prepared data to power enterprise AI. MLOps and prompt engineers are needed to support GenAI, machine learning and data science operations. Of course, cloud ops and AI safety and security engineers are also critical.
Cloud data management applications organize historical data into archives and current data into data lakes not only to optimize infrastructure but also to properly stage the data for enterprise AI. Using third-generation data platforms supporting Parquet files, ACID transactions and open table formats such as Hudi, Delta and Iceberg, organizations are now able to leverage rich metadata and deploy strong data governance controls.
High-performance data pipelines that prepare data for use with GenAI must not only ingest, classify and prepare data at scale but also apply real-time incremental updates to keep data fresh and of the highest quality. Powerful in-memory processing solutions like Apache Spark are critical to support the data preparation, data transformation and featurization processes that make enterprise data fit for use by AI applications.
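The incremental update idea can be sketched in a few lines. This is plain Python standing in for what a Spark job or an open table format's merge operation would do at scale; the record layout (`id`, `updated_at`) and the function name are assumptions made for the example.

```python
def apply_incremental(snapshot, changes):
    """Merge change records into a snapshot keyed by id.

    The newest record per id wins, so late-arriving stale changes are ignored.
    The input snapshot is left unmodified.
    """
    merged = dict(snapshot)
    for record in changes:
        existing = merged.get(record["id"])
        if existing is None or record["updated_at"] > existing["updated_at"]:
            merged[record["id"]] = record
    return merged


if __name__ == "__main__":
    snapshot = {1: {"id": 1, "name": "Ada", "updated_at": 10}}
    changes = [
        {"id": 1, "name": "Ada L.", "updated_at": 12},  # newer update
        {"id": 2, "name": "Bob", "updated_at": 11},     # new record
        {"id": 1, "name": "stale", "updated_at": 9},    # arrives late
    ]
    print(apply_incremental(snapshot, changes))
```

Processing only the changed records, rather than reloading everything, is what keeps downstream AI applications current without reprocessing the full dataset.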
Clearly, GenAI is here to stay, but enterprise data infrastructures have a lot of catching up to do before enterprise AI becomes ubiquitous. Without a robust data fabric, third-generation data platforms, powerful data pipelines and advanced data governance frameworks, high project failure rates are likely. The rise of enterprise intelligence requires cloud data management and new infrastructure solutions that deliver AI safety and security.