Pooja Sathe is the Director of Product Management at Lenovo, driving innovation and excellence in the commercial AI PC category.

With the release of the iPhone in 2007, there was an explosion of apps for nearly everything we do, from booking flights to ordering food to personal training. The rise of GenAI over the past two to three years marks another pivotal moment in technology, with a surge of new startups and emerging apps in areas like content creation and conversational AI assistants. While most GenAI apps currently run on the cloud, there’s a significant opportunity to bring these workloads locally to PCs, tablets and smartphones.

If you haven’t been living under a rock, you’ve probably heard about AI PCs. These PCs are getting more compute power thanks to the combined horsepower of CPUs, more powerful GPUs and the new neural processing units (NPUs) integrated into the devices. This means we can run some GenAI workloads on PCs without relying on the cloud. So, if you’re an AI app developer or an enterprise IT owner responsible for AI projects, here are a few considerations to keep in mind when deciding where to run your AI.

1. Domain-Specific Targeted Use Cases

You don’t need a Tesla Cybertruck if all you care about is going from point A to B; every adventure has its perfect ride. Similarly, in the world of AI, not every task requires the most powerful, cloud-based models. For instance, a small legal firm looking to automate tasks like summarization and editing can use a purpose-built small language model trained on a specific domain. There’s no need for a massive LLM running on the cloud.

Instead, a well-configured AI PC or workstation can efficiently handle these domain-trained smaller models locally, making it the most effective solution. However, for solutions that require processing vast amounts of data or running complex simulations, cloud-based AI is indispensable.

2. Balancing Latency And Compute Requirements

Running AI models locally on the device not only reduces dependency on cloud resources but also reduces latency. Think about it: When your query goes to the cloud, gets processed and then comes back to your device, there’s going to be some lag. In fact, latency with cloud computing can be 4-5 times higher than with edge computing.

Another big plus is that on-device computing can happen without an internet connection. This is especially crucial when you’re using an endpoint protection AI agent for your device fleet, as it can detect threats even when the PC is offline. While local AI processing ensures low latency and faster decision-making, it’s important to consider the computational requirements of the AI workload. For compute-intensive larger models at scale, it makes sense to leverage cloud AI.

3. Security And Privacy Concerns

According to the 2024 IDC CIO report that our company sponsored, cybersecurity and data privacy are the top challenges for GenAI adoption. This is due to evolving data privacy laws, rising cyberattacks, and the increasing need to protect sensitive data such as PII and company proprietary data.

As someone responsible for bringing AI capabilities to your workforce, you should consider your data’s foundational preparedness, the type of data you’re running and whether it’s for internal or external use, and then decide between cloud and on-device inferencing. For example, if you’re a software developer working on a company’s proprietary code, using an AI app to debug can be a great productivity hack. However, it’s better to avoid running it on the cloud. Instead, consider bringing it closer to the edge in a secure, locked-down environment.

4. Opportunity For Cost Savings

There are a lot of free GenAI apps, but subscription models like ChatGPT Plus and M365 Copilot are starting to emerge. This shift is because companies are moving from the testing and feedback phase to monetizing their investments.

With models like GPT-4 costing around $0.03 per 1,000 prompt tokens and $0.06 per 1,000 completion tokens, this pricing isn’t sustainable at scale. Spread across millions of queries and users, the bill grows quickly, and end users will ultimately bear the cost burden. This is where AI devices can be beneficial by running some of the inferencing on the device itself.
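To see how those per-token rates add up, here is a back-of-the-envelope estimate using the GPT-4 prices cited above. The query volume and token counts are hypothetical, chosen only to illustrate the scaling:

```python
# Rough cloud-inference cost estimate using the cited GPT-4 rates:
# $0.03 per 1,000 prompt tokens, $0.06 per 1,000 completion tokens.
PROMPT_RATE = 0.03 / 1000      # dollars per prompt token
COMPLETION_RATE = 0.06 / 1000  # dollars per completion token

def monthly_cost(queries_per_day, prompt_tokens, completion_tokens, days=30):
    """Estimated monthly spend for a given query volume (illustrative only)."""
    per_query = prompt_tokens * PROMPT_RATE + completion_tokens * COMPLETION_RATE
    return queries_per_day * days * per_query

# A modest workload: 10,000 queries/day, ~500 prompt + 500 completion tokens each
print(f"${monthly_cost(10_000, 500, 500):,.2f} per month")  # → $13,500.00 per month
```

Even this modest, hypothetical workload runs to five figures a month, which is exactly the kind of recurring cost that on-device inferencing can offset.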

5. Perhaps A More Sustainable Option

There’s no doubt that AI deployments come with environmental considerations. According to the International Energy Agency, a request made through ChatGPT consumes 10 times the electricity of a Google search. This is a concerning stat given the ever-increasing number of data centers running GenAI.

The solution lies in finding ways to run AI workloads more efficiently. For instance, NPUs integrated into PCs are designed to run AI at much lower power. It’s important that we compare how much power a PC or smartphone consumes running AI workloads on an NPU versus high-performance CPUs and GPUs on the cloud. While there is not enough data yet, it’s still a crucial consideration. Tech companies have the responsibility to make AI deployment more energy-efficient.

Conclusion

While it’s true that not all AI workloads can be brought to devices, it’s important to recognize the potential of high-performance AI PCs or workstations to run time-sensitive tasks locally in an efficient and secure manner. I envision a future where we find smarter ways to run AI in a hybrid manner, leveraging a combination of cloud and on-device processing.

A great example can be financial institutions using on-device AI on PCs to monitor transactions in real time, flagging suspicious activities immediately. The cloud component can handle complex and resource-intensive tasks, such as training fraud detection models on vast amounts of historical transaction data from multiple sources. This hybrid approach leverages the strengths of both on-device and cloud AI to create a robust fraud detection system in the financial sector.
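The hybrid pattern described above can be sketched in a few lines. This is a minimal illustration, not a production fraud system: the rule thresholds, field names and the review queue standing in for a cloud-side pipeline are all hypothetical.

```python
# Sketch of the hybrid pattern: a lightweight on-device check flags
# suspicious transactions in real time, while flagged cases are queued
# for deeper cloud-side analysis (represented here by a simple list).
from dataclasses import dataclass
from typing import List

@dataclass
class Transaction:
    account: str
    amount: float
    country: str

# Flagged transactions awaiting heavier, cloud-side model analysis.
cloud_review_queue: List[Transaction] = []

def on_device_check(txn: Transaction, home_country: str = "US",
                    amount_limit: float = 5_000.0) -> bool:
    """Fast local rule check; returns True if the transaction looks suspicious."""
    suspicious = txn.amount > amount_limit or txn.country != home_country
    if suspicious:
        cloud_review_queue.append(txn)  # defer expensive analysis to the cloud
    return suspicious

print(on_device_check(Transaction("A1", 120.0, "US")))    # routine purchase
print(on_device_check(Transaction("A1", 9_800.0, "US")))  # large amount, flagged
```

The design choice is the point: the cheap, latency-sensitive decision happens on the device even when offline, and only the small fraction of flagged cases consumes cloud resources.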

There are multiple examples where we can successfully leverage local computing in conjunction with the cloud. I encourage AI architects and developers to consider the power and benefits of bringing AI inferencing closer to the devices.

The views expressed in this article are the opinions of the author and do not constitute an official company statement.

Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.
