In today’s column, I identify and explore a hot trend in the AI field that is referred to as Physical AI, sometimes also known as Generative Physical AI (a mash-up of generative AI and a said-to-be additional physical AI capability).
The gist is that AI is becoming further data-trained on how to cope with the physical world, the real world in which we all live. This will allow AI to become much more useful and expand further into all manner of physical mechanisms and devices such as generalized robots, specialized automata, humanoids, autonomous vehicles, and just about any kind of AI-driven machine that can walk, crawl, run, slither, jump, grasp, and otherwise operate within the tangible confines of the real world.
Let’s talk about it.
This analysis of an innovative AI breakthrough is part of my ongoing Forbes column coverage on the latest in AI including identifying and explaining various impactful AI complexities (see the link here). For my coverage of how generative AI such as ChatGPT, Claude, Llama, Gemini, and other major AI is increasingly being connected to robotic arms and akin robotic capacities, see the link here.
Reality And The Physical World
Before we dive into the AI considerations, I’d like to offer crucial insights regarding the physical world and how humans manage to operate within it.
Here we go.
Envision that a baby or toddler tries to stand up, wobbles, and then topples over. What just happened? You might say that it was inconsequential and ordinary. Happens all the time, no big deal.
But you would be missing a hidden secret. The child is learning about the physical world. Forces of gravity are pulling downward on their body. They must use the muscles in their legs to push up against this mysterious unseen force. A center of balance must be established to remain upright. And so on.
Physical and physics lessons are taking place. Pretty much in the blink of an eye. Little does the child seemingly directly comprehend the complex underlying geometry and mathematics of it all. Instinct seems to solely be at play.
Later in life, during school and intense classes in physics and math, they will get a grand revealing of the explicitly calculated secrets underlying the physical world. Until then, it is mainly a sense of embodied presence in the real world, entailing physicality and nature’s intrinsic bodily dynamics, along with a spoonful of mindful geospatial mapping.
Humans and animals must discover the rules and laws of operating in a physical world to act, survive, and thrive. It is mainly learned by doing. Sure, a human can show another human how to jump in the air and do the splits, but it won’t especially sink in until the person being shown the demonstration attempts the physical act themselves.
Reality and physical movement are like the air we breathe; it is all around us and we conventionally take it for granted. Only when you suddenly slip on a sheet of ice or haphazardly trip on a banana peel do you find yourself jarred into the realization that you are immersed in a physical world that requires constant vigilance on how to move and exist in physical space.
More Physicality Comes To Mind
Here’s an intriguing twist for you.
An ongoing and longstanding debate exists about the duality of mind and body.
Could a mind that lacks a body be able to sufficiently cope with the real world if it never was conjoined with a physical body?
Sci-fi plot lines have often delved into this devilish riddle by having brains floating in vats and disconnected from an actual body. It is a puzzling question since we cannot undertake the enigma in any real sense. For the time being, brains must have bodies. They are two peas in a pod.
Maybe intellect is inextricably tied to the facets of having a body. It could be that thinking is not separable from our senses such as having ears, eyes, noses, taste, and touch. Without those bodily capabilities, it could be that a brain and mind would not ultimately formulate into a thinking capacity. A brain and mind might be an empty vessel without having had the experiences of sensory inputs from the likes of their body working in physical environments.
Whoa, that’s outlandish philosophical stuff, but what does it have to do with AI and the here and now?
Grab a glass of fine wine and continue reading.
AI And The Physical World
Shift gears into AI mode.
What does contemporary generative AI seem to “know” about the physical world?
I put the word “know” in quotes to emphasize that today’s AI is not sentient and doesn’t know things in the same manner that humans do. AI is essentially software running on computer hardware and generative AI is specifically all about pattern-matching computationally on human writing such as essays, poems, stories, and the like. If you’ve used ChatGPT or any of the popular LLMs then you are familiar with how amazingly fluent generative AI seems to be, though realize that the computational and mathematical effort is a form of mimicry regarding human natural language.
Well, let’s go ahead and ask ChatGPT a question involving the physical action of dropping a rubber ball and find out what the AI has to say.
- My entered prompt: “What will happen if I drop a rubber ball from shoulder height?”
- Generative AI response: “When you drop a rubber ball from shoulder height, it will fall due to gravity and hit the ground. Upon impact, the ball will compress slightly, storing energy from the collision. Because it’s rubber, it will then bounce back upward as it releases that energy. However, the bounce height will be lower than the original drop height because some energy is lost to heat, sound, and deformation during the process. The ball will continue bouncing, with each bounce getting progressively lower, until it eventually comes to rest.”
That is a reasonable answer and matches with what we would expect to happen.
I’m assuming that if I asked you the same question, your answer would be about the same. You see, since the days of being a baby and a toddler, you eventually figured out that a rubber ball dropped from shoulder height will fall to the ground and then bounce. You’ve seen it happen, maybe zillions of times during your lifetime.
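For readers who want to see the arithmetic behind the shrinking bounces described in the AI response above, here is a minimal Python sketch. It assumes a simple textbook model that is not part of the ChatGPT exchange: each rebound speed is a fixed fraction of the impact speed (a coefficient of restitution, set here to an illustrative 0.8), air resistance is ignored, and shoulder height is taken to be 1.4 meters. Since peak height scales with the square of launch speed, each peak is the previous one multiplied by the restitution squared. The function name and parameters are hypothetical.

```python
def bounce_heights(drop_height_m, restitution, min_height_m=0.01):
    """Return the peak height (in meters) reached after each successive bounce.

    Assumed model: rebound speed = restitution * impact speed, no air drag.
    Peak height is proportional to speed squared, so every bounce peaks at
    restitution**2 times the previous peak.
    """
    if not 0.0 <= restitution < 1.0:
        raise ValueError("restitution must be in [0, 1)")
    if min_height_m <= 0.0:
        raise ValueError("min_height_m must be positive")

    heights = []
    height = drop_height_m * restitution ** 2
    # Stop once the bounce is too small to matter (the ball "comes to rest").
    while height >= min_height_m:
        heights.append(height)
        height *= restitution ** 2
    return heights


# Illustrative values only: 1.4 m shoulder height, restitution of 0.8.
peaks = bounce_heights(drop_height_m=1.4, restitution=0.8)
for bounce_number, peak in enumerate(peaks, start=1):
    print(f"Bounce {bounce_number}: {peak:.3f} m")
```

Each printed peak is lower than the one before it, which matches the qualitative description of the ball gradually coming to rest.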
AI Data Trained On The Physical World
So far, so good.
Next, I will ask ChatGPT how the answer that the AI gave was actually derived or determined.
- My entered prompt: “How do you know that the rubber ball will act as you’ve described? What is the basis for making your assertion?”
- Generative AI response: “I was trained on vast amounts of text, which includes scientific explanations, educational materials, and everyday observations. According to that data, gravity is a universal force that causes objects to accelerate toward the Earth’s surface when unsupported. This principle is foundational in physics and appears frequently in the training data.”
As shown in the response by ChatGPT, the method of figuring out what would happen once the rubber ball was let go consists of using the data training that the AI underwent when first being set up. The AI maker of ChatGPT, OpenAI, had scanned the Internet widely and used the various data on the Internet to establish patterns of how people write and describe things. In there, certainly, there would be plenty of content about physics and how physical objects in the real world move and act.
You might say that generative AI is all about a semblance of so-called book learning. But has generative AI observed a ball that falls and bounces? Was that part of how the AI was able to formulate a suitable response?
Let’s ask.
- My entered prompt: “Have you experienced this in terms of having dropped a rubber ball from shoulder height and observed the physical ramifications?”
- Generative AI response: “No, generative AI does not “experience” dropping a rubber ball or observe the physical ramifications firsthand because it lacks a physical body, sensory perception, or the ability to directly interact with the physical world. My knowledge is derived entirely from the textual data it has been trained on, which includes descriptions, explanations, and observations provided by humans who have experienced or studied such scenarios. AI does not “see,” “feel,” or “do” in the way humans or robots with sensors and actuators might. Instead, it simulates understanding by processing patterns and relationships from its training data.”
Aha, the AI is doing its calculations and responses based on data and not on any real-world first-hand experience.
I know that smarmy types will exhort that everyone knows that to be the case, but I dare suggest that it isn’t as widely understood as the cheeky cynics might declare. People are often unsure of what led to generative AI being able to respond. There is a tad bit of mystery at times and users just assume something magical is taking place.
Physical AI Is The Next Frontier
I’ve walked you through that slog to point out that one of the greatest limitations or weaknesses of the awe-inspiring rise of generative AI is that the AI doesn’t “know” the physical world in any meaningful way other than via having read about it.
This presents a problem, a totally huge problem.
If we are going to connect generative AI to all kinds of robots and other machines that are wandering around in our homes, offices, factories, streets, and the like, we ought to expect that the AI will do so properly, safely, and with aplomb.
Can an AI that only has text-based data training adequately control and direct those real-world machines as they mix among people?
Some assert that this is a highly dangerous concern. The generative AI ostensibly uses book learning to guess what will happen when a robot is instructed by the AI to lift a chair or hold aloft a dog. Is that good enough to cope with the myriad of aspects that can go wrong? Perhaps the AI will by text-based logic assume that if the dog is dropped, it will bounce like a rubber ball. Ouch, the dog might not be amused.
The crux is that with ubiquitous AI-driven machinery and robots on the horizon in both the near term and the longer term, there are two major routes to getting generative AI up-to-speed on the physical world:
- (1) Generative AI physicality by words. Include more and more text-oriented content about how the real world operates and the nature of physics and physical actions into the pattern-matching data training of generative AI. Keep doing this to provide as immense a depth on the heady matter as feasible.
- (2) Generative AI physicality by deeds. Work on having generative AI be directly engaged in operating physical machinery such that the act of physical movement and manipulation is garnered by interaction and not by bookish approaches alone. Get the AI to then actively pattern-match on what the sensory devices such as cameras, radar, and the like are feeding to the AI. This gets us beyond the prevailing text-based book learning.
The latest catchphrase for AI that has been mindfully calibrated and prepared to cope with the physical world is Physical AI.
Physical AI Is Being Avidly Pursued
AI researchers are scurrying to craft Physical AI.
The future depends on this capability. Machines and robots are going to be built and shipped to work side-by-side with humans. Physical AI will be the make-or-break of whether those mechanizations are compatible with humans and operate properly in the real world or instead are endangering and harmful.
This is not a new idea.
AI that operates in the physical realm has been around since the earliest days of the AI field. The somewhat new angle is that we will have generative AI working at the core of Physical AI, which then we might coin as Generative Physical AI.
I’ll toss something else into this equation. You perhaps know that Agentic AI is the biggest buzzword right now and we are heading smackdab into AI agents this year.
The notion of agentic AI is that we could have multiple generative AI instances serving as your agents or assistants to accomplish some particular tasks. For example, you might invoke an agentic AI that would book your hotel rooms and flights for a vacation trip. You merely tell the agentic AI in natural language and broad terms what you want to be done, and the AI will launch a multitude of subcomponents that search the web for hotels, find a suitable hotel, book the rooms for you, search for flights, find a suitable flight, book the flight for you, and then lay out your whole itinerary and bookings for your use.
Easy-peasy.
For more about the ins and outs of Agentic AI, see my detailed coverage at the link here.
We currently have three keystone AI aspects that are in the news and evolving at the same time:
- Generative AI
- Agentic AI
- Physical AI
Permutations And Combinations Galore
The twist is this.
You can have Physical AI that is not beholden to either of those other two types of AI. It is simply Physical AI and doesn’t have generative AI or agentic AI involved. But, more likely, Physical AI will be built on top of or at least in conjunction with either or both of those other two types of AI.
The combinations consist of these four major options:
- (1) Barebones Physical AI. No other AI is underneath or intricately connected (but, this is a slippery definition, as mentioned next).
- (2) Generative Physical AI. This is Physical AI that is based upon or intricately linked with Generative AI.
- (3) Agentic Physical AI. This is Physical AI that is based upon or intricately linked with Agentic AI.
- (4) Generative Agentic Physical AI. This is Physical AI that is based upon or intricately linked with both Generative AI and Agentic AI.
Please be aware that the terminology is not yet cast in stone.
The realm of Physical AI is in flux, and some prefer to simply say “Physical AI” rather than having to say Generative Physical AI, so they presume that any mention of Physical AI is tantamount to indicating Generative Physical AI.
The same goes for the other concoctions, namely that someone into Agentic Physical AI would say that “Physical AI” means that type of AI, and the mouthful Generative Agentic Physical AI is more shortly depicted as just Physical AI.
Make sure to ask a person who is spouting the heralded possibilities of Physical AI what they exactly mean by their use of the budding catchphrase.
The Deal Is Embodied Intelligence
I will tie this back to my comments about the duality of mind and body. This is something I do cautiously since I don’t want to anthropomorphize AI. Bear with me as I proceed.
A heated debate in the AI community entails whether we truly need to have AI essentially “embody” physicality considerations such as my point earlier of generative AI by words versus generative AI by deeds. With a grain of salt, analogize this to the human mind that is without a body. Modern-day generative AI is considered in that same boat. Maybe the AI must have a “body” or at least a semblance of embodiment to fully grasp how to operate in the physical world.
This leads to two heavyweight questions:
- Does the only viable path toward AI that controls and drives robots, and the like, consist of pursuing the deeds path that will “embody” the physical considerations of the real world, or can we get along suitably with just more words?
- Can we potentially skirt around tangible physical interactions by using faked simulations to data train contemporary AI on the physical world (you know, like those video games that people play where the simulated environment tries to portray the effects of gravity, friction, etc.)?
Research on the enthralling topic of “embodied intelligence” encompassing both humankind and artificial or AI kind is wrestling mightily with these provocative and unresolved questions.
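To make the simulation idea in the second question a bit more concrete, here is a minimal Python sketch of how a simulated environment could churn out sensor-like observations of a dropped ball, which an AI could then pattern-match on in place of real-world trials. Everything in it is an illustrative assumption rather than a description of any actual Physical AI system: the one-dimensional world, the time step, the restitution value, the optional sensor noise, and the function name are all hypothetical.

```python
import random

GRAVITY = 9.81  # meters per second squared, acting downward


def simulate_drop(drop_height_m, restitution, dt=0.001,
                  duration_s=3.0, noise_m=0.0, seed=0):
    """Step a 1-D bouncing ball through time.

    Returns a list of (time_s, observed_height_m) samples, one per step,
    mimicking what a height sensor or camera tracker might report.
    """
    rng = random.Random(seed)
    height, velocity = drop_height_m, 0.0
    samples = []
    for step in range(round(duration_s / dt)):
        # Semi-implicit Euler: update velocity first, then position.
        velocity -= GRAVITY * dt
        height += velocity * dt
        if height < 0.0:
            # The ball reached the floor: clamp it and reverse a reduced velocity.
            height = 0.0
            velocity = -velocity * restitution
        # Optional Gaussian noise stands in for imperfect sensing.
        observed = height + rng.gauss(0.0, noise_m)
        samples.append((step * dt, observed))
    return samples


# Illustrative run: 1.4 m drop, restitution of 0.8, noise-free observations.
samples = simulate_drop(drop_height_m=1.4, restitution=0.8)
print(f"Generated {len(samples)} simulated observations")
```

Varying the drop height, restitution, and noise across many runs would yield a large synthetic dataset without anyone dropping a real ball, though whether such faked data is an adequate substitute for tangible interaction is exactly the open question noted above.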
The Future Is Running Towards Us
Hark back to my dialogue with ChatGPT about how the AI figured out that a rubber ball would drop to the ground and bounce. The AI told the truth, namely that it was speculation based on text-based pattern-matching of content that ChatGPT had been initially data trained on.
The pursuit of embodied intelligence amid Physical AI might soon get this type of answer instead.
- My entered prompt: “Have you experienced the physical aspects in terms of having dropped a rubber ball from shoulder height and observed the physical ramifications?”
- Generative AI response: “Yes, I have previously dropped a rubber ball from shoulder height using a tightly integrated robot that was standing upright at a height of six feet. The robot used its sensors including vision cameras to observe the ball fall to the floor and bounce. Thus, the explanation of what would happen in your stated scenario is based on the “experience” via connected robots performing this kind of task and then analyzing what occurred, along with pairing this physical observation with the data training that I have encountered.”
You can imagine the thorny AI ethical facets that are going to arise from the advent of widespread Physical AI, plus the likely need for appropriate AI-related laws and regulations, which I’ve covered at the link here.
A final thought for now.
Isaac Asimov famously penned this as the first of his Three Laws of Robotics: “A robot may not injure a human being, or, through inaction, allow a human being to come to harm.”
Amid the rush to devise and field Physical AI, it is incumbent upon AI makers and AI developers to keep in mind those wise words about what can happen once AI is operating in the physical world. Real-world consequences can arise. It is one thing for someone to use generative AI that merely tells them to do something untoward, which is just all talk, while an AI instruction to a robot to do something troublesome is going to be carried out in the physical world.
Let’s aim to make sure that Physical AI is also safe AI.