In today’s column, I will closely examine a crucial feature of OpenAI’s latest ChatGPT-like model known as o1. The newly released o1 seems to have a limitation or top-out that has been roundly bandied about on social media, raising heated qualms and stoking controversy. To some extent, the presumed constraint can potentially shortchange generated answers.
Let’s talk about it.
In case you need some comprehensive background about o1, take a look at my overall assessment in my Forbes column (see the link here). I subsequently posted a series of pinpoint analyses covering exceptional features, such as a new capability encompassing automatic double-checking to produce more reliable results (see the link here).
AI-Based Chain-of-Thought Reasoning Is The Feature
I’ll begin with a quick overview of AI-based chain-of-thought reasoning or CoT. This is the feature in o1 that purportedly contains the top-out of concern.
When using conventional generative AI, there is research heralding chain-of-thought reasoning as a processing approach that can potentially achieve greater results from AI. A user can tell the AI to proceed on a step-at-a-time basis, forming a logically assembled chain of thoughts, akin to how humans seem to think (well, please be cautious about overstating or anthropomorphizing AI). Using chain-of-thought seems to drive generative AI toward being more systematic rather than rushing to derive a response. Another advantage is that you can then see the steps that were undertaken and decide for yourself, by inspection, whether the AI seemed to be logically consistent.
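To make the idea concrete, here is a minimal sketch of how a user might invoke chain-of-thought prompting with a conventional generative AI via the OpenAI Python SDK. The model name and the exact wording of the instruction are illustrative assumptions on my part, not a recommendation of any particular setup.

```python
# A minimal sketch of chain-of-thought prompting with a conventional
# generative AI model. The model name and prompt wording are illustrative
# assumptions; swap in whatever model and instructions you actually use.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

question = "A train travels 60 miles in 1.5 hours. At that pace, how far does it go in 4 hours?"

# Plain prompt: just ask for the answer.
plain = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{"role": "user", "content": question}],
)

# Chain-of-thought style prompt: explicitly ask for step-at-a-time reasoning.
cot = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{
        "role": "user",
        "content": "Work through this step by step, numbering each step, "
                   "and then state the final answer.\n\n" + question,
    }],
)

print(plain.choices[0].message.content)
print(cot.choices[0].message.content)
```

The second request is the chain-of-thought style: the visible numbered steps let you inspect whether the reasoning holds together, which is the advantage noted above.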
OpenAI’s latest model o1 takes this to an interesting extreme.
The AI maker has opted to always force o1 to undertake a chain-of-thought approach. The user cannot turn it off, nor sway the AI from doing a CoT. The upside is that o1 seems to do better on certain classes of questions, especially in the sciences, mathematics, and programming or coding tasks. A downside is that the extra effort means that users pay more and must wait longer to see the generated results.
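As a rough illustration of that cost downside, the hidden chain-of-thought consumes billable tokens on top of the visible answer. The sketch below shows how that usage might be inspected; the model name and the usage field names are assumptions based on OpenAI’s API at the time of writing and may differ in your SDK version or account.

```python
# Sketch: inspect how many hidden reasoning tokens an o1-style request used.
# Model name and usage field names are assumptions that may differ in your
# SDK version; the point is that the chain-of-thought is billed even though
# you never see the raw steps.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="o1-mini",  # illustrative o1-family model name
    messages=[{"role": "user", "content": "How many primes are there below 100?"}],
)

usage = response.usage
print("visible completion tokens:", usage.completion_tokens)

# Hidden chain-of-thought tokens are reported separately (field names may vary).
details = getattr(usage, "completion_tokens_details", None)
if details is not None:
    print("hidden reasoning tokens:", details.reasoning_tokens)
```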
Chain-Of-Thought And How Far It Can Go
There is the famous one-liner by actor Clint Eastwood that a person has got to know their limitations. In a modern era, we can upgrade the line to say that AI has got to know its limitations too.
I’ll start by discussing human limitations and make my way to AI limitations.
One limitation that humans have is how deep they can mentally go when devising a semblance of logical steps in their heads. For example, suppose that you are playing chess. You might first think about your immediate next move. If you are any good at chess, you likely aim to anticipate the countermove that your opponent might take. Given that potential countermove, which hasn’t yet happened and is only in your mind, you might want to consider your second move that responds to that countermove.
On and on this goes, with each in a series of chess moves playing out in your mind. It is all imaginary and a means of trying to figure out whether your immediate move at hand is going to be valuable in the long run. The world’s best chess players can envision many moves ahead. An everyday chess player is lucky if they can keep a handful of anticipated moves in their mind at one time. An occasional chess player might only be able to do a look-ahead of one or two moves.
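For readers who like to see the idea in computational terms, this kind of chess look-ahead is essentially a depth-limited search: you explore moves and countermoves only so many plies deep, and then you must evaluate whatever position you reached. Here is a minimal, game-agnostic sketch of that notion; the placeholder functions and the idea of a simple evaluation score are my own illustrative assumptions, not anyone’s actual chess engine.

```python
# Minimal sketch of depth-limited look-ahead (a bare-bones minimax).
# The game-specific pieces (legal_moves, apply_move, evaluate) are
# hypothetical placeholders you would supply for chess or any other game.

def lookahead_value(state, depth, maximizing, legal_moves, apply_move, evaluate):
    """Estimate how good `state` is, searching at most `depth` moves ahead."""
    moves = legal_moves(state)
    if depth == 0 or not moves:
        # We've hit our look-ahead limit (or the game ended):
        # we must go with whatever evaluation we can make right here.
        return evaluate(state)
    values = (
        lookahead_value(apply_move(state, m), depth - 1, not maximizing,
                        legal_moves, apply_move, evaluate)
        for m in moves
    )
    return max(values) if maximizing else min(values)
```

The key point is the `depth == 0` cutoff: a grandmaster effectively runs this with a far larger depth than a casual player can manage.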
The crux is that mentally there are only so many steps that human minds generally seem to be able to retain for a given mental exercise.
Why might this be an issue or problem?
Because at some point you cut off the look-ahead and must go with whatever you were able to mentally formulate. That’s partially why an expert chess player can beat less seasoned players. The top pros can mentally formulate options much further ahead and thus produce a better next move based on the sizable analysis they’ve undertaken in their noggin.
Coping With Limitations On Looking Ahead In Your Mind
Yes, a person does need to know their limitations.
If you can identify how much of a look-ahead you can mentally do, this is something you should consider when trying to solve problems. Different types of problems might necessitate different depths of mental formulation. Perhaps you know a lot about cars; your ability to do look-aheads when trying to fix a car problem might be pretty high. Suppose you rarely play chess; your capability of looking ahead in chess might be quite low.
The results of your problem-solving are bound to be shaped by those limitations.
Ultimately, when you reach a boundary of your contextually specific maximum look-ahead, many people start to lose their train of thought. If you are only adept at looking four steps ahead in chess, and yet try to reach ten steps, the odds are that the steps beyond the first four are going to get jumbled up. You can’t keep those remaining steps straight in your mind. This can leak over into the first four steps and the entire line of reasoning begins to fall apart.
Consider two ways to possibly cope with this:
- Bad approach: Get caught off-guard, keep mentally stepping forward, and your train of thought goes awry.
- Good approach: Anticipate your step size limits and stop at your self-determined maximum.
When approaching a problem, an astute problem solver tries to anticipate how many mental steps they can handle for the situation at hand. You seek to gauge whether the step depth is going to be sufficient. If a circumstance requires fifteen steps and you can only do five, you know that this is going to compromise your effort. Either don’t proceed, or aim to find some alternative means of solving the problem. The other choice is to go ahead anyway, knowing that you are going to shortchange what seemingly needs to be mentally performed.
I believe that’s what Clint Eastwood was trying to say, though he did so far more succinctly in a memorable one-liner.
AI-Based Chain-Of-Thought Getting To Its Limits
Shifting gears, I’d like to talk about AI-based chain-of-thought and utilize a similar notion, namely that there are limits to the number of steps that an AI system has been devised to handle. One thing before we leap into the fray: I am not suggesting that human thought and AI computational formulations are the same. They aren’t. Do not inadvertently anthropomorphize AI due to this analogous consideration.
Suppose you ask a generative AI app to solve a problem. If the AI has been devised to make use of a chain-of-thought approach, a series of steps will be calculated. After formulating the steps, an answer will presumably be figured out. You are shown the answer and can potentially see the steps that were used to arrive at the response.
How many steps or look-ahead can the generative AI do?
Well, it depends.
The AI developers likely had to make tough decisions on how far this might reach. Generally, the more steps there are, the more computational processing is required. This chews up cycles on the servers running the AI. Users will need to pay for those expensive computing cycles and possibly for the amount of computer memory utilized during the processing.
At some point, the AI builders have to set boundaries. Without any boundaries, the AI would keep whirling and calculating, possibly until the end of time. People won’t wait that long for answers. AI has got to know its limitations.
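A loose sketch of what such a boundary might look like inside a chain-of-thought loop is shown below. To be clear, this is not OpenAI’s implementation, which has not been disclosed; it is a generic, hypothetical illustration of why a step cap (and a token budget) has to exist somewhere.

```python
# Hypothetical sketch of a step-capped chain-of-thought loop.
# None of this reflects o1's actual (undisclosed) internals; it simply shows
# why some boundary must exist so the AI doesn't keep whirling forever.

MAX_STEPS = 200        # illustrative cap on reasoning steps
MAX_TOKENS = 30_000    # illustrative budget on generated reasoning tokens

def run_chain_of_thought(problem, propose_next_step, is_solved, count_tokens):
    """Generate reasoning steps until solved or a boundary is hit."""
    steps, tokens_used = [], 0
    while len(steps) < MAX_STEPS and tokens_used < MAX_TOKENS:
        step = propose_next_step(problem, steps)   # hypothetical step generator
        steps.append(step)
        tokens_used += count_tokens(step)
        if is_solved(problem, steps):
            return steps, "solved"
    # The boundary was reached before the problem was solved:
    # any answer produced from here on is, in effect, shortchanged.
    return steps, "truncated"
```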
You might be curious about the number of steps that might be needed to solve a given problem. One of my favorite examples is a mathematical proof that was noted by the Guinness Book of World Records for the enormous number of its proof steps. Often referred to as the Enormous Theorem, the proof runs from roughly 10,000 to 15,000 pages, with dozens or more steps on many of those pages.
Potential Impacts On o1 Chain-Of-Thought
Various social media postings have been pointing out that o1 at times appears to max out on the allowed size of a derived chain-of-thought.
The proprietary nature of o1 means that OpenAI has not divulged all the particulars of what is going on inside the AI. Speculation has arisen about the size limit for a generated chain-of-thought. Some say it is in the low hundreds of steps. Others note that the size limit might be impacted by the complexity of the steps. For example, it could be that if simpler steps are involved, the size can stretch much higher, such as into the thousands of steps or more.
Another point to be made is that this is probably something that can be established as a parameter. Someone willing to pay more and get more extensive chain-of-thought processing might be able to have their parameters set accordingly. A cynical retort is that maybe the inner mechanisms can only handle a certain size, thus there is an upper bound and the parameter can only be set so high.
The good news is that this type of limitation can be studied and analyzed. Even without access to the inner mechanisms, an AI empiricist can perform various experiments to try to discern the boundaries. I am fully expecting to see such research studies undertaken and posted soon. I’ll keep you apprised.
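One straightforward way such an empirical probe might be structured is to feed the AI synthetic problems whose required number of steps you control, and watch where the quality collapses. The sketch below uses long chained-arithmetic problems as the controllable workload; the model name and the overall setup are illustrative assumptions on my part, not a published methodology.

```python
# Sketch of a black-box probe of chain-of-thought limits: generate problems
# that require N sequential steps, increase N, and record where accuracy
# falls off. Model name and setup are illustrative assumptions.
import random
from openai import OpenAI

client = OpenAI()

def make_chain_problem(n_steps):
    """Build an arithmetic chain that genuinely needs n_steps sequential additions."""
    numbers = [random.randint(1, 9) for _ in range(n_steps)]
    prompt = ("Start at 0, then add these numbers one at a time, showing each step: "
              + ", ".join(map(str, numbers)) + ". What is the final total?")
    return prompt, sum(numbers)

def probe(depths=(10, 50, 100, 200, 400), trials=5):
    for depth in depths:
        correct = 0
        for _ in range(trials):
            prompt, answer = make_chain_problem(depth)
            reply = client.chat.completions.create(
                model="o1-mini",  # illustrative model name
                messages=[{"role": "user", "content": prompt}],
            ).choices[0].message.content
            correct += str(answer) in reply
        print(f"{depth} steps: {correct}/{trials} correct")

probe()
```

A sharp drop in accuracy at some depth would hint at where a boundary kicks in, even though the raw chain-of-thought itself stays hidden.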
Why This Matters
A big question is what happens once the AI reaches whatever maximum is possible.
One possibility is that the AI keeps going but starts to get into deep trouble due to exceeding various data structure limits and other technical constraints. Errors might creep into the process. The chain-of-thought might go off the rails. The AI might not computationally detect this. Thus, it is conceivable that an answer will be reached that is wrong or messed up. The answer could be displayed to the user, and they might be blissfully unaware that the AI went haywire along the way.
Suppose that to avoid that possibility, the AI is set to stop at an internally noted limit. That seems sensible and fair. The thing is, will the user be made aware of this? In other words, you could suggest that the user is going to get an answer that was shortchanged due to stopping at a boundary that otherwise is somewhat arbitrary. Had the AI kept going, perhaps a better answer would have been derived.
Do you think that users of generative AI should be made aware of whether a chain-of-thought was topped out and curtailed due to an imposed size limitation?
One argument is that this ought to be shown so that users know what is taking place. Be upfront and transparent. Others insist this is not something worth worrying users about. It is an internal factor. Don’t confuse them with nitty-gritty details. Just answer their questions and move on.
Behind The Scenes Versus What You See
A related consideration entails what is displayed for the chain-of-thought.
Allow me to explain.
There is a twist associated with o1 and the chain-of-thought feature. You see, there is a hidden or raw chain-of-thought that the AI maker has decided they don’t want you to see (for my coverage on this, see the link here).
This is kept hidden from view for various reasons, including that maybe some could ferret out the secret sauce of o1 by cunningly examining the true chain-of-thought at play. So, instead, the chain-of-thought you see is a summarized or transformed chain-of-thought. It is some representation of the actual chain-of-thought but we don’t know how it maps to the raw chain-of-thought.
Going back to the limitation aspects, suppose the raw chain-of-thought is capped at some number of steps M. Suppose further that the displayed chain-of-thought shows N steps, where N is less than M. You might not realize that the true chain-of-thought hit the limit. All you are being shown is the made-up chain-of-thought. In that sense, the displayed chain-of-thought is not a viable gauge of whether a limit has been reached.
The other side of the coin is possible too. Suppose the internal chain-of-thought went to some number of steps R. Let’s pretend that R isn’t at a limit and was just the number of steps required to solve the problem. The displayed chain-of-thought might be devised to show S steps, where S is a lot higher than R. It appears that many more steps took place. Thus, again, you cannot easily judge what the true internal limit is.
A conundrum.
Your Crucial Takeaways
Always inspect the AI answer generated and likewise closely inspect the displayed AI chain-of-thought.
It could be that you’ll observe some steps that don’t seem right. There is a chance that those steps or missteps arose due to the chain-of-thought losing its train of thought. The answer generated then ought to be especially suspect.
What can you do?
Well, if you have a problem that you anticipate might require loads and loads of steps, consider breaking the matter down into subproblems. Do each subproblem, one at a time. You are less likely to come across any boundary limits. Of course, that’s not always the case. Furthermore, problematically, sometimes breaking down a huge problem into subproblems is not feasible or will otherwise not lend itself to producing a final answer of interest.
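As a rough illustration of that divide-and-conquer advice, here is a sketch that splits a large task into subproblems, asks the AI about each one separately, and then asks it to combine the partial answers. The model name and the particular decomposition strategy are my own illustrative assumptions.

```python
# Sketch: break a step-heavy problem into subproblems so no single request
# is likely to bump into a chain-of-thought size limit.
# Model name and decomposition approach are illustrative assumptions.
from openai import OpenAI

client = OpenAI()

def ask(prompt):
    response = client.chat.completions.create(
        model="o1-mini",  # illustrative model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

def solve_in_pieces(big_problem, subproblems):
    """Solve each subproblem separately, then ask the AI to stitch them together."""
    partial_answers = [ask(f"Solve just this piece:\n{piece}") for piece in subproblems]
    combined = "\n\n".join(
        f"Piece {i + 1}: {piece}\nAnswer: {ans}"
        for i, (piece, ans) in enumerate(zip(subproblems, partial_answers))
    )
    return ask(f"Using these partial results, answer the overall question:\n"
               f"{big_problem}\n\n{combined}")
```

Each request only has to carry the reasoning for its own piece, which is the whole point of the decomposition.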
At least you are aware that limitations might enter into what is taking place with the AI.
Humans have their limitations. AI has its limitations. Limitations abound. I realize that might seem gloomy. Let’s not end this discussion on a sour note.
This might be more uplifting. The legendary yoga master B.K.S. Iyengar said this about limits: “We can rise above our limitations, only once we recognize them.” You see, that’s the ticket: make sure we know our limitations and are willing to recognize them.
The subsequent steps are up to us.