According to a research team at Apple, AI is not as advanced as experts have been led to believe. In other words, AI cannot actually “think.”
Apple’s AI research team tested 20 of the most popular large language models (LLMs), including GPT-4o, Llama, and Phi, and concluded that they are incapable of thinking. Instead, the research says, LLMs have improved their ability to copy what they learned from training and repeated usage. Apple published the paper in October, as the company integrates its own AI, Apple Intelligence, into its products.
“The fact that Apple did this has gotten a lot of attention, but nobody should be surprised at the results,” said AI critic Gary Marcus.
The fact that AI copies and recognizes patterns from its training instead of thinking means that it struggles to deal with new problems. Since AI is incapable of reasoning and analyzing, it instead falls back on patterns from training. This makes it look like AI is analyzing and solving new issues on the fly when it is actually just imitating reasoning without truly understanding or learning.
When LLMs are doing what they were trained to do, they are spectacular at getting the job done. However, when faced with something outside the data they were fed, they can sometimes be rendered utterly helpless.
When researchers presented AI with a simple math problem that a child could analyze and answer, a simple change of name or the addition of unrelated information as a distractor changed the LLMs’ responses by about 10 percent.
For instance, the following scenario might be presented to an LLM: If Addison had 8 apples in the bucket in her left hand and 10 apples in the other bucket in her right hand and the apples are of different colors and sizes, how many apples did Addison have in total?
There is a good chance that the extra information describing the apples Addison had in each bucket would throw off the LLM completely, leaving the user with an incorrect answer.
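For readers who want to try this kind of test themselves, here is a minimal Python sketch of the idea, not the Apple team's actual code. It builds variants of the apple question by swapping the name and numbers and optionally adding the irrelevant detail, then scores any answering function against the known sum. The template, the name list, and the `ask_model` callable are all hypothetical placeholders; `ask_model` stands in for whatever model a reader wants to check.

```python
import random

# Hypothetical template modeled on the apple example; not taken from the paper.
TEMPLATE = (
    "If {name} had {left} apples in the bucket in her left hand and "
    "{right} apples in the other bucket in her right hand{distractor}, "
    "how many apples did {name} have in total?"
)
NAMES = ["Addison", "Sally", "Maria"]  # illustrative names only
DISTRACTOR = " and the apples are of different colors and sizes"


def make_variant(rng, with_distractor):
    """Return (question, correct_answer) with a random name and numbers."""
    left = rng.randint(2, 20)
    right = rng.randint(2, 20)
    question = TEMPLATE.format(
        name=rng.choice(NAMES),
        left=left,
        right=right,
        distractor=DISTRACTOR if with_distractor else "",
    )
    return question, left + right


def accuracy(ask_model, variants):
    """Fraction of variants for which ask_model(question) equals the answer."""
    correct = sum(1 for question, answer in variants if ask_model(question) == answer)
    return correct / len(variants)
```

Comparing `accuracy` on a batch built with `with_distractor=False` against a batch built with `with_distractor=True` gives a rough sense of how much the unrelated detail alone moves a model's results.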
“There are some problems which you can make a bunch of money on without having a perfect solution,” Marcus said to the LA Times’ Michael Hiltzik. “But a calculator that’s right only 85 percent of the time is garbage.”
According to Apple’s research, AI may be incapable of reasoning, but that doesn’t discount its ability to solve problems it is trained on. If an LLM is given a problem it was in fact trained on, the user has a better than 95 percent chance of getting an accurate response. The model will even deliver a comprehensive breakdown of the solution should it be required.
However, AI is less advanced than initially believed and has a long way to go before it can potentially do the work of humans and replace them.