What LLMs lack

Published on May 28, 2025 4:19 PM GMT

Introduction

I have long been very interested in the limitations of LLMs because understanding them seems to be the most important step to getting timelines right. 

Right now there seems to be great uncertainty about timelines, with very short timelines becoming plausible, but also staying hotly contested. 

This led me to revisit LLM limitations and I think I noticed a pattern that somehow escaped me before. 

Limitations

To recap, these seem to be the most salient limitations or relative cognitive weaknesses of current models: 

System 2 thinking: Planning. See the oddly persistent difficulty of getting models to play Tic-Tac-Toe perfectly, or to handle blocks world, chess, or anything else that has not been the subject of a lot of reasoning RL.

Dealing with new situations: Going out of distribution is a killer for all things DL. 

Knowledge integration: Models don't have automatic "access" to skills learned from separate modalities. Even within the same modality, skills are not robustly recallable, hence the need for prompting. Also related: Dwarkesh's question.

Learning while problem solving: Weights are frozen, and there is no way to slowly build up a representation of a complex problem unless the representations learned during training are already very close. This is basically knowledge integration during inference.

Memory: RAG is a hack. There is no obvious way to feed complex representations back into the model, mostly because such representations aren't built in the first place: the state of a transformer is spread over all the token and attention values, so recomputing those from the underlying text is the go-to solution (a toy sketch after this list illustrates the point).

Objectivity: See hallucinations. But also self-other/fact-fantasy distinction more generally.

Agency: Unexpectedly we got very smart models that are not very good at getting stuff done.

Cognitive control: The inability to completely ignore irrelevant information or, conversely, to treat certain tenets as absolute leads to jailbreaks and persistent trick-question failures, and is also a big part of the unreliability of models.
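To make the memory point concrete, here is a toy numpy sketch of a single attention step (not any particular model's API; all dimensions are made up for illustration). The per-token key/value activations are the closest thing the model has to a working memory of the context, and they are simply recomputed from the text whenever it is re-fed.

```python
# Toy sketch: a decoder-only transformer's "state" is spread over per-token
# key/value activations rather than held in one compact, integrated form.
# All sizes below are illustrative assumptions, not any real model's values.
import numpy as np

rng = np.random.default_rng(0)

D_MODEL = 64          # hidden size (tiny, for illustration)
N_TOKENS = 10         # context length so far

# Frozen projection weights (stand-ins for learned parameters).
W_q = rng.normal(size=(D_MODEL, D_MODEL))
W_k = rng.normal(size=(D_MODEL, D_MODEL))
W_v = rng.normal(size=(D_MODEL, D_MODEL))

# Token activations for everything read so far.
token_states = rng.normal(size=(N_TOKENS, D_MODEL))

# The "memory" of the context is this per-token K/V cache: it grows linearly
# with the number of tokens and is recomputed from the underlying text
# whenever the conversation is fed to the model again.
k_cache = token_states @ W_k          # (N_TOKENS, D_MODEL)
v_cache = token_states @ W_v          # (N_TOKENS, D_MODEL)

# Predicting the next token integrates that memory into a single small vector.
query = token_states[-1] @ W_q        # (D_MODEL,)
scores = k_cache @ query / np.sqrt(D_MODEL)
weights = np.exp(scores - scores.max())
weights /= weights.sum()
integrated = weights @ v_cache        # (D_MODEL,)

print(k_cache.shape, v_cache.shape)   # state spread over all tokens: (10, 64) each
print(integrated.shape)               # what the next-token decision sees: (64,)
```

The only "integrated" quantity here is the single attention output used for the next-token decision; everything else stays splintered across tokens, which is why there is no compact representation of the problem so far that could be stored and fed back in.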

One category

These seem like a mixed bag of quite different things, but I recently realised that they all belong to the same class of cognitive abilities: These are all abilities that in humans are enabled by and in fact require consciousness. 

Is "cognitive abilities enabled by consciousness" maybe a bit tautological? Unconscious people show little cognitive ability after all? 

But humans can do many cognitively demanding things without being conscious of them at that moment. The simplest example is driving a well-known route and arriving without any memory of the drive, which has probably happened to most of us.

Not having a memory of the drive is a tell that we weren't conscious of it but were probably attending consciously to something else, because conscious experience is necessary for memory formation.

Does this make sense?

Integrated Information Theory (IIT) and global workspace theory both tell us that consciousness is about information integration. Different sensory information and the results of subconscious processing are integrated into the coherent whole of what we are conscious of. The coherence of our experience tells us that the information is integrated and not just made available.

Knowledge integration, learning while problem solving and memory are all about integrating information into one coherent whole, while the rest of the limitations touch upon abilities that are based on the manipulation of the integrated information. 

Transformers, as they are currently trained, are limited when it comes to information integration for two reasons: 

1. The space into which information is integrated is comparatively small. While the brain subnetworks that hold the information we are conscious of probably contain at least hundreds of millions of neurons, the final token activation used to make a decision, i.e. the next-token prediction, contains only a couple of thousand entries (a few tens of thousands for the largest models). See the back-of-envelope sketch after this list.

2. Information is splintered into tokens. Why this matters becomes clear when we notice that there are cases where models manage impressive information integration during learning: already GPT-2 was able to translate between major languages despite seeing very little translation data. This is possible because all languages split into comparable tokens, so models can learn shared representation spaces for these tokens (think very similar representations of "dog", "chien", "hund", etc.). This breaks down across modalities, where tokens don't naturally share representation spaces, which is why a model might beat you at chess but cannot teach you chess: what it says about a game is mostly nonsense.
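To put rough numbers on the first point, here is a back-of-envelope comparison of the two integration spaces. The specific figures are illustrative assumptions in the spirit of the estimates above, not measurements.

```python
# Back-of-envelope sketch of the size gap described above.
# All numbers are assumed for illustration, not measured values.
d_model_small = 4_096        # final-token activation width, mid-sized LLM (assumed)
d_model_large = 20_000       # "a few tens of thousands" of entries, largest models (assumed)
workspace_neurons = 300e6    # "at least hundreds of millions" of neurons (assumed lower bound)

for d in (d_model_small, d_model_large):
    print(f"d_model={d:>6}: conscious workspace is ~{workspace_neurons / d:,.0f}x larger")
# d_model=  4096: conscious workspace is ~73,242x larger
# d_model= 20000: conscious workspace is ~15,000x larger
```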

The correspondence between "stuff LLMs tend to be comparatively bad at" and "stuff humans need conscious processing for" therefore seems to make sense based on the transformer architecture + data + training. (For what it's worth, I don't think state-space models come out much ahead here, because they are also trained on next-token prediction and integrate into a comparatively tiny vector.)
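To make that parenthetical concrete, here is a minimal sketch of the fixed-size state a linear state-space model carries (all dimensions made up for illustration): no matter how long the input gets, everything it has read is squeezed into one small vector.

```python
# Minimal sketch of the state-space-model point: a discretized linear SSM
# carries its entire memory of the sequence in a single fixed-size state.
# Sizes are illustrative assumptions, not any real model's values.
import numpy as np

rng = np.random.default_rng(1)

D_STATE = 16      # size of the recurrent state (tiny, for illustration)
D_INPUT = 8       # per-token input width
N_TOKENS = 1000   # length of the sequence being read

A = 0.95 * np.eye(D_STATE)               # state transition (toy, stable)
B = rng.normal(size=(D_STATE, D_INPUT))  # input projection

h = np.zeros(D_STATE)                    # the model's entire "memory"
for x in rng.normal(size=(N_TOKENS, D_INPUT)):
    h = A @ h + B @ x                    # 1000 tokens folded into D_STATE numbers

print(h.shape)  # (16,) -- however long the input, the state stays this small
```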

Conclusion

To my mind this satisfyingly delineates the dimensions along which LLMs are still lagging from those where they forge ahead. I don't think this is a very actionable insight, neither in terms of achieving AGI nor in terms of getting a clearer picture of timelines. 

However, it does make it clearer to me that there really is a qualitative algorithmic gap to AGI, and it also convinces me that LLMs are probably not (very) conscious.



