TechCrunch News, October 17, 2024
Meta’s AI chief says world models are key to ‘human-level AI’ — but it might be 10 years out

 

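The article examines the limitations of current AI models, which have made some progress but remain far from human-level. It also introduces the world model concept put forward by Yann LeCun, explains its principles, advantages, and challenges, and notes that several AI labs are researching world models.

🧠 Limitations of current AI systems: language models are one-dimensional predictors, and AI image/video models are two-dimensional predictors. They predict well in their respective dimensions but cannot truly understand the three-dimensional world, and they struggle with simple tasks that humans complete easily.

🌟 The world model concept and how it works: a world model is a mental model of how the world behaves. It predicts what the world will look like from the information it is given, and it forms a plan of action toward a goal by imagining a series of possible actions and their effects.

💪 Advantages and challenges of world models: they can take in more data than LLMs, but they are computationally intensive. Several AI labs are researching them, but building such a system faces many difficulties and will take years, possibly a decade.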

Are today’s AI models truly remembering, thinking, planning, and reasoning, just like a human brain would? Some AI labs would have you believe they are, but according to Meta’s chief AI scientist Yann LeCun, the answer is no. He thinks we could get there in a decade or so, however, by pursuing a new method called a “world model.”

Earlier this year, OpenAI released a new feature it calls “memory” that allows ChatGPT to “remember” your conversations. The startup’s latest generation of models, o1, displays the word “thinking” while generating an output, and OpenAI says the same models are capable of “complex reasoning.”

That all sounds like we’re pretty close to AGI. However, during a recent talk at the Hudson Forum, LeCun undercut AI optimists, such as xAI founder Elon Musk and Google DeepMind co-founder Shane Legg, who suggest human-level AI is just around the corner.

“We need machines that understand the world; [machines] that can remember things, that have intuition, have common sense, things that can reason and plan to the same level as humans,” said LeCun during the talk. “Despite what you might have heard from some of the most enthusiastic people, current AI systems are not capable of any of this.”

LeCun says today’s large language models, like those that power ChatGPT and Meta AI, are far from “human-level AI.” Humanity could be “years to decades” away from achieving such a thing, he later said. (That doesn’t stop his boss, Mark Zuckerberg, from asking him when AGI will happen, though.)

The reason is straightforward: those LLMs work by predicting the next token (usually a few letters or a short word), and today’s image/video models are predicting the next pixel. In other words, language models are one-dimensional predictors, and AI image/video models are two-dimensional predictors. These models have become quite good at predicting in their respective dimensions, but they don’t really understand the three-dimensional world.

Because of this, modern AI systems cannot do simple tasks that most humans can. LeCun notes how humans learn to clear a dinner table by the age of 10, and drive a car by 17 – and learn both in a matter of hours. But even the world’s most advanced AI systems today, built on thousands or millions of hours of data, can’t reliably operate in the physical world.

To accomplish more complex tasks, LeCun suggests we need to build three-dimensional models that can perceive the world around them, built around a new type of AI architecture: world models.

“A world model is your mental model of how the world behaves,” he explained. “You can imagine a sequence of actions you might take, and your world model will allow you to predict what the effect of the sequence of action will be on the world.”

Consider the “world model” in your own head. For example, imagine looking at a messy bedroom and wanting to make it clean. You can imagine how picking up all the clothes and putting them away would do the trick. You don’t need to try multiple methods, or learn how to clean a room first. Your brain observes the three-dimensional space, and creates an action plan to achieve your goal on the first try. That action plan is the secret sauce that AI world models promise.

Part of the benefit here is that world models can take in significantly more data than LLMs. That also makes them computationally intensive, which is why cloud providers are racing to partner with AI companies.

World models are the big idea that several AI labs are now chasing, and the term is quickly becoming the next buzzword to attract venture funding. A group of highly regarded AI researchers, including Fei-Fei Li and Justin Johnson, just raised $230 million for their startup, World Labs. Li, known as the “godmother of AI,” and her team are also convinced world models will unlock significantly smarter AI systems. OpenAI also describes its unreleased Sora video generator as a world model, but hasn’t gotten into specifics.

LeCun outlined an idea for using world models to create human-level AI in a 2022 paper on “objective-driven AI,” though he notes the concept is over 60 years old. In short, a base representation of the world (such as video of a dirty room) and memory are fed into a world model. The world model then predicts what the world will look like based on that information. Next, you give the world model objectives, including an altered state of the world you’d like to achieve (such as a clean room) as well as guardrails to ensure the model doesn’t harm humans to achieve an objective (don’t kill me in the process of cleaning my room, please). Finally, the world model finds an action sequence to achieve these objectives.
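The loop described above (predict, score against an objective, respect guardrails, pick an action sequence) can be illustrated with a toy sketch. This is not LeCun's architecture or anything from his paper: the room, the action names, the hand-written `world_model` function, and the brute-force search are all hypothetical stand-ins, and memory is left out entirely. A real world model would be learned from data such as video, and a real planner would not enumerate every sequence.

```python
from itertools import product

# Hypothetical toy world: the state is the set of items still on the floor.
ACTIONS = ["pick_up_shirt", "pick_up_socks", "throw_lamp", "wait"]

def world_model(state, action):
    """Predict the next state for an action.

    Hand-written for illustration; in a real system this prediction
    would come from a learned model, not from rules.
    """
    if action.startswith("pick_up_"):
        item = action[len("pick_up_"):]
        return state - {item}
    return state  # "wait" and "throw_lamp" leave the floor unchanged

def violates_guardrail(action):
    """Guardrail (assumed): some actions are never allowed."""
    return action == "throw_lamp"

def objective_cost(state):
    """Objective (assumed): a clean room has nothing left on the floor."""
    return len(state)

def plan(initial_state, horizon):
    """Imagine every action sequence up to `horizon` and keep the best one."""
    best_cost, best_seq = None, None
    for seq in product(ACTIONS, repeat=horizon):
        if any(violates_guardrail(a) for a in seq):
            continue  # discard plans that break a guardrail
        state = initial_state
        for action in seq:
            state = world_model(state, action)  # imagined, not executed
        cost = objective_cost(state)
        if best_cost is None or cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_seq, best_cost

messy_room = frozenset({"shirt", "socks"})
best_plan, remaining = plan(messy_room, horizon=2)
print(best_plan, remaining)
```

The point of the sketch is that no action is tried in the real room: every candidate plan is evaluated only inside the model's predictions, and the guardrail filters plans before the objective is even considered.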

Meta’s long-term AI research lab, FAIR (Fundamental AI Research), is actively working toward building objective-driven AI and world models, according to LeCun. FAIR used to work on AI for Meta’s upcoming products, but LeCun says the lab has shifted in recent years to focusing purely on long-term AI research. LeCun says FAIR doesn’t even use LLMs these days.

World models are an intriguing idea, but LeCun says we haven’t made much progress on bringing these systems to reality. There are many very hard problems to solve to get there from where we are today, and he says it’s certainly more complicated than we think.

“It’s going to take years before we can get everything here to work, if not a decade,” said LeCun. “Mark Zuckerberg keeps asking me how long it’s going to take.”
