Artificial Fintelligence · October 22, 2024
Why do LLMs use greedy sampling?



This is a speculative article, so I’d greatly appreciate any feedback, particularly if you disagree.

When I began working on LLMs, I found it pretty surprising that the SOTA in generative text was to greedily sample from the bare outputs of the neural networks. In other words, with GPT-style models, the typical approach to generate a sequence of text is something like:

1. Run your prompt through your model and generate probabilities over your vocabulary.

2. Choose the most likely token (perhaps with some randomization, maybe with some preprocessing, like top-k or nucleus sampling).

3. If the chosen token is <|endoftext|>, you’re done; otherwise, concatenate the new token to your prompt and go back to 1.
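To make this concrete, here is a sketch of that decoding loop. `model` is a hypothetical stand-in for a GPT-style forward pass that returns next-token logits; with `rng=None` it is pure greedy decoding, and the optional `top_k` and sampling paths correspond to the parenthetical in step 2:

```python
import numpy as np

def decode(model, prompt_ids, eos_id, max_len=128, top_k=None, rng=None):
    """Autoregressive decoding loop. With rng=None this is pure greedy
    (argmax); pass an np.random.Generator to sample instead, optionally
    restricted to the top-k most likely tokens.

    `model(ids)` is a stand-in for a GPT-style forward pass returning a
    logits vector over the vocabulary for the next token.
    """
    ids = list(prompt_ids)
    for _ in range(max_len):
        logits = np.asarray(model(ids), dtype=float)
        if top_k is not None:
            # Step 2's preprocessing: mask everything below the k-th logit.
            cutoff = np.sort(logits)[-top_k]
            logits = np.where(logits >= cutoff, logits, -np.inf)
        if rng is None:
            next_id = int(np.argmax(logits))  # greedy: take the most likely token
        else:
            probs = np.exp(logits - logits.max())
            next_id = int(rng.choice(len(probs), p=probs / probs.sum()))
        if next_id == eos_id:   # step 3: stop at <|endoftext|>
            break
        ids.append(next_id)     # otherwise, concatenate and repeat
    return ids
```

The point is how little machinery there is: one forward pass per token, one local decision, no lookahead.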

In games RL research, it is common to instead conduct a much more complicated calculation to choose the next step in your sequence. For instance, AlphaZero uses a somewhat complicated algorithm called Monte Carlo Tree Search (MCTS). Here, I explore some reasons for why LLMs don’t use a fancier decoding algorithm.

But first, a caveat: there’s a lot of literature proposing various ways to do this that I’m not going to engage with, for the sake of time. I have a list of references at the end, which I’d encourage you to explore if you want more detail.

The current paradigm of language modelling, with GPT-style decoder models, uses greedy autoregressive sampling to generate a sequence of tokens. This is a somewhat surprising choice: if you look at the history of NLP research, particularly the Neural Machine Translation literature, beam search is often needed to reach SOTA performance (e.g. 1703.03906). Similarly, in games research, search is typically many times stronger than any pure neural network approach, and it strictly dominates wherever it’s feasible (the exceptions are games like Stratego, where the game tree has far too high a branching factor to be searched to any non-trivial depth). In games like Go, Chess, Poker, or Scotland Yard, search methods dominate.
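For contrast with greedy decoding, here is a minimal beam-search decoder of the kind the NMT literature relies on. As before, `model` is a hypothetical stand-in, here returning next-token log-probabilities; this is a sketch, not any particular system’s implementation:

```python
def beam_search(model, prompt_ids, eos_id, beam_width=4, max_len=32):
    """Minimal beam search: keep the `beam_width` highest-scoring partial
    sequences at each step instead of committing greedily to one token.

    `model(ids)` is a stand-in returning log-probabilities over the
    vocabulary for the next token.
    """
    beams = [(0.0, list(prompt_ids), False)]  # (log-prob, tokens, finished)
    for _ in range(max_len):
        candidates = []
        for score, ids, done in beams:
            if done:
                candidates.append((score, ids, True))  # keep finished beams as-is
                continue
            for tok, lp in enumerate(model(ids)):
                finished = tok == eos_id
                candidates.append(
                    (score + lp, ids if finished else ids + [tok], finished))
        # Prune to the beam_width best partial sequences.
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_width]
        if all(done for _, _, done in beams):
            break
    return beams[0][1]  # highest-scoring sequence
```

Because it defers commitment, beam search can prefer a token that looks slightly worse now but leads to a much higher-probability continuation, which is exactly the kind of sequence greedy decoding can never recover.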

By search, I am referring to algorithmic search, which I define as any method that uses additional compute at inference time to improve the answer. This has nothing to do with Google-style search (which I call “information retrieval”).
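Perhaps the simplest instance of this definition is best-of-n sampling: spend n times the inference compute, then keep the completion a scoring function likes best. A sketch, where `generate` and `score` are hypothetical stand-ins for a sampler and a reward or likelihood model:

```python
import random

def best_of_n(generate, score, prompt, n=8, seed=0):
    """Trade n times the inference compute for a better answer: draw n
    candidate completions and return the highest-scoring one.

    `generate(prompt, rng)` and `score(text)` are hypothetical stand-ins
    for a sampler and a reward/likelihood model.
    """
    rng = random.Random(seed)
    candidates = [generate(prompt, rng) for _ in range(n)]
    return max(candidates, key=score)
```

Even this trivially parallel scheme fits the definition: the only knob is how much extra compute you are willing to spend at inference time.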

So why don’t GPTs use search? Well, there are a few answers to this. The first one is a total copout:

    GPTs don’t use search as far as we know. OpenAI recently raised my eyebrows when they hired Noam Brown, an expert on search in games, to work on “multi-step reasoning, self-play, and multi-agent AI.” That sounds an awful lot like search to me (and, specifically, sounds a lot like Alpha/MuZero). We also know that Demis has talked about Gemini incorporating techniques from AlphaGo, which, again, makes me think about search (not to mention self-play).

    So it’s entirely possible that search is the secret sauce behind GPT-4’s performance, and the lack of it is why the open source world has been unable to match it. I’m suspicious of this (for reasons I’ll get into below), but if I were actively working on LLM research, I’d be focusing on trying to use search.

Let’s assume, then, that the key players (GPT-4, Claude, etc.) aren’t using search. Why?
