LessWrong, April 27, 09:02
Open Source LLM Pokémon Scaffold

 

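This post introduces an open-source version of the LLM Pokémon scaffold described in "Research Notes: Running Claude 3.7, Gemini 2.5 Pro, and o3 on Pokémon Red". Compared with the earlier version, the scaffold has several improvements, including presenting information as text rather than colored squares and showing relevant information directly where the model needs to see it. The prompts were also improved to help the model understand the game state, and a checkpoint tool, a detailed navigation tool, and an autopathing tool were added. These improvements help somewhat, but they do not make LLMs good at Pokémon.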

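🗺️ Text instead of colored squares: To help the LLM understand the game state, the scaffold no longer uses colored squares. It prints information directly on the screenshot as text, such as "Impassable", "Explored", and "Check Here". This presents the relevant information more directly, so the model does not have to infer it indirectly from a legend or instructions.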

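🤖 Improved prompting: The scaffold's prompts were reworked to improve the LLM's understanding of the game state. By giving guidance on which sources of information are trustworthy, they help the model grasp reality better and weigh the reliability of data from the game's RAM, its own knowledge, map labels, and its vision.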

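📍 Checkpoint tool: To help the model keep track of game progress, the scaffold introduces a "mark_checkpoint" tool. It lets the model record major checkpoints, such as "Left House", "Beat Misty", and "died to Brock".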

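🧭 Detailed navigation and autopathing tools: The scaffold also integrates a "detailed_navigation" tool and an autopathing tool. The detailed navigation tool calls an alternate model that explores the map without being told the goal, while being told to talk to NPCs and exit maps. The autopathing tool travels to known coordinates on the map.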

Published on April 27, 2025 12:57 AM GMT

This is a cleaned-up, open-source version of the LLM Pokémon Scaffold described in Research Notes: Running Claude 3.7, Gemini 2.5 Pro, and o3 on Pokémon Red. (It is forked from David Hershey of Anthropic's scaffold here; all development on top of that was done by my friend, not me.)

Since that post, a number of changes have been made to the scaffold. The major ones are:

    Instead of using colored squares on the game screenshots, information is printed as text, e.g. "Impassable", "Explored", "Check Here"
      Models are seemingly helped by putting relevant information blatantly in the spot they need to see it, rather than indirectly via a legend or instructions or whatever
      For some reason it helps if you write "CHECK HERE" on every unexplored tile.
    Automatically-updating ASCII collision map given to LLM
      Generated by code
      Uses numbers indicating how many moves away each tile is
      Behold, Pewter City.
    Improved prompts for "Critique Claude"/"Guide Gemini"/"Oversight o3"
      Prompt 1: Given a bunch of facts about the current game state and instructions on what is trustworthy and what's not, make a summary
        This is an attempt to get the model to grasp reality better by telling it which sources of information it should basically always trust (data from the game's RAM), mostly trust (its own knowledge of the game from training), not trust (map labels it made itself), and mostly distrust (its own vision)
      Prompt 2: Look at output from prompt 1 and try to remove inconsistencies
      Prompt 3: OK now talk to the model you're critiquing
    Models encouraged to use a "mark_checkpoint" tool to maintain a running list of major checkpoints (Left House, Beat Misty, died to Brock, etc.)
    "detailed_navigation" tool which, if called, calls an alternate model that basically rolls around trying to explore + DFS but isn't told what the goal is (but is told to talk to NPCs and exit maps)
    Autopathing tool that can travel to known coordinates on the map
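
To make the collision map and autopathing items above more concrete, here is a minimal sketch of how a "moves away" map and a path to known coordinates could be computed. This is not the scaffold's actual code. It assumes a breadth-first search over a small hand-written grid, whereas the real scaffold builds its map from game data and may use a different algorithm. The names `distance_map`, `render`, `path_to`, and `GRID` are all hypothetical.

```python
from collections import deque

WALL = "#"  # assumption: '#' marks an impassable tile, '.' a walkable one
STEPS = ((-1, 0), (1, 0), (0, -1), (0, 1))  # up, down, left, right


def distance_map(grid, start):
    """Breadth-first search from `start`; returns {(row, col): moves away}."""
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for dr, dc in STEPS:
            nr, nc = r + dr, c + dc
            in_bounds = 0 <= nr < rows and 0 <= nc < cols
            if in_bounds and grid[nr][nc] != WALL and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist


def render(grid, dist):
    """Draw the grid as text: '##' for walls, a move count for reachable tiles."""
    lines = []
    for r, row in enumerate(grid):
        cells = []
        for c, ch in enumerate(row):
            if ch == WALL:
                cells.append("##")
            elif (r, c) in dist:
                cells.append(f"{dist[(r, c)]:2d}")
            else:
                cells.append(" ?")  # walkable but unreachable from start
        lines.append(" ".join(cells))
    return "\n".join(lines)


def path_to(dist, target):
    """Walk back from `target` to the start by following decreasing distances."""
    if target not in dist:
        return None  # target is a wall or unreachable
    path = [target]
    r, c = target
    while dist[(r, c)] > 0:
        for dr, dc in STEPS:
            prev = (r + dr, c + dc)
            if dist.get(prev) == dist[(r, c)] - 1:
                r, c = prev
                path.append(prev)
                break
    path.reverse()
    return path


# Hypothetical 5x5 map; the player stands at row 1, column 1.
GRID = [
    "#####",
    "#..##",
    "#.#.#",
    "#...#",
    "#####",
]
DIST = distance_map(GRID, (1, 1))
print(render(GRID, DIST))
print(path_to(DIST, (2, 3)))
```

A single search yields both the numbers printed on the map and the route to any reachable coordinate, which is why this sketch combines the two features.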

 

All of this helps somewhat but doesn't make LLMs amazing at Pokémon by any means.


