MarkTechPost@AI May 7, 12:30
This AI Paper Introduces WebThinker: A Deep Research Agent that Empowers Large Reasoning Models (LRMs) for Autonomous Search and Report Generation

 

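WebThinker is a deep research agent proposed by researchers from Renmin University of China, the Beijing Academy of Artificial Intelligence (BAAI), and Huawei Poisson Lab. It aims to strengthen large reasoning models (LRMs) in complex information retrieval and scientific report generation. By combining the reasoning capabilities of LRMs with web information exploration, WebThinker lets a model autonomously search the web, navigate web pages, and write research reports. Experimental results show that WebThinker outperforms existing methods both at solving complex problems and at generating scientific reports, opening new possibilities for applying LRMs to knowledge-intensive tasks.

🌐 The core of WebThinker is the Deep Web Explorer module, which lets LRMs dynamically search, navigate, and extract web information during reasoning to fill knowledge gaps.

🧠 The framework adopts an "Autonomous Think-Search-and-Draft" strategy, so the model can interleave reasoning, information gathering, and report writing. It also uses an RL-based training strategy that improves research tool utilization through iterative online Direct Preference Optimization.

🚀 WebThinker operates in two primary modes: Problem-Solving Mode and Report Generation Mode. In Problem-Solving Mode, WebThinker handles complex tasks with the Deep Web Explorer tool; in Report Generation Mode, the LRM autonomously produces detailed reports and uses an assistant LLM to implement the report-writing tools.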

Large reasoning models (LRMs) have shown impressive capabilities in mathematics, coding, and scientific reasoning. However, when they rely solely on internal knowledge, they face significant limitations in handling complex information research needs: they struggle to conduct thorough web information retrieval and to generate accurate scientific reports through multi-step reasoning. Deeply integrating LRMs' reasoning capabilities with web information exploration is therefore a practical demand, and it has prompted a series of deep research initiatives. However, existing open-source deep search agents use retrieval-augmented generation (RAG) techniques with rigid, predefined workflows, which restricts LRMs' ability to explore deeper web information and hinders effective interaction between LRMs and search engines.

LRMs like OpenAI-o1, Qwen-QwQ, and DeepSeek-R1 enhance performance through extended reasoning. Various strategies have been proposed to achieve advanced reasoning capabilities, including deliberately introducing reasoning errors during training, training on distilled data, and using reinforcement learning to develop long chain-of-thought abilities. However, these methods are fundamentally limited by their static, parameterized architectures, which lack access to external world knowledge. RAG, by contrast, integrates retrieval mechanisms with generative models, enabling access to external knowledge. Recent advances in RAG span multiple dimensions, including retrieval necessity, query reformulation, document compression, denoising, and instruction-following.

Researchers from Renmin University of China, BAAI, and Huawei Poisson Lab have proposed a deep research agent called WebThinker that empowers LRMs to autonomously search the web, navigate web pages, and draft research reports during the reasoning process. WebThinker introduces a Deep Web Explorer module that enables LRMs to dynamically search, navigate, and extract information from the web when they encounter knowledge gaps. It employs an Autonomous Think-Search-and-Draft strategy, allowing models to interleave reasoning, information gathering, and report writing in real time. Moreover, an RL-based training strategy enhances research tool utilization through iterative online Direct Preference Optimization (DPO).
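To make the idea of searching when a knowledge gap appears more concrete, here is a minimal sketch of a think-search loop. It is not WebThinker's implementation: the `<search>` markers, the `run_agent` function, and the `toy_model` and `toy_search` stand-ins are all hypothetical names chosen for illustration, and a real system would call an LRM and a search engine where the stubs are.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical markers the model emits to request a search (an assumption,
# not taken from the paper).
SEARCH_PATTERN = re.compile(r"<search>(.*?)</search>", re.DOTALL)


def run_agent(
    question: str,
    generate_step: Callable[[str], str],
    web_search: Callable[[str], str],
    max_steps: int = 5,
) -> Tuple[str, List[str]]:
    """Alternate reasoning and searching; return the last step and the queries issued."""
    context = question
    queries: List[str] = []
    step = ""
    for _ in range(max_steps):
        step = generate_step(context)
        context += "\n" + step
        match = SEARCH_PATTERN.search(step)
        if match is None:
            break  # no knowledge gap flagged, so treat this step as the answer
        query = match.group(1).strip()
        queries.append(query)
        # Feed the retrieved text back so the next reasoning step can use it.
        context += "\n[results] " + web_search(query)
    return step, queries


def toy_model(context: str) -> str:
    """Stand-in for an LRM: search once, then answer from the retrieved text."""
    if "[results]" not in context:
        return "I do not know this yet. <search>capital of France</search>"
    return "Answer: " + context.rsplit("[results] ", 1)[-1]


def toy_search(query: str) -> str:
    """Stand-in for a search engine backed by a tiny lookup table."""
    return {"capital of France": "Paris"}.get(query, "no results")


answer, queries = run_agent("What is the capital of France?", toy_model, toy_search)
print(answer, queries)
```

In this toy run the model issues one search and then answers from the retrieved text. The point of the design is that the model, not a fixed pipeline, decides when retrieval happens.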

The WebThinker framework operates in two primary modes: Problem-Solving Mode and Report Generation Mode. In Problem-Solving Mode, WebThinker addresses complex tasks using the Deep Web Explorer tool, which the LRM can invoke during reasoning. In Report Generation Mode, the LRM autonomously produces detailed reports and employs an assistant LLM to implement report-writing tools. To improve the LRM's use of research tools via RL, WebThinker generates diverse reasoning trajectories by applying its framework to an extensive set of complex reasoning and report generation datasets, including SuperGPQA, WebWalkerQA, OpenThoughts, NaturalReasoning, NuminaMath, and Glaive. For each query, the initial LRM produces multiple distinct trajectories.
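The sketch below illustrates how several trajectories for one query could be turned into preference pairs and scored with the standard DPO loss. The article does not say how trajectories are compared, so the ranking rule here (correct answers first, then fewer tool calls) is purely an assumption, as are the `Trajectory`, `build_pairs`, and `dpo_loss` names. Only the loss formula itself is the usual DPO objective for a single pair.

```python
import math
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple


@dataclass
class Trajectory:
    text: str
    correct: bool    # assumed label: did the final answer match the reference?
    tool_calls: int  # assumed statistic: number of tool invocations


def build_pairs(trajectories: List[Trajectory]) -> List[Tuple[Trajectory, Trajectory]]:
    """Return (chosen, rejected) pairs under the assumed ranking rule."""

    def rank(t: Trajectory) -> Tuple[bool, int]:
        # Higher is better: correctness first, then fewer tool calls.
        return (t.correct, -t.tool_calls)

    pairs = []
    for a, b in combinations(trajectories, 2):
        if rank(a) == rank(b):
            continue  # no preference signal between equally ranked trajectories
        chosen, rejected = (a, b) if rank(a) > rank(b) else (b, a)
        pairs.append((chosen, rejected))
    return pairs


def dpo_loss(
    logp_chosen: float,
    logp_rejected: float,
    ref_logp_chosen: float,
    ref_logp_rejected: float,
    beta: float = 0.1,
) -> float:
    """Standard DPO loss for one pair, given sequence log-probabilities."""
    margin = beta * (
        (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    )
    # -log(sigmoid(margin)); fine for the small margins used in this sketch.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))


samples = [
    Trajectory("wrong answer, 1 search", correct=False, tool_calls=1),
    Trajectory("right answer, 4 searches", correct=True, tool_calls=4),
    Trajectory("right answer, 2 searches", correct=True, tool_calls=2),
]
pairs = build_pairs(samples)
```

One plausible reading of "iterative online" is that each training round regenerates trajectories with the updated policy, rebuilds pairs like these, and repeats. That reading is an interpretation, not a detail stated in the article.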

On complex problem-solving, the WebThinker-32B-Base model outperforms prior methods like Search-o1 across all benchmarks, with improvements of 22.9% on WebWalkerQA and 20.4% on HLE. In scientific report generation tasks, WebThinker achieves the highest overall score of 8.0, surpassing RAG baselines and advanced deep research systems, including Gemini-Deep Research (7.9). The framework also adapts well across LRM backbones, with R1-based WebThinker models outperforming direct reasoning and standard RAG baselines. With the DeepSeek-R1-7B backbone, it achieves relative improvements of 174.4% on GAIA and 422.6% on WebWalkerQA compared to direct generation, and 82.9% on GAIA and 161.3% on WebWalkerQA over standard RAG implementations.
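Relative improvements above 100% can look surprising, so a short worked calculation may help. The function below computes a relative improvement as the gain divided by the baseline score. The scores in the example are made up for illustration and are not taken from the paper.

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Percentage gain of `new` over `baseline`, relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (new - baseline) / baseline * 100.0


# Hypothetical scores chosen only to illustrate the arithmetic.
hypothetical_baseline = 10.0
hypothetical_new = 27.44
gain = round(relative_improvement(hypothetical_baseline, hypothetical_new), 1)
print(gain)
```

With these made-up numbers, a rise from 10.0 to 27.44 is a 174.4% relative improvement, meaning the new score is about 2.7 times the baseline. A relative gain is not the same as a gain in absolute percentage points, which is why a small baseline can produce a very large relative figure.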

In conclusion, the researchers introduced WebThinker, which provides LRMs with deep research capabilities and addresses their limitations in knowledge-intensive real-world tasks such as complex reasoning and scientific report generation. The framework enables LRMs to autonomously explore the web and produce comprehensive outputs through a continuous reasoning process. The findings highlight WebThinker's potential to advance the deep research capabilities of LRMs, creating more powerful intelligent systems capable of addressing complex real-world challenges. Future work includes incorporating multimodal reasoning capabilities, exploring advanced tool learning mechanisms, and investigating GUI-based web exploration.



The post This AI Paper Introduces WebThinker: A Deep Research Agent that Empowers Large Reasoning Models (LRMs) for Autonomous Search and Report Generation appeared first on MarkTechPost.
