MarkTechPost@AI, January 5
FutureHouse Researchers Propose Aviary: An Extensible Open-Source Gymnasium for Language Agents

 

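Aviary is an open-source training platform for language agents that aims to address the challenges AI models face on complex scientific tasks. It introduces the concept of language decision processes (LDPs), which model tasks as partially observable Markov decision processes and so let agents handle multi-step reasoning tasks. Aviary includes several environments, among them molecular cloning, scientific literature question answering, and protein stability engineering, providing a valuable platform for training and evaluating language agents. The platform also uses techniques such as expert iteration (EI) and majority voting to improve model accuracy and efficiency. The results show that non-frontier open-source models perform well on these tasks at lower cost, opening new possibilities for large-scale scientific applications.

🔬 Aviary introduces language decision processes (LDPs), which model tasks as partially observable Markov decision processes, enabling language agents to handle complex, multi-step reasoning tasks.

🧪 Aviary includes several environments, among them molecular cloning, scientific literature question answering, and protein stability engineering. These environments simulate real-world research challenges and help train and evaluate language agents.

💡 Aviary uses expert iteration (EI) to refine agents iteratively on high-quality trajectories, and uses majority voting to improve accuracy without excessive computational overhead.

📚 The research shows that non-frontier open-source models such as Llama-3.1-8B-Instruct perform strongly in Aviary environments, matching or even exceeding frontier models at significantly lower inference cost.

🎯 By providing tool integration, such as sequence annotators and literature retrieval systems, Aviary strengthens language agents in practical applications and lays a foundation for AI-driven scientific exploration.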

Artificial intelligence (AI) has made significant strides in developing language models capable of solving complex problems. However, applying these models to real-world scientific challenges remains difficult. Many AI agents struggle with tasks requiring multiple cycles of observation, reasoning, and action. Moreover, existing models often lack the ability to integrate tools effectively or maintain consistency in multi-step reasoning. These issues are particularly pressing in scientific domains, where tasks demand precision, adaptability, and computational efficiency. Addressing these problems requires a flexible and practical framework for training and deploying language agents.

Introducing Aviary: An Extensible Open-Source Gymnasium

A team of researchers from FutureHouse Inc., the University of Rochester, and the Francis Crick Institute has introduced Aviary, an open-source gymnasium for language agents. Aviary addresses the limitations of existing frameworks by introducing language decision processes (LDPs), which model tasks as partially observable Markov decision processes grounded in natural language. This approach enables language agents to effectively handle complex, multi-step reasoning tasks.
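To make the idea of a task framed as a partially observable decision process in natural language more concrete, here is a minimal sketch in Python. It is not Aviary's API: `CountingEnv`, `BisectionAgent`, and `rollout` are hypothetical names invented for illustration, and a simple rule-based agent stands in for a language model. The point is only the shape of the loop: the environment hides its state, exchanges text with the agent, and the agent carries its own internal state across cycles of observation, reasoning, and action.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class CountingEnv:
    """Hypothetical toy environment: the hidden state is a secret integer,
    and the agent only ever sees text hints (partial observability)."""
    secret: int = 7
    max_steps: int = 10
    steps: int = field(default=0, init=False)

    def reset(self) -> str:
        self.steps = 0
        return "Guess an integer between 1 and 10."

    def step(self, action: str) -> Tuple[str, float, bool]:
        """Apply a text action and return (observation, reward, done)."""
        self.steps += 1
        guess = int(action)
        if guess == self.secret:
            return "Correct.", 1.0, True
        hint = "Too low." if guess < self.secret else "Too high."
        return hint, 0.0, self.steps >= self.max_steps


class BisectionAgent:
    """Stand-in for a language agent that keeps internal state between steps."""

    def __init__(self) -> None:
        self.low, self.high = 1, 10
        self.last_guess: Optional[int] = None

    def act(self, observation: str) -> str:
        # Update the belief about the hidden state from the text observation.
        if observation == "Too low.":
            self.low = self.last_guess + 1
        elif observation == "Too high.":
            self.high = self.last_guess - 1
        self.last_guess = (self.low + self.high) // 2
        return str(self.last_guess)


def rollout(env: CountingEnv, agent: BisectionAgent) -> Tuple[List[tuple], float]:
    """Run one episode and return the trajectory and the total reward."""
    obs = env.reset()
    trajectory, total, done = [], 0.0, False
    while not done:
        action = agent.act(obs)
        next_obs, reward, done = env.step(action)
        trajectory.append((obs, action, reward))
        total += reward
        obs = next_obs
    return trajectory, total
```

Calling `rollout(CountingEnv(), BisectionAgent())` returns the list of (observation, action, reward) steps together with the episode's total reward, which is the kind of trajectory a training method can later learn from.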

Aviary includes five environments, three of which are designed for advanced scientific tasks:

    Molecular Cloning: Manipulating DNA constructs using tools for sequence annotation and protocol planning.
    Scientific Literature QA: Retrieving and analyzing scientific literature to answer detailed research questions.
    Protein Stability Engineering: Proposing protein mutations to improve stability with the help of computational and biochemical tools.

These tasks make Aviary a valuable platform for training and evaluating language agents in real-world scenarios requiring reasoning, tool integration, and iterative learning.

Technical Insights and Benefits of Aviary

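Aviary uses a stochastic computation graph framework to model language agents, enabling flexible and efficient optimization. Key features include expert iteration (EI), which iteratively refines agents on high-quality trajectories; majority voting, which combines multiple outputs to improve accuracy; and tool integration, such as sequence annotators and literature retrieval systems.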
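Expert iteration, which the summary above describes as iteratively refining agents on high-quality trajectories, can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration: the action names, the `TARGET` reward rule, and the count-based update (a crude stand-in for fine-tuning a model on the retained trajectories) are hypothetical and are not taken from the Aviary code or paper. The loop shows only the general pattern: sample rollouts from the current policy, keep the rewarded ones, and train on what was kept.

```python
import random
from collections import Counter

ACTIONS = ["annotate", "search", "mutate"]  # hypothetical tool names
TARGET = "search"  # hypothetical: the only action this toy task rewards


def sample_action(weights, rng):
    """Sample one action from the current (unnormalized) policy weights."""
    return rng.choices(ACTIONS, weights=[weights[a] for a in ACTIONS], k=1)[0]


def reward(action):
    return 1.0 if action == TARGET else 0.0


def expert_iteration(rounds=5, rollouts_per_round=50, seed=0):
    rng = random.Random(seed)
    weights = {a: 1.0 for a in ACTIONS}  # start from a uniform policy
    for _ in range(rounds):
        # 1. Generate rollouts with the current policy.
        samples = [sample_action(weights, rng) for _ in range(rollouts_per_round)]
        # 2. Keep only the high-reward rollouts.
        kept = [a for a in samples if reward(a) > 0]
        # 3. "Fine-tune" on the kept rollouts; here that just means adding
        #    their counts to the policy weights.
        for action, count in Counter(kept).items():
            weights[action] += count
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}
```

In this toy setting, `expert_iteration()` returns a normalized policy that concentrates on the rewarded action, since each round's filtered samples pull the weights further toward it.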

The researchers show that non-frontier, open-source models like Llama-3.1-8B-Instruct can achieve performance comparable to or better than frontier models (e.g., Claude 3.5 Sonnet) in these environments. Additionally, these models operate at significantly lower inference costs, making them accessible for large-scale scientific applications.
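Majority voting, named in the summary above as one of the techniques used to improve accuracy, is simple to sketch: sample several answers to the same question and return the most common one. The helper below is a generic illustration, not Aviary's implementation; the function name `majority_vote` and the lowercase-and-strip normalization are assumptions.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer after light normalization.

    Ties go to the answer that appeared first; an empty input returns None.
    """
    normalized = [a.strip().lower() for a in answers if a.strip()]
    if not normalized:
        return None
    return Counter(normalized).most_common(1)[0][0]
```

For example, `majority_vote(["A", "b", "a ", "B", "a"])` treats the three variants of "a" as the same answer and returns it. Voting over several samples from a small model is one way such a model can remain cheaper than a single call to a much larger one.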

Results and Insights

Aviary-trained agents demonstrate strong performance in these environments; the specific results are reported in the paper.

Conclusion

Aviary represents a thoughtful advancement in the development of language AI agents. By demonstrating that open-source, non-frontier models can excel in scientific tasks, Aviary opens new possibilities for accessible and cost-effective AI research. Its open-source design encourages collaboration, enabling researchers and developers to refine and extend its applications further.

With tools and training methods tailored for real-world challenges, Aviary sets a benchmark for how language agents can address complex tasks. It provides a compelling framework for advancing AI-driven scientific exploration and practical problem-solving.


Check out the Paper and GitHub Page. All credit for this research goes to the researchers of this project.


The post FutureHouse Researchers Propose Aviary: An Extensible Open-Source Gymnasium for Language Agents appeared first on MarkTechPost.
