MarkTechPost@AI, February 26
Convergence Releases Proxy Lite: A Mini, Open-Weights Version of Proxy Assistant Performing Pretty Well on UI Navigation Tasks

 

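Proxy Lite is a mini, open-weights version of the Proxy assistant released by Convergence, intended to bring capable web automation to the open-source community. It is a 3B-parameter vision-language model that balances efficiency and reliability. Its design is transparent and open, encouraging the community to explore, modify, and improve its framework. It combines a vision-language model with a browser interaction system, which allows fine-grained control over browser tasks. It suits web tasks ranging from data extraction to complex navigation while keeping resource usage low. By mirroring human reasoning, Proxy Lite strikes a balance between simplicity and sophistication.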

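🚀 Proxy Lite is a 3B-parameter model based on Qwen2.5-VL-3B-Instruct that aims to balance performance and efficiency. It generates responses in three phases: observing the state of the web page, reasoning about the next action, and executing a precise command. This improves task reliability and generalization.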

🌐 Proxy Lite scored 72.4% overall on the WebVoyager benchmark. Its success rates reached 87.8% on Allrecipes, 70.0% on Amazon, and above 80% on Apple and GitHub, showing reliability across different environments.

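💡 Proxy Lite can be integrated directly into command-line interfaces and Streamlit applications, which makes deployment easy even for those with limited technical resources. Its modular design invites collaboration and ongoing development, making it a valuable resource for both academic research and commercial projects.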

In today’s digital landscape, automating interactions with web content remains a nuanced challenge. Many existing solutions are resource-intensive and tailored for narrowly defined tasks, which limits their broader applicability. Developers often face the dual challenge of balancing computational efficiency with the need for a model that can generalize well across diverse websites. Traditional systems, heavily reliant on prompt-prediction, often lack the reflective reasoning required for the unpredictable nature of web environments. Additionally, proprietary models typically restrict access to detailed inner workings, making it difficult for researchers and practitioners in the open-source community to build on state-of-the-art methods. These persistent issues underline the importance of developing an automation tool that is both efficient and accessible.

Convergence has introduced Proxy Lite: a mini, open-weights version of their well-regarded Proxy assistant. This 3B parameter Vision-Language Model is designed to extend sophisticated web automation capabilities to the open-source community. Rather than promising extraordinary feats, Proxy Lite aims to offer a balanced approach that marries efficiency with reliability. Its architecture builds on a solid foundation, allowing it to perform a variety of web-based tasks without imposing heavy computational demands.

What makes Proxy Lite notable is its transparent design and open-weights approach. This encourages the community to explore, modify, and improve upon its framework. With an integrated system for Vision-Language Model (VLM) and browser interactions, Proxy Lite allows for nuanced control over browser tasks. The model’s configuration supports practical applications ranging from routine data extraction to more complex navigational tasks, all while keeping resource usage in check.

Technical Aspects and Their Benefits

At its core, Proxy Lite is a 3B-parameter model built on Qwen2.5-VL-3B-Instruct, a choice that reflects a commitment to balancing performance with efficiency. The model generates each response in three phases: it observes the current state of the web page, reasons about the next action, and then executes a precise command.

This structured approach not only improves task reliability but also facilitates the model’s ability to generalize across different types of web interactions. By mirroring human-like reasoning processes, Proxy Lite manages to strike a balance between simplicity and sophistication. Moreover, its design supports a straightforward integration into both command-line interfaces and Streamlit applications, making deployment accessible even for those with modest technical resources.

Performance Insights and Practical Evaluations

Proxy Lite was evaluated on the WebVoyager benchmark, a comprehensive set of tasks designed to test web automation capabilities. The model achieved an overall score of 72.4%, a strong result for an open-weights model of its size. Per-site results include success rates of 87.8% on Allrecipes, 70.0% on Amazon, and above 80% on Apple and GitHub.

These findings reflect a balanced performance, with Proxy Lite efficiently managing tasks without the overhead typically associated with larger, proprietary models. The comprehensive evaluation not only underscores its current utility but also points to potential enhancements through community-driven refinements.
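Benchmark scores of this kind are usually success rates: tasks completed divided by tasks attempted, reported per site and overall. The sketch below shows that arithmetic on made-up outcomes. The site names and pass/fail values are placeholders, not Proxy Lite's WebVoyager results, and the source does not say how its overall figure is aggregated.

```python
from collections import defaultdict


def success_rates(outcomes):
    """Return (per_site, overall) success rates in percent.

    outcomes: iterable of (site, succeeded) pairs, one per attempted task.
    The overall rate pools all tasks, so larger sites weigh more.
    """
    passed = defaultdict(int)
    total = defaultdict(int)
    for site, ok in outcomes:
        total[site] += 1
        passed[site] += int(ok)
    if not total:
        raise ValueError("no outcomes given")
    per_site = {site: 100.0 * passed[site] / total[site] for site in total}
    overall = 100.0 * sum(passed.values()) / sum(total.values())
    return per_site, overall


if __name__ == "__main__":
    # Placeholder outcomes for two hypothetical sites.
    toy = [
        ("site_a", True), ("site_a", True), ("site_a", True), ("site_a", False),
        ("site_b", True), ("site_b", False),
    ]
    per_site, overall = success_rates(toy)
    for site, rate in per_site.items():
        print(f"{site}: {rate:.1f}%")
    print(f"overall: {overall:.1f}%")
```

Note that a pooled overall rate can differ from the plain average of the per-site rates when sites have different numbers of tasks, so the two should not be compared directly.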

Conclusion

Proxy Lite emerges as a thoughtfully designed tool in the field of web automation. By addressing key challenges—such as resource constraints, generalization, and transparency—it offers a practical solution for automating routine online tasks. Its open-weights approach and modular design invite collaboration and ongoing development, providing a valuable resource for both academic research and commercial projects.



The post Convergence Releases Proxy Lite: A Mini, Open-Weights Version of Proxy Assistant Performing Pretty Well on UI Navigation Tasks appeared first on MarkTechPost.
