MarkTechPost@AI 01月13日
Researchers from Fudan University and Shanghai AI Lab Introduce DOLPHIN: A Closed-Loop Framework for Automating Scientific Research with Iterative Feedback

DOLPHIN is a closed-loop auto-research framework developed by Fudan University and the Shanghai AI Laboratory. It covers the entire scientific research process, improves efficiency and accuracy, and performs well across multiple tasks.

🎯 DOLPHIN is a closed-loop framework spanning the whole research process: it automatically generates ideas, executes experiments, and incorporates feedback.

💡 It works in three stages: retrieving and ranking relevant papers, generating and refining research ideas, and verifying them experimentally.

🎉 DOLPHIN performs strongly on multiple benchmark tasks, improves efficiency, and shows clear gains from iterative refinement.

Artificial Intelligence (AI) is revolutionizing how discoveries are made, creating a new scientific paradigm by accelerating processes such as data analysis, computation, and idea generation. Researchers now aim to build systems that can complete the full research cycle without human involvement. Such systems could raise productivity and bring difficult scientific challenges within closer reach.

Because scientific research depends heavily on human effort, hypothesis generation, experiment execution, and data validation are often inefficient. Progress is further slowed because ideas are rarely refined through iterative feedback from experiments. Closing this feedback loop matters: it leads to faster and more accurate scientific findings.

Several research environments have been developed to partially automate the research process. Tools such as GPT-researcher and AI-Scientist can break tasks into simpler subtasks, help generate ideas, and perform some computation. However, no integrated framework yet feeds experimental results back into the research cycle. Moreover, most current tools rely on small datasets or predefined workflows, limiting their ability to tackle open-ended research tasks.

Fudan University and the Shanghai Artificial Intelligence Laboratory have developed DOLPHIN, a closed-loop auto-research framework covering the entire scientific research process. The system generates ideas, executes experiments, and incorporates feedback to refine subsequent iterations. DOLPHIN ensures higher efficiency and accuracy by ranking task-specific literature and employing advanced debugging processes. This comprehensive approach distinguishes it from other tools and positions it as a pioneering system for autonomous research.

The methodology of DOLPHIN is divided into three interconnected stages. First, the system retrieves research papers on a topic and ranks them by relevance to the task and topic attributes, selecting the most applicable references. Using these references, DOLPHIN generates novel, independent research ideas. To remove redundancy, each idea is embedded with a sentence-transformer model, and ideas whose embeddings have high cosine similarity to earlier ones are filtered out.
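The redundancy filter can be sketched roughly as follows. This is a minimal illustration, not DOLPHIN's actual code: the embedding vectors are assumed to come from a sentence-transformer, and the 0.8 similarity threshold is illustrative rather than the paper's exact value.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def filter_redundant(ideas, embeddings, threshold=0.8):
    """Keep an idea only if it is not too similar to any idea kept so far.

    `ideas` is a list of idea strings; `embeddings` holds one vector per
    idea (in DOLPHIN these would come from a sentence-transformer model).
    """
    kept, kept_vecs = [], []
    for idea, vec in zip(ideas, embeddings):
        if all(cosine_similarity(vec, kv) < threshold for kv in kept_vecs):
            kept.append(idea)
            kept_vecs.append(vec)
    return kept
```

Filtering greedily in generation order keeps the first of any near-duplicate pair, so earlier (typically higher-ranked) ideas survive.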

Once ideas are finalized, DOLPHIN transitions to experimental verification. It automatically generates and debugs code using an exception-traceback-guided process. This involves analyzing error messages and their related code structure to make corrections efficiently. Experiments proceed iteratively, with results categorized as improvements, maintenance, or declines. Successful outcomes are incorporated into future cycles, enhancing idea generation quality over time.
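The core of exception-traceback-guided debugging is turning a raw failure into a structured error report that a language model can act on. The sketch below shows one way to do this; the function names and report format are illustrative assumptions, not DOLPHIN's actual interface.

```python
import traceback

def run_with_traceback(code_str):
    """Execute candidate experiment code; on failure, return a structured
    error report (exception type, message, offending line and snippet)
    that could be fed back to a model to propose a fix.
    """
    try:
        exec(compile(code_str, "<experiment>", "exec"), {})
        return {"ok": True, "error": None}
    except Exception as exc:
        # Walk the traceback to the deepest frame inside the generated code.
        line_no = None
        for frame, lineno in traceback.walk_tb(exc.__traceback__):
            if frame.f_code.co_filename == "<experiment>":
                line_no = lineno
        lines = code_str.splitlines()
        snippet = lines[line_no - 1] if line_no else ""
        return {
            "ok": False,
            "error": f"{type(exc).__name__}: {exc}",
            "line": line_no,
            "snippet": snippet,
        }
```

Pinpointing the exact failing line and its source text, rather than passing the whole traceback verbatim, gives the correction step a focused target, in the spirit of analyzing "error messages and their related code structure".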

DOLPHIN was tested on three benchmark tasks: image classification using CIFAR-100, 3D point classification with ModelNet40, and sentiment classification using SST-2. In image classification, DOLPHIN improved baseline models like WideResNet by up to 0.8%, achieving a top-1 accuracy of 82.0%. For 3D point classification, the system outperformed human-designed methods such as PointNet, achieving an overall accuracy of 93.9%, a 2.9% improvement over baseline models. In sentiment classification, DOLPHIN improved accuracy by 1.5%, closing the gap between BERT-base and BERT-large performance. These results show that DOLPHIN can produce ideas on par with state-of-the-art methods across diverse datasets and tasks.

An interesting feature of DOLPHIN is that it improves efficiency across research iterations. At iteration one, it produced 20 ideas, of which 19 were judged novel, at an average cost of $0.184 per idea. Through the third iteration, the closed loop raised both idea quality and experimental execution rates: the debugging success rate rose from 33.3% to 50.0% once structured feedback on earlier errors was incorporated. This iterative improvement underscores the robustness of DOLPHIN's design in automating and optimizing the research process.

DOLPHIN represents a significant leap forward in AI-driven research by addressing key inefficiencies in traditional scientific workflows. Its ability to integrate literature review, idea generation, experimentation, and feedback into a seamless cycle demonstrates its potential for advancing scientific discovery. The framework improves efficiency and achieves results comparable to or exceeding those of human-designed systems. This positions DOLPHIN as a promising tool for addressing complex scientific challenges and fostering innovation in various domains.


Check out the Paper and Project Page. All credit for this research goes to the researchers of this project.


