MarkTechPost@AI · August 6, 2024
Allen Institute for AI (AI2) Released a New Bundle of OLMo 1B and 7B Assets

The Allen Institute for Artificial Intelligence (AI2) has released OLMo (Open Language Model), an open-source language model framework designed to advance research on language models. The framework provides comprehensive data, training code, models, and evaluation tools, making collaborative AI research easier for researchers and academics. The initial release includes multiple 7B-parameter models and a 1B-parameter model, all trained on at least 2 trillion tokens.

💥 **The OLMo framework is designed to empower the AI community to explore a broader range of research questions.** It lets researchers investigate how specific pretraining datasets affect downstream performance and explore new pretraining methods. This open approach supports a deeper understanding of language models and their potential instabilities, contributing to the collective advancement of AI science.

🚀 **Each OLMo model ships with a suite of resources, including the full training data, model weights, training code, logs, and metrics.** The framework also provides 500+ checkpoints per base model, adapted versions of the 7B model (OLMo-7B-Instruct and OLMo-7B-SFT), evaluation code, and fine-tuning capabilities. All components are released under the Apache 2.0 License, ensuring broad access for the research community.

📊 **AI2 benchmarked OLMo against other open and partially open models, including EleutherAI’s Pythia Suite, MosaicML’s MPT models, TII’s Falcon models, and Meta’s Llama series.** The evaluation results show that OLMo 7B is competitive with popular models such as Llama 2, with comparable performance on many generative and reading comprehension tasks while slightly lagging on some question-answering tasks.

🗓️ **AI2 has implemented a structured release process for OLMo and its associated tools.** Regular updates and new asset releases are communicated through templated release notes shared on social media, the AI2 website, and a newsletter. This approach keeps users informed about the latest developments across the OLMo ecosystem, including Dolma and other related tools.

📈 **The July 2024 release of OLMo brought significant improvements to both the 1B and 7B models.** OLMo 1B July 2024 gained 4.4 points on HellaSwag, among other evaluation improvements, thanks to an enhanced version of the Dolma dataset and staged training. Similarly, OLMo 7B July 2024 uses the newest Dolma dataset and a two-stage curriculum, consistently adding 2-3 points of performance.

🏆 **The OLMo release is only the beginning of AI2’s ambitious plans for open language models.** AI2 is working on a range of model sizes, modalities, datasets, safety measures, and evaluations to round out the OLMo family, and invites the AI community to collaborate on building the world’s best open language model.

💡 **OLMo provides a more complete and capable toolkit, opening up more possibilities for language model research and applications.**

The Allen Institute for Artificial Intelligence (AI2) has taken a significant step in advancing open-source language models with the launch of OLMo (Open Language Model). This framework provides researchers and academics with comprehensive access to data, training code, models, and evaluation tools, fostering collaborative research in the field of AI. The initial release includes multiple variants of 7B-parameter models and a 1B-parameter model, all trained on at least 2 trillion tokens.

The OLMo framework is designed to empower the AI community to explore a wider range of research questions. It allows for investigating the impact of specific pretraining data subsets on downstream performance and exploring new pretraining methods. This open approach enables a deeper understanding of language models and their potential instabilities, contributing to the collective advancement of AI science.

Each OLMo model comes with a suite of resources, including full training data, model weights, training code, logs, and metrics. The framework also provides 500+ checkpoints per base model, adapted versions of the 7B model (OLMo-7B-Instruct and OLMo-7B-SFT), evaluation code, and fine-tuning capabilities. All components are released under the Apache 2.0 License, ensuring broad accessibility for the research community.
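
To make the resource list concrete, here is a minimal sketch of loading one of the released models through the Hugging Face transformers library. The allenai/OLMo-7B-0724-hf model ID and the revision name used for an intermediate checkpoint are assumptions for illustration, not details stated in the release; the exact identifiers should be taken from AI2’s release notes.

```python
# Minimal sketch: loading an OLMo model with Hugging Face transformers.
# The Hub ID and the revision below are assumed for illustration only.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-7B-0724-hf"  # assumed ID for OLMo 7B July 2024

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Because AI2 publishes 500+ intermediate checkpoints per base model, an
# earlier training snapshot can in principle be selected via `revision`,
# e.g. a branch named after a training step (hypothetical name):
# model = AutoModelForCausalLM.from_pretrained(model_id, revision="step1000-tokens4B")

inputs = tokenizer("Language modeling is ", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

On constrained hardware, the same `from_pretrained` call also accepts a lower-precision `torch_dtype` to reduce memory use.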

In developing OLMo, AI2 benchmarked against other open and partially open models, including EleutherAI’s Pythia Suite, MosaicML’s MPT models, TII’s Falcon models, and Meta’s Llama series. The evaluation results show that OLMo 7B is competitive with popular models like Llama 2, demonstrating comparable performance on many generative and reading comprehension tasks, while slightly lagging in some question-answering tasks.

AI2 has implemented a structured release process for OLMo and associated tools. Regular updates and new asset roll-outs are communicated through templated release notes shared on social media, the AI2 website, and via newsletter. This approach ensures that users stay informed about the latest developments in the OLMo ecosystem, including Dolma and other related tools.

The July 2024 release of OLMo brought significant improvements to both the 1B and 7B models. OLMo 1B July 2024 showed a 4.4-point increase on HellaSwag, among other evaluation improvements, thanks to an enhanced version of the Dolma dataset and staged training. Similarly, OLMo 7B July 2024 utilized the newest Dolma dataset and employed a two-stage curriculum, consistently adding 2-3 points of performance improvement.

Earlier releases, such as OLMo 7B April 2024 (formerly OLMo 7B 1.7), featured extended context length from 2048 to 4096 tokens and training on the Dolma 1.7 dataset. This version outperformed Llama 2-7B on MMLU and approached Llama 2-13B’s performance, even surpassing it on GSM8K. These incremental improvements demonstrate AI2’s commitment to continually enhancing the OLMo framework and models.
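
As a quick way to see that configuration change, the context window of a published checkpoint can be read directly from its model config. The allenai/OLMo-7B-0424-hf identifier below is an assumed Hugging Face Hub ID for OLMo 7B April 2024, used here only for illustration.

```python
# Sketch: inspecting the configured context window of an OLMo release.
# "allenai/OLMo-7B-0424-hf" is an assumed Hub ID; verify it against
# AI2's release notes before relying on it.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("allenai/OLMo-7B-0424-hf")
print(config.max_position_embeddings)  # expected to be 4096 for the April 2024 release
```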

The OLMo release marks just the beginning of AI2’s ambitious plans for open language models. Work is already underway on various model sizes, modalities, datasets, safety measures, and evaluations for the OLMo family. AI2 aims to collaboratively build the world’s best open language model, inviting the AI community to participate in this innovative initiative.

In a nutshell, AI2 has launched OLMo, an open-source language model framework, providing researchers with comprehensive access to data, code, and evaluation tools. The initial release includes 7B and 1B parameter models trained on 2+ trillion tokens. OLMo aims to foster collaborative AI research, offering resources like full training data, model weights, and 500+ checkpoints per base model. Benchmarked against other open models, OLMo 7B shows competitive performance. AI2 has implemented a structured release process, with recent updates bringing significant improvements. This initiative marks the beginning of AI2’s ambitious plans to collaboratively build the world’s best open language model.


Check out the Details, OLMo 1B July 2024, OLMo 7B July 2024, OLMo 7B July 2024 SFT, and OLMo 7B July 2024 Instruct. All credit for this research goes to the researchers of this project.

