MarkTechPost@AI, July 19, 2024
Mistral AI and NVIDIA Collaborate to Release Mistral NeMo: A 12B Open Language Model Featuring 128k Context Window, Multilingual Capabilities, and Tekken Tokenizer

 

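Mistral AI, in collaboration with NVIDIA, has released Mistral NeMo, an open 12-billion-parameter language model with a 128,000-token context window, multilingual capabilities, and the Tekken tokenizer. The model is trained on function calling and performs well in major languages including English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, and Hindi. Tekken, which is based on Tiktoken, compresses natural language text and source code more efficiently than its predecessors. Mistral NeMo is also instruction-tuned, which improves its ability to follow precise instructions, reason, handle multi-turn conversations, and generate accurate code.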

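🤔 Mistral NeMo is an open 12-billion-parameter language model developed by Mistral AI in collaboration with NVIDIA. Its 128,000-token context window lets it process large amounts of text, and it has strong multilingual capabilities. The model supports major languages including English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, and Hindi, which makes it well suited to global applications.

🤖 Mistral NeMo uses the Tekken tokenizer, which is based on Tiktoken and compresses natural language text and source code more efficiently than its predecessors. Tekken is approximately 30% more efficient at compressing source code and several major languages, and it outperforms the Llama 3 tokenizer at compressing text for about 85% of all languages.

🚀 Mistral NeMo is instruction-tuned, which improves its ability to follow precise instructions, reason, handle multi-turn conversations, and generate accurate code. These improvements matter for applications that require high interactivity and accuracy, such as customer service bots, coding assistants, and interactive educational tools.

💪 Mistral NeMo's performance has been rigorously evaluated, and it consistently shows higher accuracy and efficiency than other leading models. The model weights are hosted on HuggingFace for developers and researchers. Mistral NeMo can also be accessed via Mistral Inference and adapted using Mistral Finetune, giving flexible options for a range of use cases.

🤝 The collaboration between Mistral AI and NVIDIA shows the potential of joint efforts to drive technological progress and make cutting-edge AI more widely available. The release of Mistral NeMo gives researchers and enterprises a powerful tool for developing and deploying more advanced AI solutions.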

In collaboration with NVIDIA, the Mistral AI team has unveiled Mistral NeMo, a 12-billion-parameter model. Released under the Apache 2.0 license, Mistral NeMo is designed to be a high-performance, multilingual model with a context window of up to 128,000 tokens. This context length is a significant advancement, allowing the model to take in and work with much larger inputs in a single pass than its predecessors. The team has released two variants:

    Mistral-Nemo-Instruct-2407
    Mistral-Nemo-Base-2407
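
To make the 128,000-token context window concrete, here is a minimal sketch of checking whether a prompt fits and of splitting an over-long token sequence into windows. Only the 128,000 figure comes from the article. The function names, the `reserve_for_output` parameter, and the use of plain integers as stand-ins for token IDs are illustrative assumptions, not part of any Mistral API.

```python
from typing import List

CONTEXT_WINDOW = 128_000  # token limit quoted in the article


def fits_in_context(num_prompt_tokens: int, reserve_for_output: int = 1_000) -> bool:
    """Return True if the prompt plus the reserved output budget fits in the window."""
    return num_prompt_tokens + reserve_for_output <= CONTEXT_WINDOW


def split_into_windows(token_ids: List[int], reserve_for_output: int = 1_000) -> List[List[int]]:
    """Split a token sequence into consecutive chunks that each leave room for output."""
    budget = CONTEXT_WINDOW - reserve_for_output
    if budget <= 0:
        raise ValueError("reserve_for_output must be smaller than the context window")
    return [token_ids[i:i + budget] for i in range(0, len(token_ids), budget)]


if __name__ == "__main__":
    fake_tokens = list(range(300_000))  # stand-in for real token IDs (assumption)
    chunks = split_into_windows(fake_tokens)
    print(len(chunks), [len(chunk) for chunk in chunks])
```

In practice the token count would come from the model's own tokenizer, and how much room to reserve for generated output depends on the application.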

Mistral NeMo stands out for its exceptional reasoning abilities, extensive world knowledge, and high coding accuracy, making it the top performer in its size category. Its architecture is based on standard designs, ensuring it can be easily integrated into any system currently using Mistral 7B. This seamless compatibility is expected to facilitate widespread adoption among researchers and enterprises seeking to leverage cutting-edge AI technology.

The Mistral AI team has released both pre-trained base and instruction-tuned checkpoints. These resources are intended to support the research community and industry professionals in their efforts to explore and implement advanced AI solutions. Mistral NeMo was developed with quantization awareness, enabling FP8 inference without any degradation in performance. This feature ensures the model operates efficiently even with lower precision data representations.
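
As a rough illustration of why FP8 inference matters, the sketch below estimates the storage needed for the weights alone at different precisions. It assumes a nominal 12 billion parameters and one byte per FP8 value. It ignores activations, the KV cache, and runtime overhead, so the figures are back-of-the-envelope estimates rather than measured requirements, and `weight_memory_gb` is a hypothetical helper.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "fp8": 1}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate weight storage in gigabytes (10^9 bytes) for a given precision."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    nominal_params = 12e9  # "12B" taken as a round number (assumption)
    for dtype in ("fp16", "fp8"):
        print(f"{dtype}: ~{weight_memory_gb(nominal_params, dtype):.0f} GB")
```

Under these assumptions, moving from 16-bit to 8-bit weights halves the weight footprint, which is the main practical appeal of FP8 inference.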

A key component of Mistral NeMo’s success is its multilingual capability, making it a versatile tool for global applications. The model has been trained on function calling and is particularly strong in several major languages, including English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, and Hindi. This broad linguistic proficiency aims to democratize access to advanced AI technologies, enabling users from diverse linguistic backgrounds to benefit from its capabilities.
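
Function calling generally means the model emits a structured request that the application then executes. The sketch below shows only the application side: a registry of tools and a dispatcher for a JSON tool call. The JSON shape, the `get_weather` tool, and the `dispatch_tool_call` helper are hypothetical illustrations. They are not Mistral's actual wire format or API, and no model is called.

```python
import json
from typing import Any, Callable, Dict


def get_weather(city: str, unit: str = "celsius") -> str:
    """Hypothetical tool; a real one would query a weather service."""
    return f"Weather in {city}: 21 degrees {unit}"


# Registry mapping tool names to Python callables.
TOOLS: Dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def dispatch_tool_call(raw_call: str) -> Any:
    """Parse a JSON tool call and invoke the matching registered function."""
    call = json.loads(raw_call)
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"Unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))


if __name__ == "__main__":
    # Hand-written stand-in for what a model might emit (assumption).
    model_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
    print(dispatch_tool_call(model_output))
```

Keeping an explicit registry means the application, not the model, decides which functions can be invoked.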

Mistral NeMo’s performance is further enhanced by Tekken, a new tokenizer. Based on Tiktoken, Tekken was trained on more than 100 languages and is significantly more efficient at compressing natural language text and source code than its predecessors. For instance, it is approximately 30% more efficient at compressing source code and several major languages, and it outperforms the Llama 3 tokenizer in compressing text for about 85% of all languages. This increased efficiency is crucial for handling the vast amounts of data required for modern AI applications.
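
The article does not define how tokenizer efficiency is measured. One plausible reading is that a more efficient tokenizer needs fewer tokens for the same text. The sketch below computes two such metrics using toy tokenizers as stand-ins. `char_tokenizer` and `word_tokenizer` are hypothetical placeholders, not Tekken or the Llama 3 tokenizer, and the numbers they produce say nothing about either.

```python
from typing import Callable, List

Tokenizer = Callable[[str], List[str]]


def char_tokenizer(text: str) -> List[str]:
    """Toy baseline: one token per character."""
    return list(text)


def word_tokenizer(text: str) -> List[str]:
    """Toy alternative: one token per whitespace-separated word."""
    return text.split()


def bytes_per_token(tokenize: Tokenizer, text: str) -> float:
    """Average UTF-8 bytes covered by each token (higher means better compression)."""
    tokens = tokenize(text)
    if not tokens:
        raise ValueError("text produced no tokens")
    return len(text.encode("utf-8")) / len(tokens)


def token_savings(baseline: Tokenizer, candidate: Tokenizer, text: str) -> float:
    """Fraction of tokens saved by the candidate relative to the baseline."""
    baseline_count = len(baseline(text))
    if baseline_count == 0:
        raise ValueError("baseline produced no tokens")
    return 1 - len(candidate(text)) / baseline_count


if __name__ == "__main__":
    sample = "def add(a, b): return a + b"
    print(bytes_per_token(char_tokenizer, sample))
    print(bytes_per_token(word_tokenizer, sample))
    print(token_savings(char_tokenizer, word_tokenizer, sample))
```

To compare real tokenizers, the two toy functions would be replaced with wrappers around each tokenizer's encode call, and the metrics averaged over a representative corpus for each language.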

Mistral NeMo’s advanced instruction fine-tuning process distinguishes it from earlier models like Mistral 7B. The fine-tuning and alignment phases have significantly improved the model’s ability to follow precise instructions, reason effectively, handle multi-turn conversations, and generate accurate code. These enhancements are critical for applications requiring high interaction and accuracy, such as customer service bots, coding assistants, and interactive educational tools.

The performance of Mistral NeMo has been rigorously evaluated and compared with other leading models. It consistently demonstrates superior accuracy and efficiency, reinforcing its position as a state-of-the-art AI model. Weights for the base and instruction-tuned models are hosted on HuggingFace, making them readily available for developers and researchers. Additionally, Mistral NeMo can be accessed via Mistral Inference and adapted using Mistral Finetune, providing flexible options for various use cases.

Mistral NeMo is also integrated into NVIDIA’s NIM inference microservice, available through ai.nvidia.com. This integration highlights the collaborative effort between Mistral AI and NVIDIA to push the boundaries of AI technology and deliver robust, scalable solutions to the market.

In conclusion, the release of Mistral NeMo, with its advanced features, including extensive multilingual support, efficient data compression, and superior instruction-following capabilities, positions it as a powerful tool for researchers and enterprises. The collaboration between Mistral AI and NVIDIA exemplifies the potential of joint efforts in driving technological advancements and making cutting-edge AI accessible to a broader audience.


Weights are hosted on HuggingFace for both the Base and the Instruct models. All credit for this research goes to the researchers of this project.

The post Mistral AI and NVIDIA Collaborate to Release Mistral NeMo: A 12B Open Language Model Featuring 128k Context Window, Multilingual Capabilities, and Tekken Tokenizer appeared first on MarkTechPost.
