MarkTechPost@AI · March 12
Reka AI Open Sourced Reka Flash 3: A 21B General-Purpose Reasoning Model that was Trained from Scratch

Reka AI has released Reka Flash 3, a 21-billion-parameter reasoning model built for general conversation, coding support, instruction following, and even function calling. The model was trained on a mix of publicly available and synthetic datasets, with instruction tuning and reinforcement learning via the REINFORCE Leave One-Out (RLOO) method. Reka Flash 3 handles context lengths of up to 32k tokens and includes a "budget forcing" mechanism via designated <reasoning> tags, which lets users cap the model's thinking process at a set number of steps to ensure consistent performance. The model is also well suited to on-device deployment, compressing to 11GB with 4-bit quantization.

💡Reka Flash 3 is a 21-billion-parameter reasoning model built from scratch for general conversation, coding support, instruction following, and function calling, intended as a practical foundation for a wide range of applications.

⏱️Reka Flash 3 handles context lengths of up to 32k tokens, making it possible to process lengthy documents and complex tasks without undue strain. It also includes a "budget forcing" mechanism via designated <reasoning> tags, which lets users cap the model's thinking process at a set number of steps to ensure consistent performance.

💽Reka Flash 3 is well suited to on-device deployment: its full-precision size is 39GB (fp16), and 4-bit quantization compresses it further to 11GB. This flexibility allows smoother local deployment compared with larger, more resource-intensive models.

🌐Reka Flash 3's multilingual capability is reflected in an 83.2 COMET score on WMT'23, indicating a reasonable level of support for non-English inputs despite its primary focus on English.

In today’s dynamic AI landscape, developers and organizations face several practical challenges. High computational demands, latency issues, and limited access to truly adaptable open-source models often constrain progress. Many existing solutions require expensive cloud infrastructures or are too large for on-device applications, leaving a gap for models that are both efficient and flexible. Addressing these challenges is key to enabling more accessible, custom AI solutions that can be tailored for diverse applications without overburdening resources.

Reka AI has introduced Reka Flash 3—a reasoning model built from the ground up with 21 billion parameters. Designed for general conversation, coding support, instruction following, and even function calling, this model is crafted to serve as a practical foundation for a wide variety of applications. The training process incorporates a mix of publicly accessible and synthetic datasets, followed by careful instruction tuning and reinforcement learning using REINFORCE Leave One-Out (RLOO) methods. This deliberate approach aims to strike a balance between capability and efficiency, positioning Reka Flash 3 as a sensible choice among its peers.

From a technical standpoint, Reka Flash 3 offers several features that make it both versatile and resource-efficient. One notable aspect is its ability to handle a context length of up to 32k tokens, which facilitates the processing of lengthy documents and complex tasks without undue strain. The model also incorporates a “budget forcing” mechanism through designated <reasoning> tags. This feature enables users to limit the model’s thinking process to a set number of steps, thereby ensuring consistent performance without excessive computational overhead. Moreover, Reka Flash 3 is well-suited for on-device deployments, offering a full-precision size of 39GB (fp16) that can be further compressed to 11GB via 4-bit quantization. Such flexibility allows for smoother, local deployments when compared to larger, more resource-intensive models.
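To make the budget forcing idea concrete, here is a minimal sketch in Python. It assumes the RekaAI/reka-flash-3 checkpoint id on Hugging Face and a plain transformers generation loop; the prompt format and the stopping behavior are illustrative assumptions, not Reka's documented API.

```python
# Minimal sketch of bounding the model's "thinking" budget, assuming the
# RekaAI/reka-flash-3 checkpoint and a standard transformers generation loop.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RekaAI/reka-flash-3"  # assumed Hugging Face checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "How many prime numbers are there below 50?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Capping max_new_tokens is one simple way to bound the reasoning budget;
# a stricter variant would count reasoning steps and stop earlier.
output = model.generate(**inputs, max_new_tokens=512)
text = tokenizer.decode(output[0], skip_special_tokens=False)  # keep the tags visible

# If the budget ran out mid-thought, force-close the reasoning block so the
# final answer can be parsed consistently downstream.
if "<reasoning>" in text and "</reasoning>" not in text:
    text += "</reasoning>"
print(text)
```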
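The 39GB-to-11GB figure likewise maps onto the standard 4-bit loading path in the Hugging Face stack. A hedged sketch, assuming the bitsandbytes integration works with this checkpoint as it does with most causal language models:

```python
# Hedged sketch: load the model with 4-bit weights via bitsandbytes, the kind
# of quantization that shrinks a ~39GB fp16 model toward ~11GB.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit weights
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for stability
)

model = AutoModelForCausalLM.from_pretrained(
    "RekaAI/reka-flash-3",  # assumed Hugging Face checkpoint id
    quantization_config=quant_config,
    device_map="auto",      # place layers across available GPU/CPU memory
)
tokenizer = AutoTokenizer.from_pretrained("RekaAI/reka-flash-3")
```

On a single consumer GPU with roughly 12GB or more of memory, a configuration like this is typically the difference between the model fitting locally or not.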

Evaluation metrics and performance data reinforce the model’s practicality. For example, while Reka Flash 3 shows a modest MMLU-Pro score of 65.0, it remains competitive when paired with supplementary knowledge sources like web search. Additionally, its multilingual capabilities are reflected in an 83.2 COMET score on WMT’23, indicating a reasonable level of support for non-English inputs despite its primary focus on English. These results, combined with its efficient parameter count relative to peers such as QwQ-32B, highlight its potential for a range of real-world applications without resorting to overblown claims.

In summary, Reka Flash 3 represents a thoughtful step toward more accessible AI solutions. By carefully balancing performance with efficiency, it provides a robust yet adaptable model suitable for general chat, coding, and instruction tasks. Its compact design, enhanced by a 32k token context window and innovative budget forcing mechanism, makes it a practical option for on-device deployments and low-latency applications. For researchers and developers looking for a model that is both capable and manageable, Reka Flash 3 offers a promising foundation that aligns with practical needs without excessive fanfare.


Check out the model on Hugging Face and the technical details. All credit for this research goes to the researchers of this project. Also, feel free to follow us on Twitter and don’t forget to join our 80k+ ML SubReddit.

Meet Parlant: An LLM-first conversational AI framework designed to provide developers with the control and precision they need over their AI customer service agents, utilizing behavioral guidelines and runtime supervision. It’s operated using an easy-to-use CLI and native client SDKs in Python and TypeScript.

The post Reka AI Open Sourced Reka Flash 3: A 21B General-Purpose Reasoning Model that was Trained from Scratch appeared first on MarkTechPost.
