MarkTechPost@AI · July 9, 09:08
Hugging Face Releases SmolLM3: A 3B Long-Context, Multilingual Reasoning Model

Hugging Face has released SmolLM3, a new “Smol” language model designed to deliver strong multilingual, long-context reasoning with a compact 3-billion-parameter architecture. While most models with strong long-context capability exceed 7B parameters, SmolLM3 matches state-of-the-art performance with far fewer parameters, making it more cost-effective and easier to deploy on constrained hardware while retaining strengths in tool use, multi-step reasoning, and language diversity. The model supports sequences of up to 128k tokens and performs well across a range of benchmarks, particularly multilingual question answering and mathematical reasoning. The release gives applications that need an economical yet capable language model a new option, for example chatbots, document summarization, and edge deployment.

🌍 SmolLM3 is a compact, multilingual, dual-mode long-context language model that can handle sequences of up to 128k tokens, which is essential for tasks that require understanding long documents, logs, or structured records.

🛠️ The model supports dual-mode reasoning: instruction following for chat and tool-augmented tasks, plus multilingual question answering and generation. This lets SmolLM3 handle both open-ended generation and structured reasoning across a wide range of applications.

🗣️ SmolLM3 is multilingual, supporting English, French, Spanish, German, Italian, and Portuguese. It performs well on benchmarks such as XQuAD and MGSM, demonstrating generalization across language boundaries.

⚖️ Despite having only 3 billion parameters, SmolLM3 achieves performance close to, and in some cases better than, larger models such as Mistral-7B on several downstream tasks, thanks to the scale and quality of its training data and careful architectural tuning.

⚙️ The model performs well on tool-calling tasks, following schema-based input/output constraints and interfacing smoothly with systems that require deterministic behavior, such as autonomous agents and API-driven environments.

Hugging Face just released SmolLM3, the latest version of its “Smol” language models, designed to deliver strong multilingual reasoning over long contexts using a compact 3B-parameter architecture. While models with this level of long-context capability typically push beyond 7B parameters, SmolLM3 offers state-of-the-art (SoTA) performance with significantly fewer parameters, making it more cost-efficient and deployable on constrained hardware without compromising on capabilities like tool usage, multi-step reasoning, and language diversity.

Overview of SmolLM3

SmolLM3 stands out as a compact, multilingual, and dual-mode long-context language model capable of handling sequences up to 128k tokens. It was trained on 11 trillion tokens, positioning it competitively against models like Mistral, LLaMA 2, and Falcon. Despite its size, SmolLM3 achieves surprisingly strong tool usage performance and few-shot reasoning ability—traits more commonly associated with models double or triple its size.

SmolLM3 was released in two variants: a base model, SmolLM3-3B-Base, and an instruction-tuned model, SmolLM3-3B-Instruct.

Both models are publicly available under the Apache 2.0 license on Hugging Face’s Model Hub.
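
For readers who want to try the released checkpoints, a minimal loading sketch with the transformers library is shown below. The repository ID is an assumption based on Hugging Face’s usual naming conventions; check the Model Hub for the exact identifiers of the base and instruct variants.

```python
# Minimal sketch: loading a SmolLM3 checkpoint with transformers.
# The repo ID below is an assumed name; verify it on the Hugging Face Model Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM3-3B"  # assumed instruct-variant repo name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

prompt = "List three use cases for a 3B-parameter long-context language model."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```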

Key Features

1. Long Context Reasoning (up to 128k tokens)
SmolLM3 utilizes a modified attention mechanism to efficiently process extremely long contexts—up to 128,000 tokens. This capability is crucial for tasks involving extended documents, logs, or structured records where context length directly affects comprehension and accuracy.
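
As a rough illustration of how such a window would be used in practice, the sketch below feeds an entire document to the model and asks a question about it. The file name and repository ID are placeholders, and a prompt this long still requires enough GPU memory for the KV cache.

```python
# Sketch: question answering over a long document within the 128k-token window.
# "long_report.txt" and the repo ID are placeholders; memory use grows with length.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM3-3B"  # assumed repo name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

with open("long_report.txt") as f:
    document = f.read()

prompt = f"{document}\n\nQuestion: What are the report's three main findings?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
print(f"Prompt length: {inputs['input_ids'].shape[1]} tokens")  # should stay under ~128k

outputs = model.generate(**inputs.to(model.device), max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```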

2. Dual Mode Reasoning
The instruction-tuned SmolLM3-3B supports dual-mode reasoning: instruction following for chat-style and tool-augmented tasks, and multilingual question answering and generation.

This bifurcation allows the model to excel in both open-ended generation and structured reasoning, making it suitable for applications ranging from RAG pipelines to agent workflows.
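
A chat-style call can go through the tokenizer’s chat template, as in the hedged sketch below; the message format here is the generic transformers convention rather than a documented SmolLM3-specific schema, and the repo ID is again an assumption.

```python
# Sketch: instruction following via the tokenizer's chat template (generic
# transformers convention; not a documented SmolLM3-specific schema).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM3-3B"  # assumed repo name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [
    {"role": "system", "content": "You are a concise technical assistant."},
    {"role": "user", "content": "Explain retrieval-augmented generation in one paragraph."},
]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(input_ids.to(model.device), max_new_tokens=200)
print(tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True))
```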

3. Multilingual Capabilities
Trained on a multilingual corpus, SmolLM3 supports six languages: English, French, Spanish, German, Italian, and Portuguese. It performs well on benchmarks like XQuAD and MGSM, demonstrating its ability to generalize across linguistic boundaries with minimal performance drop.
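
To make the multilingual claim concrete, the same question can be posed in several of the supported languages using the same chat-template pattern; the prompts below are illustrative only.

```python
# Sketch: the same question in three of the six supported languages.
# Repo ID assumed; prompts are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM3-3B"  # assumed repo name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

questions = {
    "English": "What is the capital of Portugal?",
    "French": "Quelle est la capitale du Portugal ?",
    "Spanish": "¿Cuál es la capital de Portugal?",
}
for lang, question in questions.items():
    input_ids = tokenizer.apply_chat_template(
        [{"role": "user", "content": question}], add_generation_prompt=True, return_tensors="pt"
    )
    out = model.generate(input_ids.to(model.device), max_new_tokens=32)
    print(lang, "->", tokenizer.decode(out[0][input_ids.shape[1]:], skip_special_tokens=True))
```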

4. Compact Size with SoTA Performance
At just 3 billion parameters, SmolLM3 achieves performance close to or on par with larger models such as Mistral-7B on multiple downstream tasks. This is made possible by the scale and quality of its training data (11T tokens) and careful architectural tuning.
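
One way to make the efficiency argument concrete is weight memory: a back-of-the-envelope estimate in bf16 (2 bytes per parameter), ignoring activations and the KV cache, shows how much smaller the deployment footprint is.

```python
# Back-of-the-envelope weight memory in bf16 (2 bytes per parameter), ignoring
# activations and KV cache; parameter counts are nominal model sizes.
def weight_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    return params_billion * 1e9 * bytes_per_param / 1024**3

for name, size in [("SmolLM3-3B", 3.0), ("Mistral-7B", 7.0)]:
    print(f"{name}: ~{weight_gb(size):.1f} GB of weights in bf16")
# SmolLM3-3B: ~5.6 GB vs Mistral-7B: ~13.0 GB, so the 3B model fits on far smaller GPUs.
```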

5. Tool Use and Structured Outputs
The model demonstrates impressive performance on tool-calling tasks—both in prompt-based workflows and with structured outputs. It correctly follows schema-driven input-output constraints and interfaces well with systems requiring deterministic behavior, such as autonomous agents and API-driven environments.
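
As an illustration of schema-driven constraints, the sketch below validates a model-emitted tool call against a simple schema. The tool definition, JSON shape, and example output are hypothetical and only demonstrate the kind of deterministic interface described above, not SmolLM3’s documented tool-calling format.

```python
# Sketch: validating a model-emitted tool call against a simple schema.
# The tool, JSON shape, and example string are hypothetical illustrations.
import json

TOOL_SCHEMA = {
    "name": "get_weather",
    "required_args": {"city": str, "unit": str},
}

def parse_tool_call(raw: str) -> dict:
    """Parse and validate a JSON tool call of the form {"name": ..., "arguments": {...}}."""
    call = json.loads(raw)
    if call.get("name") != TOOL_SCHEMA["name"]:
        raise ValueError(f"unexpected tool: {call.get('name')}")
    args = call.get("arguments", {})
    for key, expected_type in TOOL_SCHEMA["required_args"].items():
        if key not in args or not isinstance(args[key], expected_type):
            raise ValueError(f"missing or mistyped argument: {key}")
    return call

# Example: a string the model might emit when asked to call the weather tool.
raw_output = '{"name": "get_weather", "arguments": {"city": "Paris", "unit": "celsius"}}'
print(parse_tool_call(raw_output))
```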

Technical Training Details

SmolLM3 was trained on an internal mixture curated by Hugging Face, consisting of high-quality web content, code, academic papers, and multilingual sources. The 11T-token training run was done using multi-node distributed training strategies on GPU clusters, employing optimizations like Flash Attention v2 for efficient long-sequence training. The tokenizer is a 128k-token SentencePiece model, shared across all supported languages.
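
On the inference side, transformers exposes a similar switch for the FlashAttention-2 kernel when loading a model. This is generic transformers usage (it requires the flash-attn package and a supported GPU), not a SmolLM3-specific requirement, and the repo ID is assumed as before.

```python
# Sketch: requesting the FlashAttention-2 kernel at load time (generic
# transformers option; needs the flash-attn package and a compatible GPU).
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "HuggingFaceTB/SmolLM3-3B",              # assumed repo name
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
    device_map="auto",
)
```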

For long-context support, Hugging Face employed linear and grouped attention mechanisms that reduce the quadratic cost of full attention while retaining performance. This enabled the model to handle context lengths up to 128k during both training and inference, without the memory bottlenecks that plague dense transformers at this scale.
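
Grouped attention of this kind is commonly implemented as grouped-query attention (GQA), in which several query heads share one key/value head and the KV cache shrinks accordingly. The toy sketch below shows only that head-sharing shape trick; the dimensions are arbitrary and are not SmolLM3’s actual configuration.

```python
# Toy sketch of grouped-query attention (GQA): n_q_heads query heads share
# n_kv_heads key/value heads, shrinking the KV cache. Dimensions are arbitrary.
import torch
import torch.nn.functional as F

batch, seq, head_dim = 1, 16, 64
n_q_heads, n_kv_heads = 8, 2                 # 4 query heads share each KV head
group = n_q_heads // n_kv_heads

q = torch.randn(batch, n_q_heads, seq, head_dim)
k = torch.randn(batch, n_kv_heads, seq, head_dim)
v = torch.randn(batch, n_kv_heads, seq, head_dim)

# Expand each KV head so it is reused by its group of query heads.
k = k.repeat_interleave(group, dim=1)        # -> (batch, n_q_heads, seq, head_dim)
v = v.repeat_interleave(group, dim=1)

out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)                             # torch.Size([1, 8, 16, 64])
```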

The SmolLM3-3B instruction-tuned variant was further trained using Hugging Face’s trlx library for alignment with chat instructions, reasoning tasks, and tool usage demonstrations.

Performance Benchmarks

SmolLM3 performs strongly on multiple multilingual and reasoning benchmarks, including multilingual question answering (XQuAD) and mathematical reasoning (MGSM).

While it does not surpass the latest 7B and 13B models on every benchmark, SmolLM3’s performance-to-parameter ratio remains one of the highest in its class.

Use Cases and Applications

SmolLM3 is particularly suited for cost-sensitive applications such as chatbots, summarization of long documents, RAG pipelines, tool-using agent workflows, and edge deployment on constrained hardware.

Conclusion

SmolLM3 exemplifies a new generation of small-yet-capable language models. Its combination of multilingual support, long-context handling, and strong reasoning—all within a 3B parameter footprint—marks a significant step forward in model efficiency and accessibility. Hugging Face’s release demonstrates that with the right training recipe and architectural design, smaller models can still deliver robust performance in complex tasks traditionally reserved for much larger LLMs.


Check out the SmolLM3-3B-Base and SmolLM3-3B-Instruct models. All credit for this research goes to the researchers of this project. Also, feel free to follow us on Twitter and YouTube, and don’t forget to join our 100k+ ML SubReddit and subscribe to our Newsletter.

The post Hugging Face Releases SmolLM3: A 3B Long-Context, Multilingual Reasoning Model appeared first on MarkTechPost.
