MarkTechPost@AI · July 14, 21:51
Liquid AI Open-Sources LFM2: A New Generation of Edge LLMs

Liquid AI has released its second-generation Liquid Foundation Models (LFM2), marking a major leap forward for edge computing. The LFM2 series of generative AI models is designed specifically for on-device deployment, delivering unprecedented performance optimizations while maintaining competitive quality. The new models show significant gains in both inference speed and training efficiency, bringing millisecond latency, offline operation, and data privacy to a wide range of devices and accelerating the adoption of edge AI.

🚀 LFM2 sets a new benchmark for edge AI with substantial efficiency gains across the board. Compared to Qwen3 on CPU architectures, decode and prefill speeds are 2x faster, which is critical for real-time applications.

⚙️ LFM2 uses an innovative hybrid architecture that combines the strengths of convolution and attention. It consists of 16 blocks: 10 double-gated short-range convolution blocks and 6 grouped query attention (GQA) blocks. This hybrid approach draws on Liquid AI's pioneering work on Liquid Time-constant Networks (LTCs).

📏 LFM2 comes in three sizes: 350M, 700M, and 1.2B parameters, each optimized for different deployment scenarios while retaining the core efficiency benefits. All models were trained on 10 trillion tokens from a carefully curated pre-training corpus of roughly 75% English, 20% multilingual content, and 5% code.

🥇 Evaluation results show that LFM2 significantly outperforms similarly sized models across multiple benchmark categories. LFM2-1.2B performs on par with Qwen3-1.7B despite having 47% fewer parameters. LFM2-700M outperforms Gemma 3 1B IT, while the smallest LFM2-350M checkpoint remains competitive with Qwen3-0.6B and Llama 3.2 1B Instruct.

What is included in this article:
Performance breakthroughs – 2x faster inference and 3x faster training
Technical architecture – Hybrid design with convolution and attention blocks
Model specifications – Three size variants (350M, 700M, 1.2B parameters)
Benchmark results – Superior performance compared to similar-sized models
Deployment optimization – Edge-focused design for various hardware
Open-source accessibility – Apache 2.0-based licensing
Market implications – Impact on edge AI adoption

The landscape of on-device artificial intelligence has taken a significant leap forward with Liquid AI’s release of LFM2, their second-generation Liquid Foundation Models. This new series of generative AI models represents a paradigm shift in edge computing, delivering unprecedented performance optimizations specifically designed for on-device deployment while maintaining competitive quality standards.

Revolutionary Performance Gains

LFM2 establishes new benchmarks in the edge AI space by achieving remarkable efficiency improvements across multiple dimensions. The models deliver 2x faster decode and prefill performance compared to Qwen3 on CPU architectures, a critical advancement for real-time applications. Perhaps more impressively, the training process itself has been optimized to achieve 3x faster training compared to the previous LFM generation, making LFM2 the most cost-effective path to building capable, general-purpose AI systems.

These performance improvements are not merely incremental but represent a fundamental breakthrough in making powerful AI accessible on resource-constrained devices. The models are specifically engineered to unlock millisecond latency, offline resilience, and data-sovereign privacy – capabilities essential for phones, laptops, cars, robots, wearables, satellites, and other endpoints that must reason in real time.

Hybrid Architecture Innovation

The technical foundation of LFM2 lies in its novel hybrid architecture that combines the best aspects of convolution and attention mechanisms. The model employs a sophisticated 16-block structure consisting of 10 double-gated short-range convolution blocks and 6 blocks of grouped query attention (GQA). This hybrid approach draws from Liquid AI’s pioneering work on Liquid Time-constant Networks (LTCs), which introduced continuous-time recurrent neural networks with linear dynamical systems modulated by nonlinear input interlinked gates.

At the core of this architecture is the Linear Input-Varying (LIV) operator framework, which enables weights to be generated on-the-fly from the input they are acting on. This allows convolutions, recurrences, attention, and other structured layers to fall under one unified, input-aware framework. The LFM2 convolution blocks implement multiplicative gates and short convolutions, creating linear first-order systems that converge to zero after a finite time.
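A minimal sketch of how such a double-gated short convolution block could look in PyTorch is shown below. It is an illustrative reading of the description above (multiplicative gates around a short causal convolution), not Liquid AI's published implementation; the kernel size, sigmoid gating, and projection layout are assumptions.

```python
import torch
import torch.nn as nn

class GatedShortConvBlock(nn.Module):
    """Sketch of a double-gated short-range convolution block (assumed layout)."""

    def __init__(self, d_model: int, kernel_size: int = 3):
        super().__init__()
        # A single projection produces the value stream plus two gates.
        self.in_proj = nn.Linear(d_model, 3 * d_model)
        # Short depthwise convolution over the sequence dimension.
        self.conv = nn.Conv1d(
            d_model, d_model, kernel_size,
            groups=d_model, padding=kernel_size - 1,
        )
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        v, g_in, g_out = self.in_proj(x).chunk(3, dim=-1)
        v = torch.sigmoid(g_in) * v              # input gate
        v = v.transpose(1, 2)                    # (batch, d_model, seq_len)
        v = self.conv(v)[..., : x.size(1)]       # truncate padding -> causal short conv
        v = v.transpose(1, 2)
        v = torch.sigmoid(g_out) * v             # output gate
        return self.out_proj(v)
```

Stacking ten such convolution blocks with six GQA attention blocks, in whatever interleaving the released architecture uses, yields the 16-block hybrid described above.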

The architecture selection process utilized STAR, Liquid AI’s neural architecture search engine, which was modified to evaluate language modeling capabilities beyond traditional validation loss and perplexity metrics. Instead, it employs a comprehensive suite of over 50 internal evaluations that assess diverse capabilities including knowledge recall, multi-hop reasoning, understanding of low-resource languages, instruction following, and tool use.

Comprehensive Model Lineup

LFM2 is available in three strategically sized configurations: 350M, 700M, and 1.2B parameters, each optimized for different deployment scenarios while maintaining the core efficiency benefits. All models were trained on 10 trillion tokens drawn from a carefully curated pre-training corpus comprising approximately 75% English, 20% multilingual content, and 5% code data sourced from web and licensed materials.

The training methodology incorporates knowledge distillation using the existing LFM1-7B as a teacher model, with cross-entropy between LFM2’s student outputs and the teacher outputs serving as the primary training signal throughout the entire 10T token training process. The context length was extended to 32k during pretraining, enabling the models to handle longer sequences effectively.
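As a concrete illustration of that training signal, the sketch below computes a token-level cross-entropy between the student's predicted distribution and the teacher's soft targets. The article does not specify a temperature, loss weighting, or any combination with a hard-label objective, so those are omitted here.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor) -> torch.Tensor:
    """Cross-entropy between the student distribution and the teacher's soft targets.

    Both tensors have shape (batch, seq_len, vocab_size). Details such as
    temperature scaling are assumptions left out of this sketch.
    """
    teacher_probs = F.softmax(teacher_logits, dim=-1)        # soft targets
    student_log_probs = F.log_softmax(student_logits, dim=-1)
    # H(teacher, student), averaged over batch and sequence positions.
    return -(teacher_probs * student_log_probs).sum(dim=-1).mean()
```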

Superior Benchmark Performance

Evaluation results demonstrate that LFM2 significantly outperforms similarly sized models across multiple benchmark categories. The LFM2-1.2B model performs competitively with Qwen3-1.7B despite having 47% fewer parameters. Similarly, LFM2-700M outperforms Gemma 3 1B IT, while the smallest LFM2-350M checkpoint remains competitive with Qwen3-0.6B and Llama 3.2 1B Instruct.

Beyond automated benchmarks, LFM2 demonstrates superior conversational capabilities in multi-turn dialogues. Using the WildChat dataset and LLM-as-a-Judge evaluation framework, LFM2-1.2B showed significant preference advantages over Llama 3.2 1B Instruct and Gemma 3 1B IT while matching Qwen3-1.7B performance despite being substantially smaller and faster.
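The sketch below shows the general shape of an LLM-as-a-Judge pairwise comparison. The actual judge model, prompt wording, and scoring rubric used in Liquid AI's WildChat evaluation are not described in this article; everything here is a generic illustration.

```python
# Generic LLM-as-a-Judge pairwise comparison sketch (illustrative, not Liquid AI's setup).
JUDGE_TEMPLATE = """You are an impartial judge. Given a user conversation and two
model responses, decide which response is better.

Conversation:
{conversation}

Response A:
{response_a}

Response B:
{response_b}

Answer with exactly "A", "B", or "tie"."""

def build_judge_prompt(conversation: str, response_a: str, response_b: str) -> str:
    """Fill the template for one pairwise comparison between two model outputs."""
    return JUDGE_TEMPLATE.format(
        conversation=conversation, response_a=response_a, response_b=response_b
    )
```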

Edge-Optimized Deployment

The models excel in real-world deployment scenarios, having been exported to multiple inference frameworks including PyTorch’s ExecuTorch and the open-source llama.cpp library. Testing on target hardware including Samsung Galaxy S24 Ultra and AMD Ryzen platforms demonstrates that LFM2 dominates the Pareto frontier for both prefill and decode inference speed relative to model size.
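For quick experimentation outside those runtimes, checkpoints released on Hugging Face can usually be loaded with the transformers library. The repo id below ("LiquidAI/LFM2-1.2B") is an assumption based on the model names used in this article; the actual identifier and any required loading flags may differ.

```python
# Hypothetical quick-start via Hugging Face transformers; repo id is assumed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LiquidAI/LFM2-1.2B"  # assumption based on the naming in this article
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Explain why on-device inference reduces latency."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```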

The strong CPU performance translates effectively to accelerators such as GPU and NPU after kernel optimization, making LFM2 suitable for a wide range of hardware configurations. This flexibility is crucial for the diverse ecosystem of edge devices that require on-device AI capabilities.

Conclusion

The release of LFM2 addresses a critical gap in the AI deployment landscape where the shift from cloud-based to edge-based inference is accelerating. By enabling millisecond latency, offline operation, and data-sovereign privacy, LFM2 unlocks new possibilities for AI integration across consumer electronics, robotics, smart appliances, finance, e-commerce, and education sectors.

The technical achievements represented in LFM2 signal a maturation of edge AI technology, where the trade-offs between model capability and deployment efficiency are being successfully optimized. As enterprises pivot from cloud LLMs to cost-efficient, fast, private, and on-premises intelligence, LFM2 positions itself as a foundational technology for the next generation of AI-powered devices and applications.

Check out the Technical Details and Model on Hugging Face. All credit for this research goes to the researchers of this project.
