Unite.AI · January 7
Google is Making AI Training 28% Faster by Using SLMs as Teachers

Google has introduced SALT, a method that uses a smaller AI model as a teacher to make large language model training more efficient. The approach combines two stages, knowledge distillation and self-supervised learning, cutting training time, improving performance, and lowering the barrier to AI development in a way that could reshape the AI landscape.

🎓 SALT trains in two stages, with a small model first acting as a teacher that shares its knowledge

💪 The large model then learns independently, mastering complex patterns and tasks

📈 Training time drops by 28%, with significant performance gains

🌐 A lower barrier to AI development lets more organizations participate

Training large language models (LLMs) has become out of reach for most organizations. With costs running into the millions and compute requirements that would make a supercomputer sweat, AI development has remained locked behind the doors of tech giants. But Google just flipped this story on its head with an approach so simple it makes you wonder why no one thought of it sooner: using smaller AI models as teachers.

How SALT works: A new approach to training AI models

In a recent research paper titled “A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs,” Google Research and DeepMind introduced SALT (Small model Aided Large model Training), a novel method that challenges the traditional approach to training LLMs.

Why is this research significant? Currently, training large AI models is like trying to teach someone everything they need to know about a subject all at once: inefficient, expensive, and often restricted to organizations with massive computing resources. SALT takes a different path, introducing a two-stage training process that is both innovative and practical.

Breaking down how SALT actually works:

Stage 1: Knowledge Distillation. The smaller model acts as a teacher, sharing its understanding with the larger model through the soft labels described below.

Stage 2: Self-Supervised Learning. The larger model transitions to learning on its own, mastering the complex patterns and tasks the smaller model cannot teach.

In non-technical terms, imagine the smaller AI model as a helpful tutor who guides the larger model through the early stages of training. Along with its answers, this tutor indicates how confident it is in each one. This extra information, known as “soft labels,” helps the larger model learn more quickly and effectively.
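
To make the soft-label idea concrete, here is a minimal sketch of how this kind of distillation loss is commonly implemented, assuming PyTorch. The function name, temperature, and mixing weight `alpha` are illustrative assumptions, not details taken from the SALT paper:

```python
# A minimal sketch of soft-label knowledge distillation (Stage 1).
# Hyperparameters here are illustrative, not values from the SALT paper.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      targets: torch.Tensor,
                      temperature: float = 2.0,
                      alpha: float = 0.5) -> torch.Tensor:
    # Soft labels: the teacher's full, temperature-softened distribution,
    # which encodes how confident the teacher is about each option.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    # The KL term pulls the student's distribution toward the teacher's;
    # the temperature**2 factor is the standard gradient rescaling.
    distill = F.kl_div(log_student, soft_targets,
                       reduction="batchmean") * temperature ** 2
    # Ordinary cross-entropy on the ground-truth next tokens.
    hard = F.cross_entropy(student_logits, targets)
    return alpha * distill + (1.0 - alpha) * hard
```

The key point is that the soft targets carry more signal per example than a single correct answer, which is what lets the larger model learn faster early on.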

Now, as the larger AI model becomes more capable, it needs to transition from relying on the tutor to learning independently. This is where “linear decay” and “linear ratio decay” come into play. Both techniques gradually reduce the tutor's influence over time, ensuring a smooth transition and preventing any sudden changes in the larger model's learning behavior.
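
As a rough illustration of the decay idea, here is a sketch of a linearly decaying weight on the teacher's contribution, assuming a single scalar schedule; the exact schedules used in the paper may differ:

```python
# A sketch of the "fade the teacher out" idea: a weight on the teacher's
# contribution that decays linearly from 1.0 to 0.0 over the distillation
# phase. The function name and schedule length are illustrative assumptions.
def distill_weight(step: int, distill_steps: int) -> float:
    """1.0 at step 0, linearly down to 0.0 at `distill_steps`, and 0.0
    afterwards, at which point training is purely self-supervised."""
    if step >= distill_steps:
        return 0.0
    return 1.0 - step / distill_steps

# Illustrative use inside a training loop:
#   w = distill_weight(step, distill_steps=10_000)
#   loss = w * distillation_term + (1.0 - w) * self_supervised_term
```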

The results are compelling. When Google researchers tested SALT using a 1.5 billion parameter SLM to train a 2.8 billion parameter LLM on the Pile dataset, they saw a 28% reduction in training time along with significant performance gains.

But what makes SALT truly innovative is its theoretical framework. The researchers discovered that even a “weaker” teacher model can enhance the student's performance by achieving what they call a “favorable bias-variance trade-off.” In simpler terms, the smaller model helps the larger one learn fundamental patterns more efficiently, creating a stronger foundation for advanced learning.
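
One loose, schematic way to read that claim is through the textbook decomposition; to be clear, this is a reading aid, not the paper's formal result:

```latex
% Schematic only: the standard bias-variance decomposition, read in the
% distillation setting. This is not the paper's formal bound.
\mathbb{E}[\text{student error}] \;\approx\;
  \underbrace{\text{bias}^2}_{\text{inherited from the weaker teacher}}
  \;+\;
  \underbrace{\text{variance}}_{\text{reduced by learning from soft labels}}
```

Early in training, the variance reduction from soft labels dominates; once the student outgrows the teacher, the inherited bias would start to hurt, which lines up with why SALT fades the teacher out.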

Why SALT could reshape the AI development playing field

Remember when cloud computing transformed who could start a tech company? SALT might just do the same for AI development.

I have been following AI training innovations for years, and most breakthroughs have mainly benefited the tech giants. But SALT is different.

Here is what it could mean for the future:

For organizations with limited resources: the barrier to entry drops sharply, putting custom model training within reach of teams without a tech giant's compute budget.

For the AI development landscape: a wider pool of participants means more diverse ideas, applications, and specialized models.

What this means for the future

By using small models as teachers, we are not just making AI training more efficient – we are also fundamentally changing who gets to participate in AI development. The implications go far beyond just technical improvements.

Key takeaways to keep in mind:

  1. SALT trains large models in two stages: knowledge distillation from a smaller teacher, followed by independent self-supervised learning
  2. In Google's experiments, the approach cut training time by 28% while improving performance
  3. Even a “weaker” teacher can strengthen a larger student by giving it a better foundation of fundamental patterns

What to watch for:

  1. Keep an eye on smaller organizations starting to develop custom AI models
  2. Watch for new applications in fields that previously could not afford AI development
  3. Look for innovations in how smaller models are used for specialized tasks

Remember: The real value of SALT is in how it might reshape who gets to innovate in AI. Whether you are running a research lab, managing a tech team, or just interested in AI development, this is the kind of breakthrough that could make your next big idea possible.

Maybe start thinking about that AI project you thought was out of reach. It might be more possible than you imagined.
