TechCrunch News, May 15, 00:06
Stability AI releases an audio-generating model that can run on smartphones
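Stability AI has released Stable Audio Open Small, a "stereo" audio-generating AI model that the company claims is the fastest on the market and efficient enough to run on smartphones. Developed in collaboration with chipmaker Arm, the model is designed for quickly generating short audio samples and sound effects, and can produce up to 11 seconds of audio on a smartphone in less than 8 seconds. Its training set is made up entirely of songs from royalty-free audio libraries, but the model only supports English prompts and cannot generate realistic vocals or high-quality songs. Developers and organizations with more than $1 million in annual revenue must purchase Stability's enterprise license.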

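🚀 Stable Audio Open Small was developed jointly by Stability AI and Arm. It is optimized for Arm CPUs, has 341 million parameters, and is meant for quickly generating short audio samples and sound effects on smartphones.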

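🎵 Stability says the model's training set comes entirely from the royalty-free audio libraries Free Music Archive and Freesound, which avoids copyright risk. That sets it apart from other audio-generating AI models such as Suno and Udio.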

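📱 Stable Audio Open Small can produce up to 11 seconds of audio on a smartphone in less than 8 seconds. It only supports English prompts, cannot generate realistic vocals or high-quality songs, and is limited by a training-data bias toward Western musical styles.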

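💰 The model's usage terms are somewhat restrictive. Researchers, hobbyists, and businesses with less than $1 million in annual revenue can use it for free, but developers and organizations with more than $1 million in annual revenue must purchase Stability's enterprise license.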

AI startup Stability AI has released Stable Audio Open Small, a “stereo” audio-generating AI model that the company claims is the fastest on the market — and efficient enough to run on smartphones.

Stable Audio Open Small is the fruit of a collaboration between Stability AI and Arm, the chipmaker that produces many of the processors inside tablets, phones, and other mobile devices. While a number of AI-powered apps can generate audio, like Suno and Udio, most rely on cloud processing, meaning that they can’t be used offline.

Stability also claims that Stable Audio Open Small’s training set is made up entirely of songs from the royalty-free audio libraries Free Music Archive and Freesound. That’s as opposed to the training sets of the aforementioned Suno and Udio, which reportedly contain copyrighted content, posing an IP risk.

Stable Audio Open Small is 341 million parameters in size and optimized to run on Arm CPUs. (Parameters, sometimes referred to as weights, are the internal components of a model that guide its behavior.) Designed for quickly generating short audio samples and sound effects (e.g., drum and instrument riffs), Stable Audio Open Small can produce up to 11 seconds of audio on a smartphone in less than 8 seconds, claims Stability AI.
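To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above (341 million parameters, up to 11 seconds of audio in less than 8 seconds). The bytes-per-parameter values are illustrative assumptions, since the article does not say which numeric precision the model ships in, and the function names are hypothetical.

```python
# Figures quoted in the article.
PARAMS = 341_000_000          # model size in parameters
AUDIO_SECONDS = 11.0          # maximum clip length
GEN_SECONDS_UPPER = 8.0       # "less than 8 seconds" to generate

# Assumed precisions (not stated in the article), in bytes per parameter.
ASSUMED_PRECISIONS = {"float32": 4, "float16": 2, "int8": 1}


def min_speedup(audio_s: float, gen_s_upper: float) -> float:
    """Lower bound on how much faster than real time generation runs."""
    return audio_s / gen_s_upper


def weight_megabytes(params: int, bytes_per_param: int) -> float:
    """Approximate size of the weights alone, in megabytes (10^6 bytes)."""
    return params * bytes_per_param / 1e6


speedup = min_speedup(AUDIO_SECONDS, GEN_SECONDS_UPPER)
sizes = {name: weight_megabytes(PARAMS, b) for name, b in ASSUMED_PRECISIONS.items()}

print(f"At least {speedup:.3f}x faster than real time")
for name, mb in sizes.items():
    print(f"{name}: roughly {mb:.0f} MB of weights")
```

Under these assumptions, the claimed speed works out to at least 1.375 times real time, and the weights alone would occupy somewhere between a few hundred megabytes and about 1.4 GB depending on precision. Actual memory use on a phone would also include activations and runtime overhead, which this sketch ignores.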

The model isn’t without its limitations. Stable Audio Open Small only supports prompts written in English, and Stability notes in its documentation that the model can’t generate realistic vocals or high-quality songs. The model also doesn’t perform equally well across musical styles, Stability warns — a consequence of its Western-biased training data.

In another potential wrinkle for devs, Stable Audio Open Small has somewhat restrictive usage terms. It’s free to use for researchers, hobbyists, and businesses with less than $1 million in annual revenue, but developers and organizations making over $1 million in revenue have to pay for Stability’s enterprise license.

Stability, the beleaguered firm behind the popular image generation model Stable Diffusion, raised new cash last year as investors, including Eric Schmidt and Napster founder Sean Parker, sought to turn the business around. Emad Mostaque, Stability’s co-founder and ex-CEO, reportedly mismanaged Stability into financial ruin, leading staff to resign, a partnership with Canva to fall through, and investors to grow concerned about the company’s prospects.

In the last few months, Stability has hired a new CEO, appointed Titanic director James Cameron to its board of directors, and released several new image generation models.
