TechCrunch News, May 1, 00:56
JetBrains releases Mellum, an ‘open’ AI coding model

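JetBrains has released Mellum, its first “open” AI model for code generation. The model has 4 billion parameters, was trained on more than 4 trillion tokens, and is designed specifically for code completion. Mellum is suited to integration into professional developer tools, AI coding assistants, and research on code understanding and generation, as well as educational applications and fine-tuning experiments. JetBrains trained Mellum on data sets including permissively licensed code from GitHub and English-language Wikipedia articles, which took about 20 days on a cluster of 256 H200 Nvidia GPUs. Mellum must be fine-tuned before it can be used, and JetBrains stresses that its Python fine-tuned models are intended only for estimating potential capabilities, not for production environments. AI-generated code also brings new security challenges: more than 50% of organizations run into security issues when using AI-generated code.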

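🤖 JetBrains has released Mellum, a 4-billion-parameter AI model for code generation that was trained on more than 4 trillion tokens and is built specifically for code completion.

📚 Mellum's training data includes permissively licensed code from GitHub and English-language Wikipedia articles. Training took about 20 days on a cluster of 256 H200 Nvidia GPUs.

⚠️ JetBrains cautions that Mellum may reflect biases present in public codebases, and that its code suggestions are not necessarily secure or free of vulnerabilities. Users should therefore keep potential security risks in mind.

👨‍💻 Mellum is designed for integration into professional developer tools, such as intelligent code suggestions in integrated development environments, as well as for AI-powered coding assistants and academic research on code understanding and generation. It is also suited to educational applications and fine-tuning experiments.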

JetBrains, the company behind a range of popular app development tools, has released its first “open” AI model for coding.

On Wednesday, JetBrains made Mellum, a code-generating model the company released for its various software development suites last year, openly available on the AI dev platform Hugging Face. Mellum, trained on more than 4 trillion tokens, weighs in at 4 billion parameters, and is designed specifically for code completion (i.e. completing code snippets based on the surrounding context).
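
To make "surrounding context" concrete, here is a minimal sketch of the code completion setup, not Mellum's actual interface. The helper names `split_context` and `apply_completion` are hypothetical, and the suggestion is a hard-coded stand-in for whatever a completion model would return.

```python
def split_context(source: str, cursor: int) -> tuple[str, str]:
    """Split source text at a cursor offset into (before, after) context."""
    if not 0 <= cursor <= len(source):
        raise ValueError("cursor must lie within the source text")
    return source[:cursor], source[cursor:]


def apply_completion(before: str, completion: str, after: str) -> str:
    """Insert a suggested completion between the two context halves."""
    return before + completion + after


# The cursor sits right after "return ", where the developer stopped typing.
source = "def area(radius):\n    return \n"
cursor = source.index("return ") + len("return ")
before, after = split_context(source, cursor)

# Hard-coded stand-in for a model's suggestion (assumption, not model output).
suggestion = "3.14159 * radius ** 2"
completed = apply_completion(before, suggestion, after)
print(completed)
```

A completion model sees the text on both sides of the cursor and proposes only the missing middle.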

Parameters roughly correspond to a model’s problem-solving skills, while tokens are the raw bits of data that a model processes. A million tokens is roughly equivalent to 30,000 lines of code.
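
As a rough sense of scale, the rule of thumb above can be applied to the 4 trillion token figure. This is only a back-of-the-envelope sketch: the function name `tokens_to_lines` is made up for illustration, the training set also includes non-code text such as Wikipedia articles, and "more than 4 trillion" makes the result a lower bound at best.

```python
# Rule of thumb quoted in the article: 1 million tokens ~ 30,000 lines of code.
LINES_PER_MILLION_TOKENS = 30_000


def tokens_to_lines(tokens: int) -> float:
    """Convert a token count to an approximate line-of-code equivalent."""
    return tokens / 1_000_000 * LINES_PER_MILLION_TOKENS


training_tokens = 4_000_000_000_000  # "more than 4 trillion tokens"
approx_lines = tokens_to_lines(training_tokens)
print(f"{approx_lines:,.0f} lines of code (approximate equivalent)")
```

Under these assumptions the training data is on the order of 120 billion lines of code.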

“Designed for integration into professional developer tooling (e.g. intelligent code suggestions in integrated developer environments), AI-powered coding assistants, and research on code understanding and generation, Mellum is also well-suited for educational applications and fine-tuning experiments,” explains JetBrains in a technical report.

JetBrains says that it trained Mellum, which is Apache 2.0-licensed, on a collection of data sets including permissively licensed code from GitHub and English-language Wikipedia articles. Training took around 20 days on a cluster of 256 H200 Nvidia GPUs.
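
The training figures can be turned into a rough compute budget. The sketch below assumes all 256 GPUs ran around the clock for the full 20 days, which the article does not state; the function name `gpu_hours` is illustrative.

```python
def gpu_hours(num_gpus: int, days: float) -> float:
    """Total GPU-hours, assuming every GPU runs 24 hours a day."""
    return num_gpus * days * 24


# Figures from the article: 256 H200 GPUs for "around 20 days".
total = gpu_hours(num_gpus=256, days=20)
print(f"{total:,.0f} GPU-hours (approximate)")
```

Under that assumption, the run comes to roughly 123,000 GPU-hours.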

Mellum takes some work to get up and running. The base model can’t be used out of the box; it has to be fine-tuned first. While JetBrains has provided a few Mellum models fine-tuned for Python, the company cautions that they’re meant for “estimation about potential capabilities,” not for deployment in a production environment.

AI-generated code is no doubt changing how software is built, but it’s also introducing new security challenges. More than 50% of organizations encounter security issues with AI-produced code sometimes or frequently, according to a late-2023 survey by developer security platform Snyk.

Indeed, JetBrains notes that Mellum may “reflect biases present in public codebases” (e.g. generating code similar in style to open source repositories), and that its code suggestions won’t necessarily be “secure or free of vulnerabilities.”

“This is just the beginning,” JetBrains wrote in a blog post. “We’re not chasing generality — we’re building focus. If Mellum sparks even one meaningful experiment, contribution, or collaboration, we would consider it a win.”
