AIhub · July 29, 18:13
Open-source Swiss language model to be released this summer

EPFL and ETH Zurich in Switzerland will soon release a large language model (LLM) developed on public infrastructure, marking a new milestone for open-source AI and multilingual capability. Trained on the "Alps" supercomputer, the model aims to provide a transparent, multilingual, and broadly accessible AI solution. Unlike many commercial models, it will be fully open source, including source code and model weights, with an emphasis on data transparency and reproducibility. The model supports over 1,000 languages and comes in 8-billion- and 70-billion-parameter versions to serve different user needs, while complying with Swiss and EU data protection and copyright regulations. The release is expected to drive innovation and accountability in AI research and to foster collaboration and adoption worldwide.

🌟 **Fully open source and transparent**: The LLM will be released fully open source, including source code and model weights, with transparent and reproducible training data. This strategy aims to foster broad adoption across science, government, education, and the private sector, and to encourage innovation and accountability. Transparent processes also support regulatory compliance.

🌍 **1,000+ languages and global applicability**: The model was designed from the start to be massively multilingual, handling more than 1,000 languages. Its training data spans text in over 1,500 languages, plus code and mathematics data, giving the model the broadest possible global applicability and breaking down language barriers.

🚀 **Two sizes, high performance**: The model will be released in 8-billion- and 70-billion-parameter versions to meet different users' needs; the 70-billion-parameter version will rank among the most powerful fully open models worldwide. Training on more than 15 trillion high-quality tokens gives the model high reliability, deep language understanding, and versatility across applications.

🛡️ **Responsible data practices and compliance**: Development strictly observes Swiss data protection law, Swiss copyright law, and the transparency obligations of the EU AI Act. Research shows that respecting web-crawling opt-outs during data acquisition causes virtually no performance degradation on most everyday tasks and general knowledge acquisition, reflecting a commitment to user privacy and data compliance.

💡 **Supercomputing for sovereign AI and open innovation**: The model is trained on the "Alps" supercomputer at the Swiss National Supercomputing Centre (CSCS), a platform with more than 10,000 NVIDIA Grace Hopper Superchips running on 100% carbon-neutral electricity. Built on a 15-year collaboration with NVIDIA and HPE/Cray, it shows how public research institutions and industry leaders can jointly drive sovereign infrastructure and open innovation, benefiting not only Switzerland but science and society worldwide.

Wes Cockx & Google DeepMind / AI large language models / Licensed under CC-BY 4.0

By Melissa Anchisi and Florian Meyer

This summer, EPFL and ETH Zurich will release a large language model (LLM) developed on public infrastructure. Trained on the “Alps” supercomputer at the Swiss National Supercomputing Centre (CSCS), the new LLM marks a milestone in open-source AI and multilingual excellence.

Earlier this month in Geneva, around 50 leading global initiatives and organisations dedicated to open-source LLMs and trustworthy AI convened at the International Open-Source LLM Builders Summit. Hosted by the AI centres of EPFL and ETH Zurich, the event marked a significant step in building a vibrant and collaborative international ecosystem for open foundation models. Open LLMs are increasingly viewed as credible alternatives to commercial systems, most of which are developed behind closed doors in the United States or China.

Participants of the summit previewed the forthcoming release of a fully open, publicly developed LLM — co-created by researchers at EPFL, ETH Zurich and other Swiss universities in close collaboration with engineers at CSCS. Currently in final testing, the model will be downloadable under an open license. The model focuses on transparency, multilingual performance, and broad accessibility.

The model will be fully open: source code and weights will be publicly available, and the training data will be transparent and reproducible, supporting adoption across science, government, education, and the private sector. This approach is designed to foster both innovation and accountability.

“Fully open models enable high-trust applications and are necessary for advancing research about the risks and opportunities of AI. Transparent processes also enable regulatory compliance,” says Imanol Schlag, research scientist at the ETH AI Center, who is leading the effort alongside EPFL AI Center faculty members and professors Antoine Bosselut and Martin Jaggi.

Multilingual by design

A distinctive characteristic of the LLM is its capability in over 1000 languages. “We have emphasised making the models massively multilingual from the start,” says Antoine Bosselut.

The base model was trained on a large text dataset in over 1,500 languages (approximately 60% English and 40% non-English), as well as code and mathematics data. Because content from so many languages and cultures is represented, the resulting model achieves the broadest possible global applicability.

Designed for scale and inclusion

The model will be released in two sizes—8 billion and 70 billion parameters, meeting a broad range of users’ needs. The 70B version will rank among the most powerful fully open models worldwide. The number of parameters reflects a model’s capacity to learn and generate complex responses.

High reliability is achieved through training on over 15 trillion high-quality training tokens (units representing a word or part of a word), enabling robust language understanding and versatile use cases.
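As a rough illustration of what these headline numbers imply (a back-of-the-envelope sketch, not figures from the article), the parameter count translates directly into the memory needed just to hold the weights, here assuming 16-bit (2-byte) parameters:

```python
# Back-of-the-envelope memory estimate for the two announced model sizes,
# assuming 16-bit weights (2 bytes per parameter). Illustrative only.

def weight_memory_gib(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory required to hold the weights alone, in GiB."""
    return n_params * bytes_per_param / 1024**3

for name, n_params in [("8B", 8e9), ("70B", 70e9)]:
    print(f"{name}: ~{weight_memory_gib(n_params):.0f} GiB for weights alone")

# The 15-trillion-token corpus also implies roughly
# 15e12 / 70e9 ≈ 214 training tokens per parameter for the larger model.
print(f"tokens per parameter (70B): ~{15e12 / 70e9:.0f}")
```

By this estimate the 70B model needs on the order of 130 GiB for its weights alone, which is why such models are typically served across multiple GPUs or in quantized form.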

Responsible data practices

The LLM is being developed with due consideration to Swiss data protection laws, Swiss copyright laws, and the transparency obligations under the EU AI Act. In a recent study, the project leaders demonstrated that for most everyday tasks and general knowledge acquisition, respecting web crawling opt-outs during data acquisition produces virtually no performance degradation.
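The opt-out mechanism referenced here is typically the robots.txt exclusion protocol. A minimal sketch of how a compliant crawler honors such opt-outs, using Python's standard `urllib.robotparser` (the crawler name `ExampleBot` and the robots.txt content are illustrative assumptions, not details from the project):

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt content; a real crawler would fetch this from
# https://example.org/robots.txt before requesting any page on the site.
robots_txt = """\
User-agent: ExampleBot
Disallow: /private/

User-agent: *
Allow: /
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# A compliant crawler skips any URL the site has opted out of.
print(rp.can_fetch("ExampleBot", "https://example.org/private/page.html"))  # False
print(rp.can_fetch("ExampleBot", "https://example.org/public/page.html"))   # True
```

The study's finding is that filtering out pages disallowed in this way removes relatively little of the data that matters for everyday tasks and general knowledge.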

Supercomputer as an enabler of sovereign AI

The model is trained on the “Alps” supercomputer at CSCS in Lugano, one of the world’s most advanced AI platforms, equipped with over 10,000 NVIDIA Grace Hopper Superchips. The system’s scale and architecture made it possible to train the model efficiently using 100% carbon-neutral electricity.

The successful realisation of “Alps” was significantly facilitated by a long-standing collaboration spanning over 15 years with NVIDIA and HPE/Cray. This partnership has been pivotal in shaping the capabilities of “Alps”, ensuring it meets the demanding requirements of large-scale AI workloads, including the pre-training of complex LLMs.

“Training this model is only possible because of our strategic investment in ‘Alps’, a supercomputer purpose-built for AI,” says Thomas Schulthess, Director of CSCS and professor at ETH Zurich. “Our enduring collaboration with NVIDIA and HPE exemplifies how joint efforts between public research institutions and industry leaders can drive sovereign infrastructure, fostering open innovation—not just for Switzerland, but for science and society worldwide.”

Public access and global reuse

In late summer, the LLM will be released under the Apache 2.0 License. Accompanying documentation will detail the model architecture, training methods, and usage guidelines to enable transparent reuse and further development.

“As scientists from public institutions, we aim to advance open models and enable organisations to build on them for their own applications,” says Antoine Bosselut.

“By embracing full openness—unlike commercial models that are developed behind closed doors—we hope that our approach will drive innovation in Switzerland, across Europe, and through multinational collaborations. Furthermore, it is a key factor in attracting and nurturing top talent,” says EPFL professor Martin Jaggi.

About the Swiss AI Initiative

Launched in December 2023 by EPFL and ETH Zurich, the Swiss AI Initiative is supported by more than 10 academic institutions across Switzerland. With over 800 researchers involved and access to over 20 million yearly GPU hours on CSCS’s supercomputer “Alps”, it stands as the world’s largest open science and open source effort dedicated to AI foundation models.

The Swiss AI Initiative is receiving financial support from the ETH Board — the strategic management and supervisory body of the ETH Domain (ETH, EPFL, PSI, WSL, Empa, Eawag) — for the period 2025 to 2028.

The Swiss AI Initiative is led by researchers from the ETH AI Center and the EPFL AI Centre, both of which serve as regional units of ELLIS (the European Laboratory for Learning and Intelligent Systems) — a pan-European AI network focused on fundamental research in trustworthy AI, technical innovation, and societal impact within Europe’s open societies.

About CSCS

The Swiss National Supercomputing Centre (CSCS) is a member and partner of the LUMI Consortium, granting Swiss scientists access to leading infrastructure in Kajaani, Finland. This aligns with CSCS’s strategy of scaling out future, significantly larger extreme-scale computing infrastructures through multinational collaborations, leveraging regions rich in hydroelectric power and cooling resources, and positioning AI research and innovation for global relevance and regional impact.
