NVIDIA Blog, February 20
Massive Foundation Model for Biomolecular Sciences Now Available via NVIDIA BioNeMo

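Evo 2 is a powerful new foundation model built on the NVIDIA DGX Cloud platform in a collaboration between Arc Institute and Stanford University, and it is currently the largest publicly available AI model for genomic data. It is open to developers worldwide through the NVIDIA BioNeMo platform, including as an NVIDIA NIM microservice for easy, secure AI deployment. Trained on nearly 9 trillion nucleotides, Evo 2 can be applied to biomolecular research, including predicting the structure and function of proteins, identifying novel molecules for healthcare and industrial use, and evaluating the effects of gene mutations. The model can process genetic sequences of up to 1 million tokens, offering new insights for fields such as healthcare, agriculture and materials science.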

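🧬 Evo 2 is currently the largest publicly available AI model for genomic data. Built on the NVIDIA DGX Cloud platform in a collaboration between Arc Institute and Stanford University, it gives developers worldwide powerful genome analysis capabilities.

🔬 Evo 2 was trained on nearly 9 trillion nucleotides and can be applied to a range of biomolecular research tasks, including predicting protein structure, identifying novel molecules and evaluating the effects of gene mutations.

🧑‍🔬 Evo 2 has significant potential in healthcare: it could help researchers understand which gene variants are tied to a specific disease and design novel molecules that target those regions to treat it. For example, in tests with BRCA1, a gene associated with breast cancer, Evo 2 could predict with 90% accuracy whether previously unrecognized mutations would affect gene function.

🌱 In agriculture, Evo 2 could help scientists develop crop varieties that are more climate-resilient or more nutrient-dense, helping to address global food shortages. It could also be applied to design biofuels or engineer proteins that break down oil or plastic.

🔭 Evo 2 uses a novel model architecture that can process genetic sequences of up to 1 million tokens, which could help scientists understand the connections between distant parts of an organism's genetic code as well as the mechanics of cell function, gene expression and disease.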

Scientists everywhere can now access Evo 2, a powerful new foundation model that understands the genetic code for all domains of life. Unveiled today as the largest publicly available AI model for genomic data, it was built on the NVIDIA DGX Cloud platform in a collaboration led by nonprofit biomedical research organization Arc Institute and Stanford University.

Evo 2 is available to global developers on the NVIDIA BioNeMo platform, including as an NVIDIA NIM microservice for easy, secure AI deployment.

Trained on an enormous dataset of nearly 9 trillion nucleotides — the building blocks of DNA and RNA — Evo 2 can be applied to biomolecular research applications including predicting the form and function of proteins based on their genetic sequence, identifying novel molecules for healthcare and industrial applications, and evaluating how gene mutations affect their function.

“Evo 2 represents a major milestone for generative genomics,” said Patrick Hsu, Arc Institute cofounder and core investigator, and an assistant professor of bioengineering at the University of California, Berkeley. “By advancing our understanding of these fundamental building blocks of life, we can pursue solutions in healthcare and environmental science that are unimaginable today.”

The NVIDIA NIM microservice for Evo 2 enables users to generate a variety of biological sequences, with settings to adjust model parameters. Developers interested in fine-tuning Evo 2 on their proprietary datasets can download the model through the open-source NVIDIA BioNeMo Framework, a collection of accelerated computing tools for biomolecular research.

“Designing new biology has traditionally been a laborious, unpredictable and artisanal process,” said Brian Hie, assistant professor of chemical engineering at Stanford University, the Dieter Schwarz Foundation Stanford Data Science Faculty Fellow and an Arc Institute innovation investigator. “With Evo 2, we make biological design of complex systems more accessible to researchers, enabling the creation of new and beneficial advances in a fraction of the time it would previously have taken.”

Enabling Complex Scientific Research

Established in 2021 with $650 million from its founding donors, Arc Institute gives scientists multiyear funding so they can tackle long-term scientific challenges and focus on innovative research instead of grant writing.

Its core investigators receive state-of-the-art lab space and funding for eight-year, renewable terms that can be held concurrently with faculty appointments at one of the institute’s university partners, which include Stanford University, the University of California, Berkeley, and the University of California, San Francisco.

By combining this unique research environment with accelerated computing expertise and resources from NVIDIA, Arc Institute’s researchers can pursue more complex projects, analyze larger datasets and achieve results more quickly. Its scientists are focused on disease areas including cancer, immune dysfunction and neurodegeneration.

NVIDIA accelerated the Evo 2 project by giving scientists access to 2,000 NVIDIA H100 GPUs via NVIDIA DGX Cloud on AWS. DGX Cloud provides short-term access to large compute clusters, giving researchers the flexibility to innovate. The fully managed AI platform includes NVIDIA BioNeMo, which features optimized software in the form of NVIDIA NIM microservices and NVIDIA BioNeMo Blueprints.

NVIDIA researchers and engineers also collaborated closely on AI scaling and optimization.

Applications Across Biomolecular Sciences 

Evo 2 can provide insights into DNA, RNA and proteins. Trained on a wide array of species across domains of life — including plants, animals and bacteria — the model can be applied to scientific fields such as healthcare, agricultural biotechnology and materials science.

Evo 2 uses a novel model architecture that can process lengthy sequences of genetic information, up to 1 million tokens. This widened view into the genome could unlock scientists’ understanding of the connection between distant parts of an organism’s genetic code and the mechanics of cell function, gene expression and disease.

“A single human gene contains thousands of nucleotides — so for an AI model to analyze how such complex biological systems work, it needs to process the largest possible portion of a genetic sequence at once,” said Hsu.
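To make the 1-million-token figure concrete, the sketch below checks whether a DNA sequence fits in a single context window and, if not, splits it into overlapping windows. It is only an illustration: it assumes one token per nucleotide, which the article does not state, and the names `MAX_CONTEXT_TOKENS`, `fits_in_context` and `split_into_windows` are hypothetical helpers, not part of Evo 2 or NVIDIA BioNeMo.

```python
from typing import Iterator, Tuple

# Context length quoted in the article. Treating one nucleotide as one
# token is an assumption made for this sketch only.
MAX_CONTEXT_TOKENS = 1_000_000


def fits_in_context(sequence: str, max_tokens: int = MAX_CONTEXT_TOKENS) -> bool:
    """Return True if the whole sequence fits in one context window."""
    return len(sequence) <= max_tokens


def split_into_windows(
    sequence: str,
    max_tokens: int = MAX_CONTEXT_TOKENS,
    overlap: int = 0,
) -> Iterator[Tuple[int, str]]:
    """Yield (start_offset, chunk) pairs, each chunk at most max_tokens long.

    Consecutive chunks share `overlap` nucleotides so that features near a
    window boundary appear intact in at least one chunk.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must satisfy 0 <= overlap < max_tokens")

    step = max_tokens - overlap
    start = 0
    while start < len(sequence):
        yield start, sequence[start:start + max_tokens]
        if start + max_tokens >= len(sequence):
            break  # this chunk already reaches the end of the sequence
        start += step
```

For example, `list(split_into_windows("ACGT" * 3, max_tokens=5, overlap=1))` yields three chunks starting at offsets 0, 4 and 8. Under the one-token-per-nucleotide assumption, a gene of a few thousand nucleotides fits in a single window with plenty of room left for its surrounding context.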

In healthcare and drug discovery, Evo 2 could help researchers understand which gene variants are tied to a specific disease — and design novel molecules that precisely target those areas to treat the disease. For example, researchers from Stanford and the Arc Institute found that in tests with BRCA1, a gene associated with breast cancer, Evo 2 could predict with 90% accuracy whether previously unrecognized mutations would affect gene function.

In agriculture, the model could help tackle global food shortages by providing insights into plant biology and helping scientists develop varieties of crops that are more climate-resilient or more nutrient-dense. And in other scientific fields, Evo 2 could be applied to design biofuels or engineer proteins that break down oil or plastic.

“Deploying a model like Evo 2 is like sending a powerful new telescope out to the farthest reaches of the universe,” said Dave Burke, Arc’s chief technology officer. “We know there’s immense opportunity for exploration, but we don’t yet know what we’re going to discover.”

Read more about Evo 2 in Arc’s technical report.

