MarkTechPost@AI, December 23, 2024
Viro3D: A Comprehensive Resource of Predicted Viral Protein Structures Unveils Evolutionary Insights and Functional Annotations

 

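The study used machine learning to predict more than 170,000 protein structures from 4,400 human and animal viruses, increasing structural coverage of viral proteins 30-fold. The researchers built the Viro3D database so that users can access and explore these structures and gain deeper insight into viral evolutionary relationships and function. Structural analysis revealed the origins and functions of key viral proteins, including the coronavirus spike protein, offering new perspectives for antiviral drug and vaccine development. The study substantially expands the available viral structural information and provides a valuable resource for virology research.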

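🧬 Viral structure data greatly expanded: Machine learning was used to predict a large number of viral protein structures, raising structural coverage 30-fold and filling a long-standing gap in structural data for virology research.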

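🔬 Viro3D database released: To support virology research, the researchers developed the Viro3D database, where users can search, browse, and download viral protein models and explore structural similarities across virus species.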

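🦠 Evolutionary relationships and function: The study analyzed the evolutionary relationships of viral proteins in depth, especially class-I membrane fusion glycoproteins, including the origins of the coronavirus spike protein, providing important clues for understanding viral transmission and pathogenesis.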

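💡 Structural analysis reveals viral characteristics: Structural clustering reduced viral diversity to about 19,000 distinct structures, many of which lie near the ends of viral genomes, suggesting evolutionary hotspots. The study also found that viral proteins often lack homologs in cellular organisms, indicating that they have undergone extensive remodeling.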

Viruses infect organisms across all domains of life. They play key roles in ecological processes such as ocean biogeochemical cycles and the regulation of microbial populations, and they cause diseases in humans, animals, and plants. Viruses are Earth’s most abundant biological entities, characterized by rapid evolution, high mutation rates, and frequent genetic exchange with hosts and other viruses. This constant genetic flux produces highly diverse genomes with mosaic architectures, which complicates functional annotation, evolutionary analysis, and taxonomic classification. Viruses have likely emerged multiple times throughout history, and some lineages predate the last universal common ancestor (LUCA), pointing to a longstanding co-evolutionary relationship between viruses and cellular organisms.

Protein structures are more conserved than sequences, so they offer a reliable means to study evolutionary relationships and infer gene functions in viruses. However, viral protein structures are significantly underrepresented in public databases: experimental viral protein structures make up less than 10% of the Protein Data Bank (PDB). Recent advances in machine learning, such as AlphaFold2 and ESMFold, have enabled accurate protein structure prediction at scale. Using these tools, researchers have generated a comprehensive dataset of predicted structures for 85,000 proteins from 4,400 human and animal viruses, significantly expanding structural coverage. These efforts address the historical gap in viral protein representation, facilitate functional annotation and phylogenetic analysis, and shed light on the evolutionary history of critical viral proteins such as class-I fusion glycoproteins.

Researchers from the MRC-University of Glasgow Centre for Virus Research and the University of Tokyo generated 170,000 predicted protein structures from 4,400 animal viruses using ColabFold and ESMFold. They evaluated model quality, performed structural analyses, and explored deep phylogenetic relationships, particularly focusing on class-I membrane fusion glycoproteins, including the origins of coronavirus spike proteins. To support the virology community, they developed Viro3D, an accessible database where users can search, browse, and download viral protein models and explore structural similarities across virus species. This resource aims to advance molecular virology, virus evolution studies, and the design of therapies and vaccines.
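As a rough illustration of what evaluating model quality can look like for a downloaded model, the sketch below averages per-residue confidence (pLDDT) over the C-alpha atoms of a PDB-format file. It assumes the model is in PDB format with pLDDT stored in the B-factor column, which is the usual convention for AlphaFold-style predictors. The article does not describe the researchers' actual evaluation procedure, and the function names, the toy model, and the threshold of 70 (a commonly used cutoff for a "confident" prediction) are illustrative assumptions rather than anything taken from the study.

```python
def mean_plddt(pdb_text: str) -> float:
    """Average the B-factor column over C-alpha atoms of a PDB-format model.

    Assumption: the B-factor column holds pLDDT on a 0-100 scale.
    """
    scores = []
    for line in pdb_text.splitlines():
        # Fixed-width PDB columns: atom name in 13-16, B-factor in 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    if not scores:
        raise ValueError("no C-alpha atoms found")
    return sum(scores) / len(scores)


def is_confident(pdb_text: str, threshold: float = 70.0) -> bool:
    """Hypothetical quality filter; the threshold is a convention, not from the study."""
    return mean_plddt(pdb_text) >= threshold


def toy_atom_line(serial: int, atom_name: str, res_seq: int, b_factor: float) -> str:
    """Build one fixed-width ATOM record for a made-up alanine residue."""
    return (
        f"ATOM  {serial:>5} {atom_name:<4} ALA A{res_seq:>4}    "
        f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{b_factor:6.2f}"
    )


# Toy model: only the two CA atoms (scores 90 and 70) count toward the mean.
toy_model = "\n".join([
    toy_atom_line(1, "N", 1, 10.0),
    toy_atom_line(2, "CA", 1, 90.0),
    toy_atom_line(3, "CA", 2, 70.0),
])
print(mean_plddt(toy_model))  # 80.0
```

A per-model mean is a coarse summary; low-confidence regions inside an otherwise confident model would need a per-residue view.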

The study utilized 6,721 GenBank nucleotide accession numbers, covering 4,407 virus isolates and 3,106 species with host annotations, to extract 71,269 viral protein records. Additional annotations included 4,070 mature peptides, 11,786 protein regions, and 253 polyproteins. Protein structures were predicted using ColabFold and ESMFold, with structural coverage evaluated against the PDB. Proteins were clustered based on sequence and structural similarity, forming 19,067 structural clusters. Functional annotations were expanded using sequence-based and structural networks. A structural similarity map of viral species was created, and comparisons were made with other viral structure databases, highlighting the dataset’s comprehensiveness and structural insights.
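The clustering step described above can be sketched in a few lines. The article does not name the clustering tools, similarity scores, or thresholds that were used, so the example below substitutes a generic single-linkage grouping (union-find) over pairs of proteins whose sequence or structural similarity exceeds a cutoff. The protein names, scores, and the 0.5 cutoff are all made up for illustration.

```python
def cluster_proteins(proteins, scored_pairs, cutoff=0.5):
    """Group proteins connected by any pair scoring at or above the cutoff.

    scored_pairs: iterable of (protein_a, protein_b, similarity) tuples,
    which may mix sequence-based and structure-based comparisons.
    """
    parent = {p: p for p in proteins}

    def find(p):
        # Follow parent links to the cluster representative (with path halving).
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b, score in scored_pairs:
        if score >= cutoff:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a  # merge the two clusters

    clusters = {}
    for p in proteins:
        clusters.setdefault(find(p), []).append(p)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical inputs: A-B are sequence-similar, B-C are structure-similar,
# and D-E fall below the cutoff, so they stay in separate clusters.
proteins = ["A", "B", "C", "D", "E"]
scored_pairs = [("A", "B", 0.9), ("B", "C", 0.8), ("D", "E", 0.3)]
print(cluster_proteins(proteins, scored_pairs))
# [['A', 'B', 'C'], ['D'], ['E']]
```

Here A and C end up together even though they were never compared directly, which is how structural links can join proteins whose sequences have diverged too far to match.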

The study introduced Viro3D, a database of more than 170,000 predicted 3D protein structures from 4,400 animal viruses. Using ColabFold and ESMFold, the researchers achieved a 30-fold increase in structural coverage compared with experimentally determined structures. The dataset yielded functional and evolutionary insights, including the evolutionary origins of coronavirus spike proteins, and structural analyses and protein-protein interaction networks supported the functional annotations. Viro3D’s predictions showed high reliability when benchmarked against experimentally solved viral structures. The database is a resource for studying viral evolution, protein function, and structural mechanisms, with potential applications in antiviral drug and vaccine development.

In conclusion, the study expanded viral protein structural coverage 30-fold by modeling 85,000 proteins from 4,400 human and animal viruses, with 64% of models being highly confident. Combining ColabFold and ESMFold enhanced efficiency, accuracy, and speed. Structural clustering reduced viral diversity to 19,000 distinct structures, 65% of them unique to this dataset, and many were found near the ends of viral genomes, suggesting evolutionary hotspots. The analysis revealed that viral proteins often lack homologs in cellular organisms, indicating extensive remodeling. The study also traced the evolution of class-I fusion glycoproteins, highlighting their role in virus transmission and pathogenesis and offering valuable insights for virology research.
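To put the summary percentages in absolute terms, the short calculation below converts them into approximate counts. It assumes the 64% figure applies to the 85,000 modeled proteins and the 65% figure to the rounded 19,000 clusters. The last line is only one possible reading of how the 85,000 and 170,000 figures quoted in the article relate (each protein modeled once with each of the two methods), which the article does not state explicitly.

```python
# Figures quoted in the article (rounded as reported there).
proteins_modeled = 85_000
structural_clusters = 19_000

# Integer arithmetic avoids floating-point rounding in the percentages.
high_confidence_models = proteins_modeled * 64 // 100
clusters_unique_to_dataset = structural_clusters * 65 // 100

print(high_confidence_models)      # 54400
print(clusters_unique_to_dataset)  # 12350

# Assumption, not stated in the article: two prediction methods per protein
# would account for the 170,000 structures mentioned elsewhere.
structures_if_two_methods = 2 * proteins_modeled
print(structures_if_two_methods)   # 170000
```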


Check out the Paper. All credit for this research goes to the researchers of this project.


The post Viro3D: A Comprehensive Resource of Predicted Viral Protein Structures Unveils Evolutionary Insights and Functional Annotations appeared first on MarkTechPost.
