TechCrunch News, November 20, 2024
Converge Bio’s ‘everything store’ for biotech LLMs brings in $5.5M seed

 

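Converge Bio has built a tool that helps biotech and pharmaceutical companies use large language models (LLMs) for biological research. Through data enrichment, model fine-tuning, and explainability, the tool applies LLMs to areas such as antibody research and aims to make drug development more efficient. Converge Bio has raised $5.5 million in seed funding, which it will use to scale its product and build its team, and it plans to publish a scientific paper on antibody design. The company wants to become a "one-stop shop" for generative AI in biotech and, by building trusted relationships, to expand into more use cases such as vaccine design.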

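🤔**Data enrichment:** Converge Bio's tool can enrich the data behind an antibody LLM, for example by adding related information such as antigen-antibody and protein-protein interactions, making the model better suited to a specific research area.

🧬**Model fine-tuning:** Using a company's own proprietary data, the enriched LLM is fine-tuned to focus on a specific antigen target, for example to predict antibody binding affinity.

💡**Explainability:** The tool provides explainability features that help researchers understand the model's output, for example by identifying which amino acids or base pairs lead to better results, so that domain experts can interpret and apply them.

🚀**Generating new sequences:** Building on the model's predictions and explainability, the tool generates new sequences with improved outcomes, again with explanations, to speed up drug development.

💰**Funding and plans:** Converge Bio has raised $5.5 million in seed funding, which it will use to grow its team, acquire customers, and develop a more capable foundation model. It also plans to publish a scientific paper on antibody design.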

AI is finding its way into every corner of biotech and pharmaceutical research, but as in other industries, it’s never quite as straightforward to implement as one would like. Converge Bio has built a tool for companies to make their biology-focused LLMs actually work, from “enriching” their data to explaining their answers. The company has raised $5.5 million in a seed round to scale its product.

“A model is just a model. It’s not enough,” said CEO and co-founder Dov Gertz. “A pipeline has to be made so companies can actually use the model in their own R&D process. The market is very fragmented, but pharma and biotech want to consume this technology in a consolidated way, in one place. We want to be that place.”

If you’re not a machine learning engineer working in drug discovery, this may not be a familiar problem to you. But basically, there are powerful foundational models out there, large language models trained not on books and the internet but on huge databases of DNA, protein structures, and genomics.

These are powerful and versatile models, but like the LLMs used in products like ChatGPT and Cursor, they require a lot of work to hammer into a shape that people can actually use day to day. That work is especially difficult in specialized domains like microbiology or immunology. Taking a “raw” LLM trained on billions of protein sequences and making it something a lab tech can use as part of their normal research is a non-trivial problem.

As an example, Gertz suggested antibody research. An LLM trained on antibody-specific biology exists, but it’s very general. Converge Bio offers a series of improvements that can be done securely and using a company’s own IP.

From left: Converge Bio’s Iddo Weiner, Chief Scientific Officer; Dov Gertz, CEO; Oded Kalev, CTO. Image Credits: Omer Hacohen / Converge Bio

First is “data enrichment,” augmenting the antibody LLM with important related data like antigen-antibody and protein-protein interactions. Then, loaded with more specific knowledge, it can be fine-tuned on the specific antigen the team is looking to target, and which they may have proprietary in-dish data on.

“Now we have an application: The input is a sequence, the output is binding affinity,” Gertz said. Then the platform provides another important layer: explainability. Researchers can drill down on the output to find out not just that “this sequence works better than this” but also which part of the sequence, down to the amino acid or base pair level, seems to be making it work better.

Lastly, it generates new sequences that provide improved outcomes, likewise with explainability. Gertz noted that the explainability has surprised them with its popularity among customers — makes sense, since it allows experts to apply their domain expertise (say, protein interactions) to this newer and more obscure region of bioinformatics and machine learning.
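For readers who want a concrete picture of what residue-level explainability can look like, here is a minimal sketch in Python. It is not Converge Bio's method, which the company describes as proprietary. It uses a generic technique, an in-silico alanine scan, where each position is swapped for alanine and the change in the predicted score is recorded. The scoring function `toy_affinity` and the helper names are hypothetical stand-ins for a real fine-tuned model.

```python
from typing import Callable, List, Tuple

Attribution = Tuple[int, str, float]  # (position, residue, score drop)


def toy_affinity(seq: str) -> float:
    """Hypothetical stand-in for a fine-tuned model.

    It simply counts aromatic residues (F, W, Y); a real model would
    return a predicted binding affinity for the sequence.
    """
    return float(sum(seq.count(aa) for aa in "FWY"))


def residue_attributions(
    seq: str,
    score_fn: Callable[[str], float],
    substitute: str = "A",
) -> List[Attribution]:
    """Replace each residue with `substitute` and record the score drop."""
    baseline = score_fn(seq)
    results: List[Attribution] = []
    for i, residue in enumerate(seq):
        mutated = seq[:i] + substitute + seq[i + 1:]
        results.append((i, residue, baseline - score_fn(mutated)))
    return results


def top_residues(
    seq: str,
    score_fn: Callable[[str], float],
    k: int = 3,
) -> List[Attribution]:
    """Return the k positions whose substitution lowers the score most."""
    ranked = sorted(
        residue_attributions(seq, score_fn),
        key=lambda item: item[2],
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    for position, residue, drop in top_residues("GWAYK", toy_affinity, k=2):
        print(f"position {position} ({residue}): score drop {drop}")
```

With the toy scorer, the tryptophan and tyrosine positions rank highest because removing either one lowers the score. With a real model the same loop would point a domain expert at the residues the model is most sensitive to.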


Converge uses the many open source and free foundation models out there, but is also working on making its own. It already has a proprietary process, Gertz said, for the explainability part. And the data enrichment “curriculum” is entirely theirs as well — not a trivial process. Training methodologies, he pointed out, are one of a few closely guarded secrets by the most successful AI companies.

That’s part of the moat they’re hoping to build, along with the fact that they are actively pursuing niches that generalist solutions don’t apply to. As Gertz put it, “This is probably the biggest opportunity in biotech in five decades.”

Yet many, perhaps most, biotech companies don’t have a dedicated solution for doing LLM-related work in their field.

“The idea is to be the everything store for genAI in biotech, then use that as a wedge to offer more over time,” Gertz said. “The behavior in pharma and bio is, once they have ties to a vendor that they trust, they want to use them in other use cases, be it antibody design or vaccine design. That’s why I think this positioning is best for this moment in the market.”

Investors seem to agree, putting $5.5 million into a seed round led by TLV Partners.

The company will be using the money to hire up and acquire customers, as startups often do at this stage, but will also be publishing a scientific paper on antibody design (using its own systems, of course) and training “a proper foundation model.”
