cs.AI updates on arXiv.org 13小时前
TRIALSCOPE: A Unifying Causal Framework for Scaling Real-World Evidence Generation with Biomedical Language Models
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

本文介绍了一种名为TRIALSCOPE的框架,用于从大规模观察性数据中提取高质量真实世界证据(RWE)。该框架利用生物医学语言模型对临床文本进行结构化,采用先进的概率模型进行去噪和插补,并引入因果推断技术以解决治疗效应估计中的混杂因素。实验结果表明,TRIALSCOPE能够自动整理高质量结构化患者数据,减少混杂因素,并生成与随机对照试验相当的结果。

arXiv:2311.01301v3 Announce Type: replace-cross Abstract: The rapid digitization of real-world data presents an unprecedented opportunity to optimize healthcare delivery and accelerate biomedical discovery. However, these data are often found in unstructured forms such as clinical notes in electronic medical records (EMRs), and is typically plagued by confounders, making it challenging to generate robust real-world evidence (RWE). Therefore, we present TRIALSCOPE, a framework designed to distil RWE from population level observational data at scale. TRIALSCOPE leverages biomedical language models to structure clinical text at scale, employs advanced probabilistic modeling for denoising and imputation, and incorporates state-of-the-art causal inference techniques to address common confounders in treatment effect estimation. Extensive experiments were conducted on a large-scale dataset of over one million cancer patients from a single large healthcare network in the United States. TRIALSCOPE was shown to automatically curate high-quality structured patient data, expanding the dataset and incorporating key patient attributes only available in unstructured form. The framework reduces confounding in treatment effect estimation, generating comparable results to randomized controlled lung cancer trials. Additionally, we demonstrate simulations of unconducted clinical trials - including a pancreatic cancer trial with varying eligibility criteria - using a suite of validation tests to ensure robustness. Thorough ablation studies were conducted to better understand key components of TRIALSCOPE and establish best practices for RWE generation from EMRs. TRIALSCOPE was able to extract data cancer treatment data from EMRs, overcoming limitations of manual curation. We were also able to show that TRIALSCOPE could reproduce results of lung and pancreatic cancer clinical trials from the extracted real world data.

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

TRIALSCOPE 真实世界证据 EMR 生物医学语言模型 因果推断
相关文章