OpenAI Cookbook, June 25, 15:08

 

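This article explores how to generate a high-quality synthetic medical records dataset, with a focus on deliberately introducing the kinds of errors that occur in real data. By simulating allergy contradictions, mismatches between medication and medical history, and mismatches between lab results and diagnosis, it aims to make the dataset more realistic and more useful. The article walks through the data generation process and describes how invalid rows are flagged and explained, so that users can identify and correct data errors.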

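🩺 Dataset construction: Each medical record contains the following fields: Patient ID, Date of Birth, Gender, Medical History, Current Medications, Allergies, Lab Results (Glucose mg/dL), Diagnoses, Treatment Plan, Is Valid, and Issue.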
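A minimal sketch of how one record could be represented in Python, assuming the eleven columns arrive in the order listed above. The `MedicalRecord` class, its attribute names, and the `_split_multi` helper are hypothetical names chosen for illustration. Treating `;` as the separator for multi-valued cells is an assumption based on the sample row for P006 in the prompt.

```python
from dataclasses import dataclass
from typing import List, Optional


def _split_multi(value: str) -> List[str]:
    """Split a ';'-separated cell into items; 'None' or an empty cell means no items."""
    items = [item.strip() for item in value.split(";")]
    return [item for item in items if item and item.lower() != "none"]


@dataclass
class MedicalRecord:
    patient_id: str
    date_of_birth: str  # kept as text, since invalid rows may hold impossible dates
    gender: str
    medical_history: List[str]
    current_medications: List[str]
    allergies: List[str]
    glucose_mg_dl: Optional[float]  # None when the cell is not numeric
    diagnoses: str
    treatment_plan: str
    is_valid: bool
    issue: str

    @classmethod
    def from_row(cls, row: List[str]) -> "MedicalRecord":
        """Build a record from one already-split CSV row (11 cells)."""
        if len(row) != 11:
            raise ValueError(f"expected 11 columns, got {len(row)}")
        try:
            glucose: Optional[float] = float(row[6])
        except ValueError:
            glucose = None
        return cls(
            patient_id=row[0].strip(),
            date_of_birth=row[1].strip(),
            gender=row[2].strip(),
            medical_history=_split_multi(row[3]),
            current_medications=_split_multi(row[4]),
            allergies=_split_multi(row[5]),
            glucose_mg_dl=glucose,
            diagnoses=row[7].strip(),
            treatment_plan=row[8].strip(),
            is_valid=row[9].strip().lower() == "true",
            issue=row[10].strip(),
        )


# Sample row P006 taken from the prompt's examples.
sample_row = (
    "P006,1978-12-05,M,Hypertension; Diabetes Type 2,Lisinopril; Insulin,"
    "None,55,Diabetes Type 2,Adjust insulin dosage,False,"
    "Low glucose level not properly addressed"
).split(",")
record = MedicalRecord.from_row(sample_row)
print(record.medical_history, record.is_valid)
```

The date of birth is deliberately left as a string: the dataset is meant to contain impossible dates, and parsing them eagerly would raise before the row could be inspected.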

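⚠️ Error types: The dataset simulates several kinds of errors that can occur in real-world records, including allergy contradictions, mismatches between medical history and medication, and mismatches between lab results and diagnosis. Examples include prescribing Penicillin to a patient who is allergic to Penicillin, or a diabetic patient not being prescribed any diabetes medication.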
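The three named error types can also be approximated with simple rules, which is handy for spot-checking generated rows. The sketch below is illustrative only: the lookup tables, the 100 mg/dL cutoff, and the `find_issues` function are assumptions made for this example, not clinical guidance and not part of the original article. The lab rule skips patients already on diabetes medication, because the prompt's own sample row P002 (glucose 90, on Metformin) is marked valid.

```python
from typing import Dict, List, Optional, Set

# Hypothetical lookup tables, kept tiny on purpose.
ALLERGY_CONFLICTS: Dict[str, Set[str]] = {"penicillin": {"penicillin", "amoxicillin"}}
CONDITION_MEDICATIONS: Dict[str, Set[str]] = {"diabetes type 2": {"metformin", "insulin"}}
NORMAL_GLUCOSE_UPPER_MG_DL = 100.0  # illustrative cutoff, an assumption


def _items(cell: str) -> Set[str]:
    """Lower-cased items of a ';'-separated cell; 'None' means no items."""
    parts = {part.strip().lower() for part in cell.split(";")}
    return {part for part in parts if part and part != "none"}


def find_issues(record: Dict[str, str]) -> List[str]:
    """Return rule-based findings for one row keyed by the CSV column names."""
    issues: List[str] = []
    meds = _items(record["Current Medications"])
    allergies = _items(record["Allergies"])
    history = _items(record["Medical History"])
    diagnoses = _items(record["Diagnoses"])

    # 1. Allergy contradiction: a medication that conflicts with a listed allergy.
    for allergen in sorted(allergies):
        conflicting = meds & ALLERGY_CONFLICTS.get(allergen, {allergen})
        for med in sorted(conflicting):
            issues.append(f"Prescribed {med} despite {allergen} allergy")

    # 2. History/medication mismatch: a known condition with none of its medications.
    for condition in sorted(history):
        expected = CONDITION_MEDICATIONS.get(condition)
        if expected and not (meds & expected):
            issues.append(f"No medication recorded for {condition}")

    # 3. Lab/diagnosis mismatch: untreated patient, normal glucose, diabetes diagnosis.
    try:
        glucose: Optional[float] = float(record["Lab Results (Glucose mg/dL)"])
    except ValueError:
        glucose = None
    on_diabetes_meds = bool(meds & CONDITION_MEDICATIONS["diabetes type 2"])
    if (
        "diabetes type 2" in diagnoses
        and glucose is not None
        and glucose < NORMAL_GLUCOSE_UPPER_MG_DL
        and not on_diabetes_meds
    ):
        issues.append("Diabetes Type 2 diagnosis not supported by glucose result")
    return issues


# Sample row P004 from the prompt's examples.
p004 = {
    "Medical History": "None",
    "Current Medications": "Amoxicillin",
    "Allergies": "Penicillin",
    "Lab Results (Glucose mg/dL)": "95",
    "Diagnoses": "Infection",
}
print(find_issues(p004))  # ['Prescribed amoxicillin despite penicillin allergy']
```

Rules like these cover only the listed categories. The prompt's fourth category, "Other Plausible Mistakes", and rows such as P006 (low glucose not addressed) would need additional checks.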

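🔍 Error identification and correction: For invalid rows, the 'Is Valid' field is set to False and the 'Issue' field explains what is wrong with the row, which helps users locate and correct errors quickly. Flagging errors this way improves the quality and reliability of the dataset.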
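As a rough illustration of how the 'Is Valid' and 'Issue' columns might be used, the sketch below splits CSV text into valid and invalid rows with the standard `csv` module. The `split_by_validity` function name is hypothetical, and the sample rows are copied from the examples in the prompt.

```python
import csv
import io
from typing import Dict, List, Tuple

Row = Dict[str, str]


def split_by_validity(csv_text: str) -> Tuple[List[Row], List[Row]]:
    """Return (valid_rows, invalid_rows) based on the 'Is Valid' column."""
    reader = csv.DictReader(io.StringIO(csv_text.strip()))
    valid: List[Row] = []
    invalid: List[Row] = []
    for row in reader:
        # Anything other than a literal "True" is treated as invalid.
        if (row.get("Is Valid") or "").strip().lower() == "true":
            valid.append(row)
        else:
            invalid.append(row)
    return valid, invalid


SAMPLE = """Patient ID,Date of Birth,Gender,Medical History,Current Medications,Allergies,Lab Results (Glucose mg/dL),Diagnoses,Treatment Plan,Is Valid,Issue
P001,1980-05-14,M,Hypertension,Lisinopril,None,110,Hypertension,Continue Lisinopril,True,
P004,2000-03-10,M,None,Amoxicillin,Penicillin,95,Infection,Prescribe Amoxicillin,False,Prescribed Amoxicillin despite Penicillin allergy
"""

valid_rows, invalid_rows = split_by_validity(SAMPLE)
for row in invalid_rows:
    print(f"{row['Patient ID']}: {row['Issue']}")
# P004: Prescribed Amoxicillin despite Penicillin allergy
```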

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
# MODEL must be set to the name of a chat model; its value is not shown in this excerpt.


def generate_data():
    # The whole instruction is sent as a single user message.
    messages = [
        {
            "role": "user",
            "content": """You are a helpful assistant designed to generate data. You will be given a format for the data to generate and some examples of the data.

When generating Patient IDs, use the format 'P' followed by a three-digit number (e.g., P006, P941, P319).

Intentionally make some mistakes in the data generation and document them in the appropriate columns ('Is Valid' and 'Issue') if the row of data is invalid.

The types of mistakes to include are:

- **Allergy Contradictions**: Prescribing a medication that the patient is allergic to (e.g., prescribing Penicillin to a patient allergic to Penicillin).
- **Medical History and Medication Mismatch**: A patient with a medical condition not receiving appropriate medication (e.g., a diabetic patient not prescribed any diabetes medication).
- **Lab Results and Diagnosis Mismatch**: Lab results that do not support the diagnosis (e.g., normal glucose levels but diagnosed with Diabetes Type 2).
- **Other Plausible Mistakes**: Any other realistic errors that could occur in medical records, such as incorrect gender entries, impossible dates of birth, or inconsistent treatment plans.

Ensure that when 'Is Valid' is 'False', the 'Issue' column clearly explains the problem.

Return 100 rows of data for the user. Your response should strictly be in the format of a valid CSV.

Generate Synthetic Medical Records Dataset with the following columns:
    - Patient ID: A randomly generated patient id
    - Date of Birth: Date of birth of the patient
    - Gender: M/F
    - Medical History: Past diagnoses
    - Current Medications: Medication the patient is taking
    - Allergies: Identified allergies
    - Lab Results (Glucose mg/dL)
    - Diagnoses: Current diagnosis
    - Treatment Plan: Current treatment plan
    - Is Valid: Whether or not the current row of data is valid (True/False)
    - Issue: If the row of data is not valid, what the issue is

Patient ID,Date of Birth,Gender,Medical History,Current Medications,Allergies,Lab Results (Glucose mg/dL),Diagnoses,Treatment Plan,Is Valid,Issue
P001,1980-05-14,M,Hypertension,Lisinopril,None,110,Hypertension,Continue Lisinopril,True,
P002,1975-11-30,F,Diabetes Type 2,Metformin,Penicillin,90,Diabetes Type 2,Continue Metformin,True,
P003,1990-07-22,F,Asthma,Albuterol,Aspirin,85,Asthma,Prescribe Albuterol,True,
P004,2000-03-10,M,None,Amoxicillin,Penicillin,95,Infection,Prescribe Amoxicillin,False,Prescribed Amoxicillin despite Penicillin allergy
P005,1985-09-18,F,Hyperlipidemia,Atorvastatin,None,200,Hyperlipidemia,Continue Atorvastatin,True,
P006,1978-12-05,M,Hypertension; Diabetes Type 2,Lisinopril; Insulin,None,55,Diabetes Type 2,Adjust insulin dosage,False,Low glucose level not properly addressed
            """
        }
    ]

    response = client.chat.completions.create(
        model=MODEL,
        messages=messages
    )

    # Strip Markdown code-fence markers the model may wrap around the CSV.
    return response.choices[0].message.content.replace('```csv', '').replace('```', '')
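Because the prompt asks for output that is "strictly" valid CSV with a fixed Patient ID format, it can be worth checking the returned text before using it. The following is a minimal sketch of such a check, not part of the original article. `check_csv_format` and its messages are hypothetical, and the rules encode only what the prompt states: eleven columns, IDs of the form 'P' plus three digits, and an 'Issue' whenever 'Is Valid' is False.

```python
import csv
import io
import re
from typing import List

EXPECTED_COLUMNS = [
    "Patient ID", "Date of Birth", "Gender", "Medical History",
    "Current Medications", "Allergies", "Lab Results (Glucose mg/dL)",
    "Diagnoses", "Treatment Plan", "Is Valid", "Issue",
]
PATIENT_ID_PATTERN = re.compile(r"P\d{3}")


def check_csv_format(csv_text: str) -> List[str]:
    """Return a list of format problems; an empty list means the text passed."""
    rows = list(csv.reader(io.StringIO(csv_text.strip())))
    if not rows or [cell.strip() for cell in rows[0]] != EXPECTED_COLUMNS:
        return ["header row does not match the expected columns"]

    problems: List[str] = []
    for line_no, row in enumerate(rows[1:], start=2):
        if not row:  # skip blank lines
            continue
        if len(row) != len(EXPECTED_COLUMNS):
            problems.append(f"line {line_no}: expected 11 fields, got {len(row)}")
            continue
        patient_id, is_valid, issue = row[0].strip(), row[9].strip(), row[10].strip()
        if not PATIENT_ID_PATTERN.fullmatch(patient_id):
            problems.append(f"line {line_no}: bad Patient ID {patient_id!r}")
        if is_valid not in ("True", "False"):
            problems.append(f"line {line_no}: 'Is Valid' must be True or False")
        elif is_valid == "False" and not issue:
            problems.append(f"line {line_no}: invalid row has no Issue text")
    return problems


HEADER = ",".join(EXPECTED_COLUMNS)
good = HEADER + "\nP001,1980-05-14,M,Hypertension,Lisinopril,None,110,Hypertension,Continue Lisinopril,True,\n"
bad = HEADER + "\nP12,1980-05-14,M,Hypertension,Lisinopril,None,110,Hypertension,Continue Lisinopril,False,\n"

print(check_csv_format(good))  # []
print(check_csv_format(bad))
```

One caveat: an 'Issue' text containing a comma only parses into a single field if the model quotes it, so a field-count failure may indicate unquoted free text and not a missing column.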
