MarkTechPost@AI, July 29, 04:38
Creating a Knowledge Graph Using an LLM

 

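This tutorial shows how to use a large language model (LLM) such as GPT-4o-mini, together with Python and the Mirascope library, to extract information from unstructured medical logs and build a structured knowledge graph. With a clearly defined graph schema (nodes and edges), the LLM can accurately identify entities and their relationships and turn raw text into data that can be queried and visualized. The approach is particularly suited to messy, unstructured data, where it can extract information more precisely and with more awareness of context. Finally, a visualization shows the patient's condition and the links between the various indicators, which can support medical decision-making.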

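💡 **LLM-powered knowledge graph construction**: Large language models such as GPT-4o-mini can extract entities and relationships from unstructured text (such as medical logs) efficiently and accurately to build a knowledge graph. This overcomes limitations of traditional NLP methods, particularly on messy data.

📊 **Defining a clear graph schema**: The structure of the knowledge graph is defined with the Pydantic library, with nodes (Node) representing entities and their properties and edges (Edge) representing the relationships between entities. This structured approach lays the foundation for information extraction and lets the LLM map text according to preset rules.

📝 **Extracting information from unstructured logs**: Using a patient log written in natural language as the example, the tutorial shows how text containing symptoms, events, and observations is fed to the LLM, which follows the preset prompt and schema to identify the key information and convert it into graph nodes and edges.

❓ **Querying and analyzing the structured data**: Once the knowledge graph is built, the LLM can answer natural-language questions from the structured data about the patient's health, behavior patterns, and potential risks, for example "What health risks or concerns has Mary shown recently?"

📈 **Visualization speeds up understanding**: The generated knowledge graph is visualized with the matplotlib and networkx libraries, presenting the patient's indicators, symptoms, events, and their relationships as an intuitive diagram that greatly improves understanding of the patient's overall health.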

In this tutorial, we’ll show how to create a Knowledge Graph from an unstructured document using an LLM. While traditional NLP methods have been used for extracting entities and relationships, Large Language Models (LLMs) like GPT-4o-mini make this process more accurate and context-aware. LLMs are especially useful when working with messy, unstructured data. Using Python, Mirascope, and OpenAI’s GPT-4o-mini, we’ll build a simple knowledge graph from a sample medical log.

Installing the dependencies

!pip install "mirascope[openai]" matplotlib networkx 

OpenAI API Key

To get an OpenAI API key, visit https://platform.openai.com/settings/organization/api-keys and generate a new key. If you’re a new user, you may need to add billing details and make a minimum payment of $5 to activate API access.

import os
from getpass import getpass

os.environ["OPENAI_API_KEY"] = getpass('Enter OpenAI API Key: ')
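When the notebook is re-run, prompting for the key every time can get tedious. The sketch below is one possible variation, not part of the original tutorial: `ensure_api_key` is a hypothetical helper that reads the key from the environment and only falls back to a prompt when the variable is missing. The `prompt` parameter is an assumption added so the function can be exercised without typing anything.

```python
import os
from getpass import getpass


def ensure_api_key(env_var: str = "OPENAI_API_KEY", prompt=getpass) -> str:
    """Return the key from the environment, prompting only if it is missing.

    Hypothetical helper for illustration; not part of Mirascope or OpenAI.
    """
    key = os.environ.get(env_var)
    if not key:
        # getpass hides the typed input; the value is cached for this process.
        key = prompt(f"Enter {env_var}: ")
        os.environ[env_var] = key
    return key
```

Calling `ensure_api_key()` once near the top of the notebook would then leave `OPENAI_API_KEY` set for the later cells.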

Defining Graph Schema

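Before we extract information, we need a structure to represent it. In this step, we define a simple schema for our Knowledge Graph using Pydantic. The schema includes three models: Node, which represents an entity with an id, a type, and optional properties; Edge, which represents a relationship between a source node and a target node; and KnowledgeGraph, which holds the lists of nodes and edges.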

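from pydantic import BaseModel


class Edge(BaseModel):
    # A directed relationship between two nodes, referenced by node id.
    source: str
    target: str
    relationship: str


class Node(BaseModel):
    # An entity such as a person, symptom, event, or observation.
    id: str
    type: str
    properties: dict | None = None


class KnowledgeGraph(BaseModel):
    # The full graph: every extracted node and edge.
    nodes: list[Node]
    edges: list[Edge]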

Defining the Patient Log

Now that we have a schema, let’s define the unstructured data we’ll use to generate our Knowledge Graph. Below is a sample patient log, written in natural language. It contains key events, symptoms, and observations related to a patient named Mary.

patient_log = """
Mary called for help at 3:45 AM, reporting that she had fallen while going to the bathroom. This marks the second fall incident within a week. She complained of dizziness before the fall.

Earlier in the day, Mary was observed wandering the hallway and appeared confused when asked basic questions. She was unable to recall the names of her medications and asked the same question multiple times.

Mary skipped both lunch and dinner, stating she didn't feel hungry. When the nurse checked her room in the evening, Mary was lying in bed with mild bruising on her left arm and complained of hip pain.

Vital signs taken at 9:00 PM showed slightly elevated blood pressure and a low-grade fever (99.8°F). Nurse also noted increased forgetfulness and possible signs of dehydration.

This behavior is similar to previous episodes reported last month.
"""

Generating the Knowledge Graph

To transform unstructured patient logs into structured insights, we use an LLM-powered function that extracts a Knowledge Graph. Each patient entry is analyzed to identify entities (like people, symptoms, events) and their relationships (such as “reported”, “has symptom”).

The generate_kg function is decorated with @openai.call, leveraging the GPT-4o-mini model and the previously defined KnowledgeGraph schema. The prompt clearly instructs the model on how to map the log into nodes and edges.

from mirascope.core import openai, prompt_template


@openai.call(model="gpt-4o-mini", response_model=KnowledgeGraph)
@prompt_template(
    """
    SYSTEM:
    Extract a knowledge graph from this patient log.
    Use Nodes to represent people, symptoms, events, and observations.
    Use Edges to represent relationships like "has symptom", "reported", "noted", etc.

    The log:
    {log_text}

    Example:
    Mary said help, I've fallen.
    Node(id="Mary", type="Patient", properties={{}})
    Node(id="Fall Incident 1", type="Event", properties={{"time": "3:45 AM"}})
    Edge(source="Mary", target="Fall Incident 1", relationship="reported")
    """
)
def generate_kg(log_text: str) -> openai.OpenAIDynamicConfig:
    return {"log_text": log_text}


kg = generate_kg(patient_log)
print(kg)
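Because the graph comes from a model, an edge may occasionally point at an id that never appears in the node list. The following is a minimal sketch of a consistency check, not something the original tutorial includes. `DemoNode`, `DemoEdge`, and `find_dangling_edges` are hypothetical names; the dataclasses are stand-ins for the Pydantic models so the example runs on its own.

```python
from dataclasses import dataclass


@dataclass
class DemoNode:  # stand-in for the tutorial's Pydantic Node
    id: str
    type: str


@dataclass
class DemoEdge:  # stand-in for the tutorial's Pydantic Edge
    source: str
    target: str
    relationship: str


def find_dangling_edges(nodes, edges):
    """Return edges whose source or target is not a known node id."""
    known_ids = {node.id for node in nodes}
    return [
        edge for edge in edges
        if edge.source not in known_ids or edge.target not in known_ids
    ]


demo_nodes = [DemoNode("Mary", "Patient"), DemoNode("Fall Incident 1", "Event")]
demo_edges = [
    DemoEdge("Mary", "Fall Incident 1", "reported"),
    DemoEdge("Mary", "Dizziness", "has symptom"),  # "Dizziness" has no node
]
dangling = find_dangling_edges(demo_nodes, demo_edges)
print([edge.target for edge in dangling])  # ['Dizziness']
```

The function only reads `.id`, `.source`, and `.target`, so it should also work when called as `find_dangling_edges(kg.nodes, kg.edges)` on the extracted graph.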

Querying the graph

Once the KnowledgeGraph has been generated from the unstructured patient log, we can use it to answer medical or behavioral queries. We define a function run() that takes a natural language question and the structured graph, and passes them into a prompt for the LLM to interpret and respond.

@openai.call(model="gpt-4o-mini")
@prompt_template(
    """
    SYSTEM:
    Use the knowledge graph to answer the user's question.

    Graph:
    {knowledge_graph}

    USER:
    {question}
    """
)
def run(question: str, knowledge_graph: KnowledgeGraph): ...


question = "What health risks or concerns does Mary exhibit based on her recent behavior and vitals?"
print(run(question, kg))
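The prompt above interpolates the graph object directly, so the model sees whatever string form that object has. As an optional alternative that the original does not use, the graph could first be flattened into short, line-oriented text that is easier to inspect by eye. The sketch below assumes nodes and edges are available as plain dicts (for a Pydantic v2 model, `kg.model_dump()` produces that shape); `graph_to_text` is a hypothetical helper.

```python
def graph_to_text(nodes: list[dict], edges: list[dict]) -> str:
    """Render nodes and edges as compact, line-oriented text for a prompt."""
    node_lines = [f"{n['id']} ({n['type']})" for n in nodes]
    edge_lines = [
        f"{e['source']} -[{e['relationship']}]-> {e['target']}" for e in edges
    ]
    return "\n".join(["Nodes:", *node_lines, "Edges:", *edge_lines])


sample_nodes = [
    {"id": "Mary", "type": "Patient"},
    {"id": "Fall Incident 1", "type": "Event"},
]
sample_edges = [
    {"source": "Mary", "target": "Fall Incident 1", "relationship": "reported"},
]
graph_text = graph_to_text(sample_nodes, sample_edges)
print(graph_text)
```

Whether this form actually improves the answers is untested here; it mainly makes the prompt contents predictable and readable.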

Visualizing the Graph

Finally, we use render_graph(kg) to draw the knowledge graph with networkx and matplotlib. The resulting diagram helps us better understand the patient’s condition and the connections between observed symptoms, behaviors, and medical concerns.

import matplotlib.pyplot as plt
import networkx as nx


def render_graph(kg: KnowledgeGraph):
    G = nx.DiGraph()

    for node in kg.nodes:
        G.add_node(node.id, label=node.type, **(node.properties or {}))

    for edge in kg.edges:
        G.add_edge(edge.source, edge.target, label=edge.relationship)

    plt.figure(figsize=(15, 10))
    pos = nx.spring_layout(G)
    nx.draw_networkx_nodes(G, pos, node_size=2000, node_color="lightgreen")
    nx.draw_networkx_edges(G, pos, arrowstyle="->", arrowsize=20)
    nx.draw_networkx_labels(G, pos, font_size=12, font_weight="bold")
    edge_labels = nx.get_edge_attributes(G, "label")
    nx.draw_networkx_edge_labels(G, pos, edge_labels=edge_labels, font_color="blue")
    plt.title("Healthcare Knowledge Graph", fontsize=15)
    plt.show()


render_graph(kg)
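The rendering above draws every node in the same color. One possible refinement, offered as a sketch and not as part of the original code, is to color nodes by their type. The type names and color choices in `TYPE_COLORS` are assumptions, since the model decides which type strings it emits, and `node_colors` is a hypothetical helper.

```python
DEFAULT_COLOR = "lightgray"
# Assumed type names; the LLM may emit different ones.
TYPE_COLORS = {"Patient": "lightblue", "Event": "salmon", "Symptom": "khaki"}


def node_colors(node_types: dict[str, str], node_order: list[str]) -> list[str]:
    """Return one color per node id, following the order of node_order."""
    return [
        TYPE_COLORS.get(node_types.get(node_id, ""), DEFAULT_COLOR)
        for node_id in node_order
    ]


demo_types = {
    "Mary": "Patient",
    "Fall Incident 1": "Event",
    "Dizziness": "Symptom",
    "Nurse": "Staff",  # no entry in TYPE_COLORS, so it gets the default
}
demo_order = ["Mary", "Nurse", "Dizziness", "Fall Incident 1"]
colors = node_colors(demo_types, demo_order)
print(colors)  # ['lightblue', 'lightgray', 'khaki', 'salmon']
```

Inside `render_graph`, the list could be built with `node_colors({n.id: n.type for n in kg.nodes}, list(G.nodes))` and passed as `node_color`, which `nx.draw_networkx_nodes` accepts as a sequence of colors. Ids that appear only in edges fall back to the default color.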


The post Creating a Knowledge Graph Using an LLM appeared first on MarkTechPost.
