Accuracy Soars! How Graph RAG Uses Knowledge Graphs to Improve RAG Answer Quality (Part 4)

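A Note Before We Start

This is a series of posts, planned as five parts (including one bonus hands-on part):

Microsoft GraphRAG code in practice

GraphRAG execution flow on large text data, in practice

This post is Part 4, Microsoft GraphRAG code in practice. The series has drawn a large readership on CSDN, Zhihu, Juejin, and my WeChat official account. I wrote it carefully from my own study and work, and it is among the more detailed publicly available introductions to GraphRAG theory and practice. If you find it useful, subscribe to the column to follow the core techniques and practical guidance for GraphRAG.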

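Preface

The previous three posts covered GraphRAG theory at length: how GraphRAG builds a knowledge graph from text and how it retrieves and answers queries over that graph, end to end. They also showed where GraphRAG improves on traditional RAG: it analyzes relationships across long texts more comprehensively and more logically.

"Theory is the foundation, but only hands-on code really serves our work." With that foundation in place, this post walks through code for Microsoft GraphRAG version 2.3, so that you can learn the basic usage by doing.

All code in this post is available by following my WeChat official account 大模型真好玩 and sending the private message "GraphRAG代码实战".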

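1. GraphRAG Workflow Review

Although the earlier posts already described the basic GraphRAG workflow in detail, this post starts with a short recap to refresh your memory. The basic workflow is as follows:

1.1 Knowledge Graph Construction (Indexing)

Text unit splitting:

Entity and relationship extraction:

Graph construction:

Leiden algorithm

Community reports:

If you are not yet familiar with knowledge graph construction, see the earlier post "Detailed Steps of GraphRAG Graph Construction".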

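1.2 Query

Once the knowledge graph is built, users can query it with different search modes:

Global search:

Local search:

DRIFT search (newly added in this post):

If you are not yet familiar with the query process, see the earlier post "Detailed Steps of GraphRAG Retrieval and Query".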

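1.3 Prompt Tuning

Both knowledge graph construction and graph-based retrieval depend on prompts and a large language model. For best performance, GraphRAG strongly recommends prompt tuning, so that the model is adapted to your specific data and query needs and returns more accurate and relevant answers.

Across these stages, GraphRAG builds the graph once and then serves many queries. However, each time new documents are added, the knowledge graph has to be rebuilt, since new documents introduce new entities and relationships.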

2. Installing and Using GraphRAG

2.1 Setting Up the GraphRAG Environment

anaconda

metaGraph

graphrag

jupyter notebook

jupyter

conda create -n metaGraphRAG python=3.12  # create the virtual environment
conda activate metaGraphRAG  # activate the virtual environment
conda install jupyterlab  # install JupyterLab
conda install ipykernel  # install the Jupyter kernel package
python -m ipykernel install --user --name metaGraphRAG --display-name "Python (metaGraphRAG)"  # register the kernel under the name metaGraphRAG

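After these commands finish, run jupyter notebook from the command line in the project directory. The browser opens the Jupyter page automatically, and a screen like the one below means the kernel was installed correctly: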

pip install graphrag

2.2 Configuring the GraphRAG Project

With the virtual environment ready, we next use graphrag to build the knowledge graph:

openl

input

mkdir -p ./openl/input

Upload the dataset. We reuse the example from the earlier posts. In the openl/input folder, create a file named 大数据时代.txt and write the following text into it:

《大数据时代》是一本由维克托·迈尔-舍恩伯格与肯尼斯·库克耶合著的书籍，讨论了如何在海量数据中挖掘出有价值的信息。这本书深入探讨了数据科学的应用，并阐述了数据分析和预测在各行各业中的影响力。在书中，作者举了许多实际例子，说明大数据如何改变我们的生活，甚至如何预测未来的趋势。

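Initialize the project. GraphRAG analyzes the files under the input folder one by one. After placing our document in input and running the command below, GraphRAG creates the configuration files for the project. When initialization finishes, the openl folder contains the prompt folder prompts and the configuration file settings.yaml.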

graphrag init --root ./openl

SiliconFlow

A list of platforms with free DeepSeek-R1 access, covering websites and API usage

settings.yaml

chat

Qwen3-8B

embedding

Qwen3-Embedding-8B

models:
  default_chat_model:
    type: openai_chat # or azure_openai_chat
    api_base: https://api.siliconflow.cn/v1/
    # api_version: 2024-05-01-preview
    auth_type: api_key # or azure_managed_identity
    api_key: your-siliconflow-api-key # set this in the generated .env file
    # audience: "https://cognitiveservices.azure.com/.default"
    # organization: <organization_id>
    model: Qwen/Qwen3-8B
    # deployment_name: <azure_model_deployment_name>
    encoding_model: cl100k_base # automatically set by tiktoken if left undefined
    model_supports_json: true # recommended if this is available for your model.
    concurrent_requests: 25 # max number of simultaneous LLM requests allowed
    async_mode: threaded # or asyncio
    retry_strategy: native
    max_retries: 10
    tokens_per_minute: auto              # set to null to disable rate limiting
    requests_per_minute: auto            # set to null to disable rate limiting
  default_embedding_model:
    type: openai_embedding # or azure_openai_embedding
    api_base: https://api.siliconflow.cn/v1/
    # api_version: 2024-05-01-preview
    auth_type: api_key # or azure_managed_identity
    api_key: your-siliconflow-api-key
    # audience: "https://cognitiveservices.azure.com/.default"
    # organization: <organization_id>
    model: BAAI/bge-m3
    # deployment_name: <azure_model_deployment_name>
    encoding_model: cl100k_base # automatically set by tiktoken if left undefined
    model_supports_json: true # recommended if this is available for your model.
    concurrent_requests: 25 # max number of simultaneous LLM requests allowed
    async_mode: threaded # or asyncio
    retry_strategy: native
    max_retries: 10
    tokens_per_minute: auto              # set to null to disable rate limiting
    requests_per_minute: auto            # set to null to disable rate limiting

settings.yaml

chunks

chunks:
  size: 50 # number of tokens in each text chunk
  overlap: 10 # number of tokens shared between adjacent chunks
  group_by_columns: [id]
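To make the size and overlap settings concrete, here is a minimal sketch of a sliding-window chunker. It is an illustration only, not GraphRAG's implementation: the function name chunk_tokens is hypothetical, and a list of integers stands in for real tokens (GraphRAG tokenizes with the encoding_model configured above).

```python
def chunk_tokens(tokens, size=50, overlap=10):
    """Split a token list into windows of `size` tokens that share `overlap` tokens."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reaches the end of the input
    return chunks


# Hypothetical input: 120 integer "tokens" standing in for a tokenized document.
tokens = list(range(120))
chunks = chunk_tokens(tokens, size=50, overlap=10)
for i, chunk in enumerate(chunks):
    # index, first token, last token, chunk length
    print(i, chunk[0], chunk[-1], len(chunk))
```

With these settings each window starts 40 tokens after the previous one, so the last 10 tokens of one chunk reappear as the first 10 tokens of the next.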

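The basic GraphRAG configuration is now complete. Next we run each stage of knowledge graph construction and retrieval.

3. Using GraphRAG

GraphRAG offers calling methods at different levels. The most common is the command line, where one or two commands do the job. But if you want to customize GraphRAG or tune it for your own system, the command line becomes limiting, and you need the lower-level Python API.

This post covers both calling methods:

3.1 Command-Line Usage

3.1.1 Building the Knowledge Graph from the Command Line

Run the following command to have GraphRAG execute the Indexing (knowledge graph construction) step automatically. By default, GraphRAG represents the knowledge graph as the tables described in earlier posts: the entity table, relationship table, entity-relationship table, community table, community report table, and so on. As the run log shows, GraphRAG keeps building these tables throughout graph construction: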

graphrag index --root ./openl

output

parquet

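communities.parquet: community table
community_reports.parquet: community report table (details of each community); global search uses this table
documents.parquet: source document table; here there is only one txt document
entities.parquet: entity table
relationships.parquet: relationship table
text_units.parquet: text chunk table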

3.1.2 Viewing the Tables (Optional)

To make the construction results more concrete, we read these parquet files into tables with pandas in Jupyter Notebook.

Read the document table:

import pandas as pd
document_df = pd.read_parquet(r'D:\Learning\Learning\大模型\GraphRAG\openl\output\documents.parquet')

Read the text chunk table:

text_unit_df = pd.read_parquet(r'D:\Learning\Learning\大模型\GraphRAG\openl\output\text_units.parquet')

Read the entity table:

entities_df = pd.read_parquet(r'D:\Learning\Learning\大模型\GraphRAG\openl\output\entities.parquet')

Read the relationship table:

relationships_df = pd.read_parquet(r'D:\Learning\Learning\大模型\GraphRAG\openl\output\relationships.parquet')

Read the community table:

communities_df = pd.read_parquet(r'D:\Learning\Learning\大模型\GraphRAG\openl\output\communities.parquet')

Read the community report table:

community_reports_df = pd.read_parquet(r'D:\Learning\Learning\大模型\GraphRAG\openl\output\community_reports.parquet')
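Since all six reads follow the same pattern, a small helper can collect the paths and report any missing table before pandas is involved. This is a sketch under assumptions: the names TABLE_NAMES, table_paths, missing_tables, and load_tables are hypothetical, and the file names are the ones listed in section 3.1.1.

```python
from pathlib import Path

# Table names as listed in section 3.1.1.
TABLE_NAMES = [
    "documents",
    "text_units",
    "entities",
    "relationships",
    "communities",
    "community_reports",
]


def table_paths(output_dir):
    """Map each table name to its expected parquet path under output_dir."""
    base = Path(output_dir)
    return {name: base / f"{name}.parquet" for name in TABLE_NAMES}


def missing_tables(output_dir):
    """Return the names of tables whose parquet file does not exist yet."""
    return [name for name, path in table_paths(output_dir).items() if not path.exists()]


def load_tables(output_dir):
    """Read every table into a DataFrame; pandas is imported lazily here."""
    import pandas as pd

    missing = missing_tables(output_dir)
    if missing:
        raise FileNotFoundError(f"missing tables: {missing}")
    return {name: pd.read_parquet(path) for name, path in table_paths(output_dir).items()}
```

For example, tables = load_tables(output_dir) followed by tables["entities"].head() would show the first rows of the entity table, where output_dir is the openl\output folder used above.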

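You may wonder why the tables are all in English. GraphRAG relies on prompts for construction steps such as entity extraction and relationship extraction. These prompts live in the openl\prompts folder and are in English by default, so the model also produces English tables. To get Chinese tables, first translate the contents of the prompt files into Chinese. I leave that for you to try (current LLMs are capable enough to work well with the English prompts as they are).

3.1.3 Displaying the Knowledge Graph (Optional)

You are now familiar with the tables. What does the knowledge graph they form look like? Here I use Python in Jupyter Notebook to render these tables as a graph: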

jupyter notebook

%pip install yfiles_jupyter_graphs==1.7.3

jupyter notebook

from yfiles_jupyter_graphs import GraphWidget

convert_entities_to_dicts

Entity table

convert_relationships_to_dicts

Relationship table

def convert_entities_to_dicts(df):
    """Convert the entity table into a list of node dicts, one per unique title."""
    nodes_dict = {}
    for _, row in df.iterrows():
        node_id = row['title']
        if node_id not in nodes_dict:
            nodes_dict[node_id] = {
                "id": node_id,
                "properties": row.to_dict(),
            }
    return list(nodes_dict.values())


def convert_relationships_to_dicts(df):
    """Convert the relationship table into a list of edge dicts."""
    relationships = []
    for _, row in df.iterrows():
        relationships.append({
            "start": row['source'],
            "end": row['target'],
            "properties": row.to_dict(),
        })
    return relationships

community_to_color

Nodes

Edges

edge_to_source_community

# Map a community to a color
def community_to_color(community):
    """Map a community to a color."""
    colors = [
        "crimson",
        "darkorange",
        "indigo",
        "cornflowerblue",
        "cyan",
        "teal",
        "green",
    ]
    try:
        return colors[int(community) % len(colors)] if community is not None else "lightgray"
    except (ValueError, TypeError):
        # community is not an integer (or cannot be converted): use the default color
        return "lightgray"


def edge_to_source_community(edge):
    """Get the community of the source node of an edge."""
    source_node = next(
        (entry for entry in w.nodes if entry["properties"]["title"] == edge["start"]),
        None,
    )
    # The source node may be missing, or the entity table may have no community column
    if source_node is None:
        return None
    return source_node["properties"].get("community")

Entity table

Relationship table

GraphWidget

import pandas as pd

entities_df = pd.read_parquet(r'D:\Learning\Learning\大模型\GraphRAG\openl\output\entities.parquet')  # read the entity table
relationships_df = pd.read_parquet(r'D:\Learning\Learning\大模型\GraphRAG\openl\output\relationships.parquet')  # read the relationship table
w = GraphWidget()  # create the GraphWidget object
w.directed = True  # draw a directed graph
w.nodes = convert_entities_to_dicts(entities_df)  # convert entities into nodes
w.edges = convert_relationships_to_dicts(relationships_df)  # convert relationships into edges
w.node_label_mapping = "title"  # label each node with its title
w.node_color_mapping = lambda node: community_to_color(
    node["properties"].get("community", None)  # use .get to avoid a KeyError
)  # node color
w.edge_color_mapping = lambda edge: community_to_color(edge_to_source_community(edge))  # edge color
w.node_scale_factor_mapping = lambda node: 0.5 + node["properties"].get("size", 1) * 1.5 / 20  # node size
w.edge_thickness_factor_mapping = "weight"  # edge thickness follows the weight property

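The rendered knowledge graph is shown below. It links our entities through their relationships: Big Data (《大数据时代》) is connected to "changing our lives" and "life", which matches the source text. Viktor and Kenneth are connected because they co-authored the book.

w.circular_layout()  # set the graph layout
display(w)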
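Before rendering, it can help to sanity-check the node and edge dicts. The sketch below counts how many edges touch each node, using dicts in the same shape that convert_entities_to_dicts and convert_relationships_to_dicts return. The function name node_degrees and the sample nodes and edges are made up for illustration; they are not the actual output of the indexing run.

```python
from collections import Counter


def node_degrees(nodes, edges):
    """Count the edges touching each node id; nodes without edges get 0."""
    degrees = Counter({node["id"]: 0 for node in nodes})
    for edge in edges:
        for endpoint in (edge["start"], edge["end"]):
            degrees[endpoint] += 1
    return dict(degrees)


# Hypothetical sample in the same shape as w.nodes and w.edges.
nodes = [
    {"id": "BOOK", "properties": {}},
    {"id": "AUTHOR A", "properties": {}},
    {"id": "AUTHOR B", "properties": {}},
]
edges = [
    {"start": "AUTHOR A", "end": "BOOK", "properties": {}},
    {"start": "AUTHOR B", "end": "BOOK", "properties": {}},
    {"start": "AUTHOR A", "end": "AUTHOR B", "properties": {}},
]
print(node_degrees(nodes, edges))
```

A node with degree 0 is isolated in the drawing, and an edge endpoint that does not appear among the node ids usually points to a title mismatch between the two tables.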

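3.1.4 Querying from the Command Line

With the knowledge graph built and the tables generated, we can use GraphRAG's command-line script for quick question answering. For example, to ask GraphRAG "请介绍《大数据时代》" ("Please introduce Big Data") using local search, run:

graphrag query --root D:\Learning\Learning\大模型\GraphRAG\openl --method local --query "请介绍《大数据时代》"

That completes the question-and-answer round trip. The result is shown in the figure below:

As the figure shows, command-line querying with GraphRAG is very simple. To use global search instead, change the command to:

graphrag query --root D:\Learning\Learning\大模型\GraphRAG\openl --method global --query "请介绍《大数据时代》"

I leave running it to you.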
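The local and global query commands differ only in the --method value, so they can be wrapped in a small Python helper. This is a sketch, not part of GraphRAG: build_query_command and run_query are hypothetical names, and the helper uses only the --root, --method, and --query flags that appear in this section.

```python
import subprocess


def build_query_command(root, method, query):
    """Build the argument list for `graphrag query` with the flags used in this section."""
    if method not in ("local", "global"):
        raise ValueError("this sketch only covers the local and global methods")
    return ["graphrag", "query", "--root", str(root), "--method", method, "--query", query]


def run_query(root, method, query):
    """Run the command and return its standard output as text."""
    result = subprocess.run(
        build_query_command(root, method, query),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if graphrag exits with an error
    )
    return result.stdout


print(build_query_command("./openl", "global", "请介绍《大数据时代》"))
```

Passing the arguments as a list rather than one shell string means the query text needs no extra quoting or escaping.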

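3.2 Python API Usage

For developers, the main goal is to integrate GraphRAG into their own code. GraphRAG also provides a Python API. The code in this part follows the official GraphRAG repository: github.com/microsoft/g…

3.2.1 Global Search via Code

Import the required libraries. Global search depends on quite a few modules. (GraphRAG also provides lower-level APIs for the graph construction flow, but that code is verbose, so to keep things simple this post builds the graph from the command line instead.)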

from collections.abc import AsyncGenerator  # async generator for streaming calls
from typing import Any

from graphrag.callbacks.noop_query_callbacks import NoopQueryCallbacks
from graphrag.callbacks.query_callbacks import QueryCallbacks
from graphrag.config.models.graph_rag_config import GraphRagConfig
from graphrag.logger.print_progress import PrintProgressLogger
from graphrag.query.factory import get_global_search_engine
from graphrag.query.indexer_adapters import (
    read_indexer_communities,
    read_indexer_entities,
    read_indexer_reports,
)
from graphrag.utils.api import load_search_prompt
import pandas as pd

global_search_streaming

Communities

Community reports

Entities

Knowledge graph prompts

get_global_search_engine

Detailed Steps of GraphRAG Retrieval and Query

def global_search_streaming(
    config: GraphRagConfig,
    entities: pd.DataFrame,
    communities: pd.DataFrame,
    community_reports: pd.DataFrame,
    community_level: int | None,
    dynamic_community_selection: bool,
    response_type: str,
    query: str,
    callbacks: list[QueryCallbacks] | None = None,
) -> AsyncGenerator:
    """Perform a global search and return the context data and response via a generator.

    Context data is returned as a dictionary of lists, with one list entry for each record.

    Parameters
    ----------
    - config (GraphRagConfig): A graphrag configuration (from settings.yaml)
    - entities (pd.DataFrame): A DataFrame containing the final entities (from entities.parquet)
    - communities (pd.DataFrame): A DataFrame containing the final communities (from communities.parquet)
    - community_reports (pd.DataFrame): A DataFrame containing the final community reports (from community_reports.parquet)
    - community_level (int): The community level to search at.
    - dynamic_community_selection (bool): Enable dynamic community selection instead of using all community reports at a fixed level. Note that you can still provide community_level to cap the maximum level to search.
    - response_type (str): The type of response to return.
    - query (str): The user query to search for.

    Returns
    -------
    TODO: Document the search response type and format.

    Raises
    ------
    TODO: Document any exceptions to expect.
    """
    communities_ = read_indexer_communities(communities, community_reports)
    reports = read_indexer_reports(
        community_reports,
        communities,
        community_level=community_level,
        dynamic_community_selection=dynamic_community_selection,
    )
    entities_ = read_indexer_entities(
        entities, communities, community_level=community_level
    )
    map_prompt = load_search_prompt(config.root_dir, config.global_search.map_prompt)
    reduce_prompt = load_search_prompt(
        config.root_dir, config.global_search.reduce_prompt
    )
    knowledge_prompt = load_search_prompt(
        config.root_dir, config.global_search.knowledge_prompt
    )
    search_engine = get_global_search_engine(
        config,
        reports=reports,
        entities=entities_,
        communities=communities_,
        response_type=response_type,
        dynamic_community_selection=dynamic_community_selection,
        map_system_prompt=map_prompt,
        reduce_system_prompt=reduce_prompt,
        general_knowledge_inclusion_prompt=knowledge_prompt,
        callbacks=callbacks,
    )
    return search_engine.stream_search(query=query)

Write the global search function, which merges the results of the asynchronous streaming call:

async def global_search(
    config: GraphRagConfig,
    entities: pd.DataFrame,
    communities: pd.DataFrame,
    community_reports: pd.DataFrame,
    community_level: int | None,
    dynamic_community_selection: bool,
    response_type: str,
    query: str,
    callbacks: list[QueryCallbacks] | None = None,
) -> tuple[
    str | dict[str, Any] | list[dict[str, Any]],
    str | list[pd.DataFrame] | dict[str, pd.DataFrame],
]:
    """Perform a global search and return the context data and response.

    Parameters
    ----------
    - config (GraphRagConfig): A graphrag configuration (from settings.yaml)
    - entities (pd.DataFrame): A DataFrame containing the final entities (from entities.parquet)
    - communities (pd.DataFrame): A DataFrame containing the final communities (from communities.parquet)
    - community_reports (pd.DataFrame): A DataFrame containing the final community reports (from community_reports.parquet)
    - community_level (int): The community level to search at.
    - dynamic_community_selection (bool): Enable dynamic community selection instead of using all community reports at a fixed level. Note that you can still provide community_level to cap the maximum level to search.
    - response_type (str): The type of response to return.
    - query (str): The user query to search for.

    Returns
    -------
    TODO: Document the search response type and format.

    Raises
    ------
    TODO: Document any exceptions to expect.
    """
    callbacks = callbacks or []
    full_response = ""
    context_data = {}

    def on_context(context: Any) -> None:
        # Capture the context data reported by the search engine
        nonlocal context_data
        context_data = context

    local_callbacks = NoopQueryCallbacks()
    local_callbacks.on_context = on_context
    callbacks.append(local_callbacks)

    # Concatenate the streamed chunks into the full answer
    async for chunk in global_search_streaming(
        config=config,
        entities=entities,
        communities=communities,
        community_reports=community_reports,
        community_level=community_level,
        dynamic_community_selection=dynamic_community_selection,
        response_type=response_type,
        query=query,
        callbacks=callbacks,
    ):
        full_response += chunk
    return full_response, context_data

settings.yaml

GraphRagConfig

import yaml
from pathlib import Path

root_dir = r'D:\Learning\Learning\大模型\GraphRAG\openl'
# Path to the settings.yaml created by graphrag init in the project root
settings_yaml = Path(root_dir) / 'settings.yaml'
with open(settings_yaml, 'r', encoding='utf-8') as f:
    values = yaml.load(f.read(), Loader=yaml.FullLoader)
values['root_dir'] = str(Path(root_dir).resolve())
graphRagConfig = GraphRagConfig(**values)

Document table

Text chunk table

Entity table

Entity-relationship table

Community table

Community report table

document_df = pd.read_parquet(r'D:\Learning\Learning\大模型\GraphRAG\openl\output\documents.parquet')
text_unit_df = pd.read_parquet(r'D:\Learning\Learning\大模型\GraphRAG\openl\output\text_units.parquet')
entities_df = pd.read_parquet(r'D:\Learning\Learning\大模型\GraphRAG\openl\output\entities.parquet')
relationships_df = pd.read_parquet(r'D:\Learning\Learning\大模型\GraphRAG\openl\output\relationships.parquet')
communities_df = pd.read_parquet(r'D:\Learning\Learning\大模型\GraphRAG\openl\output\communities.parquet')
community_reports_df = pd.read_parquet(r'D:\Learning\Learning\大模型\GraphRAG\openl\output\community_reports.parquet')

Run the global search with the question "请介绍《大数据时代》" ("Please introduce Big Data"). Judging from the returned result, the answer is well-formed and quite reliable.

res = await global_search(
    config=graphRagConfig,
    entities=entities_df,
    communities=communities_df,
    community_reports=community_reports_df,
    community_level=2,
    dynamic_community_selection=False,
    response_type="Single Paragraph",
    query="请介绍《大数据时代》",
)
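The way global_search accumulates streamed chunks can be seen in isolation with a stand-in generator. In this sketch, fake_stream is a hypothetical replacement for search_engine.stream_search and collect is a hypothetical helper, so the example runs without GraphRAG or an API key.

```python
import asyncio


async def fake_stream(chunks):
    """Yield text chunks one at a time, like a streaming LLM response."""
    for chunk in chunks:
        await asyncio.sleep(0)  # hand control back to the event loop
        yield chunk


async def collect(stream):
    """Concatenate every chunk from an async generator into one string."""
    full_response = ""
    async for chunk in stream:
        full_response += chunk
    return full_response


answer = asyncio.run(collect(fake_stream(["Big ", "Data ", "overview"])))
print(answer)
```

In a notebook, where an event loop is already running, write await collect(...) instead of asyncio.run(...). Since global_search returns the pair (full_response, context_data), the result of the call can be unpacked as response, context = res.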

3.2.2 Local Search via Code

Local search follows a flow similar to global search. Again, start by importing a large set of dependencies:

from collections.abc import AsyncGenerator
from typing import Any

import pandas as pd
from graphrag.callbacks.noop_query_callbacks import NoopQueryCallbacks
from graphrag.callbacks.query_callbacks import QueryCallbacks
from graphrag.config.embeddings import entity_description_embedding
from graphrag.config.models.graph_rag_config import GraphRagConfig
from graphrag.query.factory import get_local_search_engine
from graphrag.query.indexer_adapters import (
    read_indexer_covariates,
    read_indexer_entities,
    read_indexer_relationships,
    read_indexer_reports,
    read_indexer_text_units,
)
from graphrag.utils.api import (
    get_embedding_store,
    load_search_prompt,
)
from graphrag.utils.cli import redact

local_search_streaming

global_search_streaming

Community table

Community report table

Text vector store

Entity table

Detailed Steps of GraphRAG Retrieval and Query

def local_search_streaming(
    config: GraphRagConfig,
    entities: pd.DataFrame,
    communities: pd.DataFrame,
    community_reports: pd.DataFrame,
    text_units: pd.DataFrame,
    relationships: pd.DataFrame,
    covariates: pd.DataFrame | None,
    community_level: int,
    response_type: str,
    query: str,
    callbacks: list[QueryCallbacks] | None = None,
) -> AsyncGenerator:
    """Perform a local search and return the context data and response via a generator.

    Parameters
    ----------
    - config (GraphRagConfig): A graphrag configuration (from settings.yaml)
    - entities (pd.DataFrame): A DataFrame containing the final entities (from entities.parquet)
    - community_reports (pd.DataFrame): A DataFrame containing the final community reports (from community_reports.parquet)
    - text_units (pd.DataFrame): A DataFrame containing the final text units (from text_units.parquet)
    - relationships (pd.DataFrame): A DataFrame containing the final relationships (from relationships.parquet)
    - covariates (pd.DataFrame): A DataFrame containing the final covariates (from covariates.parquet)
    - community_level (int): The community level to search at.
    - response_type (str): The response type to return.
    - query (str): The user query to search for.

    Returns
    -------
    TODO: Document the search response type and format.

    Raises
    ------
    TODO: Document any exceptions to expect.
    """
    vector_store_args = {}
    for index, store in config.vector_store.items():
        vector_store_args[index] = store.model_dump()
    # Redacted vector store settings, kept for logging
    msg = f"Vector Store Args: {redact(vector_store_args)}"
    description_embedding_store = get_embedding_store(
        config_args=vector_store_args,
        embedding_name=entity_description_embedding,
    )
    entities_ = read_indexer_entities(entities, communities, community_level)
    covariates_ = read_indexer_covariates(covariates) if covariates is not None else []
    prompt = load_search_prompt(config.root_dir, config.local_search.prompt)
    search_engine = get_local_search_engine(
        config=config,
        reports=read_indexer_reports(community_reports, communities, community_level),
        text_units=read_indexer_text_units(text_units),
        entities=entities_,
        relationships=read_indexer_relationships(relationships),
        covariates={"claims": covariates_},
        description_embedding_store=description_embedding_store,
        response_type=response_type,
        system_prompt=prompt,
        callbacks=callbacks,
    )
    return search_engine.stream_search(query=query)


async def local_search(
    config: GraphRagConfig,
    entities: pd.DataFrame,
    communities: pd.DataFrame,
    community_reports: pd.DataFrame,
    text_units: pd.DataFrame,
    relationships: pd.DataFrame,
    community_level: int,
    response_type: str,
    query: str,
    covariates: pd.DataFrame | None = None,
    callbacks: list[QueryCallbacks] | None = None,
) -> tuple[
    str | dict[str, Any] | list[dict[str, Any]],
    str | list[pd.DataFrame] | dict[str, pd.DataFrame],
]:
    """Perform a local search and return the context data and response.

    Parameters
    ----------
    - config (GraphRagConfig): A graphrag configuration (from settings.yaml)
    - entities (pd.DataFrame): A DataFrame containing the final entities (from entities.parquet)
    - community_reports (pd.DataFrame): A DataFrame containing the final community reports (from community_reports.parquet)
    - text_units (pd.DataFrame): A DataFrame containing the final text units (from text_units.parquet)
    - relationships (pd.DataFrame): A DataFrame containing the final relationships (from relationships.parquet)
    - covariates (pd.DataFrame): A DataFrame containing the final covariates (from covariates.parquet)
    - community_level (int): The community level to search at.
    - response_type (str): The response type to return.
    - query (str): The user query to search for.

    Returns
    -------
    TODO: Document the search response type and format.

    Raises
    ------
    TODO: Document any exceptions to expect.
    """
    callbacks = callbacks or []
    full_response = ""
    context_data = {}

    def on_context(context: Any) -> None:
        # Capture the context data reported by the search engine
        nonlocal context_data
        context_data = context

    local_callbacks = NoopQueryCallbacks()
    local_callbacks.on_context = on_context
    callbacks.append(local_callbacks)

    # Concatenate the streamed chunks into the full answer
    async for chunk in local_search_streaming(
        config=config,
        entities=entities,
        communities=communities,
        community_reports=community_reports,
        text_units=text_units,
        relationships=relationships,
        covariates=covariates,
        community_level=community_level,
        response_type=response_type,
        query=query,
        callbacks=callbacks,
    ):
        full_response += chunk
    return full_response, context_data


import yaml
from pathlib import Path

root_dir = r'D:\Learning\Learning\大模型\GraphRAG\openl'
# Path to the settings.yaml created by graphrag init in the project root
settings_yaml = Path(root_dir) / 'settings.yaml'
with open(settings_yaml, 'r', encoding='utf-8') as f:
    values = yaml.load(f.read(), Loader=yaml.FullLoader)
values['root_dir'] = str(Path(root_dir).resolve())
graphRagConfig = GraphRagConfig(**values)

document_df = pd.read_parquet(r'D:\Learning\Learning\大模型\GraphRAG\openl\output\documents.parquet')
text_unit_df = pd.read_parquet(r'D:\Learning\Learning\大模型\GraphRAG\openl\output\text_units.parquet')
entities_df = pd.read_parquet(r'D:\Learning\Learning\大模型\GraphRAG\openl\output\entities.parquet')
relationships_df = pd.read_parquet(r'D:\Learning\Learning\大模型\GraphRAG\openl\output\relationships.parquet')
communities_df = pd.read_parquet(r'D:\Learning\Learning\大模型\GraphRAG\openl\output\communities.parquet')
community_reports_df = pd.read_parquet(r'D:\Learning\Learning\大模型\GraphRAG\openl\output\community_reports.parquet')

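The remaining steps are essentially the same as for global streaming search, so I won't repeat them; see my source code if you are interested. The local search result is shown below. My question is fairly simple, and local search handles it well too.

res = await local_search(
    config=graphRagConfig,
    entities=entities_df,
    communities=communities_df,
    community_reports=community_reports_df,
    text_units=text_unit_df,
    relationships=relationships_df,
    community_level=2,
    response_type="Single Paragraph",
    query="请介绍《大数据时代》",
)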

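That is everything for today's post.

4. Summary

In this post we first reviewed the basic flow of GraphRAG knowledge graph construction and retrieval, then moved on to hands-on code: from installing the GraphRAG environment, to simple command-line usage, to the more advanced Python API. After reading it, you should be able to integrate GraphRAG into your own projects and improve the quality of retrieval-augmented answers.

The series was planned to end here, but there is more I would like to share. How do local and global search actually differ in behavior? Which questions suit local search, and which suit global search? What does the GraphRAG execution flow look like at large data scale? I will cover these in a bonus post, "GraphRAG execution flow on large text data, in practice". If you are interested, follow my Juejin account or my WeChat official account of the same name, 大模型真好玩, where I share materials from my own work for free. You are also welcome to message me about any LLM question.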
