通过遥测工具抓取了一下 Gemini CLI 的对话消息记录,并做了一下简单的分析,感觉有所启发,记录一下。对话内容是让Gemini CLI 解析一下自己的代码,下面就来逐条消息进行一下分析。
初始设置
第一条用户 prompt 是由 Gemini CLI 发出的,内容如下
{ "role": "user", "parts": [ { "text": "This is the Gemini CLI. We are setting up the context for our chat.\n Today's date is 2025年8月1日星期五.\n My operating system is: darwin\n I'm currently working in the directory: /Users/xxx/mycode/gemini-cli\n Showing up to 200 items (files + folders). Folders or files indicated with ... contain more items not shown, were ignored, or the display limit (200 items) was reached.\n\n/Users/xxx/mycode/gemini-cli/\n├───.editorconfig\n├───.gitattributes\n├───.gitignore\n├───.npmrc\n├───.nvmrc\n├───.prettierrc.json\n├───CONTRIBUTING.md\n├───Dockerfile\n├───esbuild.config.js\n├───eslint.config.js\n├───GEMINI.md\n├───LICENSE\n├───Makefile\n├───package-lock.json\n├───package.json\n├───pnpm-lock.yaml\n├───README.md\n├───ROADMAP.md\n├───tsconfig.json\n├───.gcp/\n│ ├───Dockerfile.gemini-code-builder\n│ └───release-docker.yaml\n├───.gemini/\n├───.git/...\n├───.github/\n│ ├───CODEOWNERS\n│ ├───pull_request_template.md\n│ ├───actions/\n ...省略 " } ]}
从记录中可以清楚看到第一条消息提供了几个比较重要的信息
- 首先是告诉模型目前角色是 Gemini CLI:
This is the Gemini CLI
紧接着是明确告诉模型现在是在做上下文设定: We are setting up the context for our chat.
再接着是告诉模型当前的时间、操作系统: Today's date is 2025年8月1日星期五.\n My operating system is: darwin
最后是告诉模型当前的工作目录,以及目录树结构(限制了两百个)模型接收到这些信息后就有了第一条返回
{ "role": "model", "parts": [{ "text": "Got it. Thanks for the context!" }]}
很礼貌的回复了一下 "Got it. Thanks for the context!"
真实用户 prompt
在端对话中,由用户输入的prompt其实只有一句:
{ "role": "user", "parts": [ { "text": "帮我分析一下@packages/core/ 中的代码,详细分析其技术架构、工作原理,并画出流程图" } ]}
模型回复
在消息记录中模型的回复内容为:
{ "role": "model", "parts": [ { "text": "好的,我们来深入分析一下 `@packages/core` 的代码。\n\n这个包是 Gemini CLI 的核心,负责处理与 Google AI 模型的交互、工具执行、状态管理和配置等关键任务。可以把它看作是连接用户输入(来自 `packages/cli`)和模型能力(Google AI)的“大脑”。\n\n首先,我需要浏览一下 `@packages/core` 的目录结构来更好地理解其内部模块。\n\n" } ]},{ "role": "model", "parts": [ { "functionCall": { "name": "list_directory", "args": { "path": "/Users/xxx/mycode/gemini-cli/packages/core/src/" } } } ]}
这一段其实只包含了模型思考后的正式回复,完整的信息也抓取到了,如下,可以看到其中模型的思考过程,消息历史记录中思考的部分都被省略了
{ "type": "response", "content": [ { "candidates": [ { "content": { "role": "model", "parts": [ { "thought": true, "text": "**Initiating Code Analysis**\n\n\\n\\n\n\nI'm focusing on breaking down the user's request. My first step is identifying the core task, which is to analyze the `@packages/core/` directory. Then, I will address the required outputs, like detailed code summaries and potential vulnerabilities. I'm now considering how best to structure these outputs for clarity and impact.\n\n\n" } ] } } ], "usageMetadata": { "trafficType": "PROVISIONED_THROUGHPUT" } }, { "candidates": [ { "content": { "role": "model", "parts": [ { "thought": true, "text": "**Refining Output Structures**\n\n\\n\\n\n\nI'm working on the structure of the technical architecture analysis. I'm also structuring the explanation of working principles, ensuring clarity and readability in Chinese. The flowchart generation is the next step; I'm considering tools and approaches to best visualize the code's logic.\n\n\n" } ] } } ], "usageMetadata": { "trafficType": "PROVISIONED_THROUGHPUT" } }, { "candidates": [ { "content": { "role": "model", "parts": [ { "thought": true, "text": "**Examining Module Interactions**\n\n\\n\\n\n\nI've just finished analyzing `index.ts` to understand the module exports. The next step is examining the interactions between modules, especially how `core`, `mcp`, and `config` are connected. I'll focus on function calls and dependencies to build a clear map of data flow. This will help me understand the technical architecture better, which is what I'm targeting next.\n\n\n" } ] } } ], "usageMetadata": { "trafficType": "PROVISIONED_THROUGHPUT" } }, { "candidates": [ { "content": { "role": "model", "parts": [ { "thought": true, "text": "**Defining Core Package Modules**\n\n\\n\\n\n\nI've moved on to analyzing the `index.ts` files within key subdirectories, like `core`, `mcp`, and `tools`. I'm identifying what each module exports to understand its role in the larger system. Now I'm beginning to generate initial descriptions for each module, and I'm also thinking about the interactions and data flow.\n\n\n" } ] } } ], "usageMetadata": { "trafficType": "PROVISIONED_THROUGHPUT" } }, { "candidates": [ { "content": { "role": "model", "parts": [ { "thought": true, "text": "**Examining Core Module Structure**\n\n\\n\\n\n\nI'm initiating the directory listing to obtain the complete file structure. After that, I'll analyze the `index.ts` file to understand the module's public API. I will then examine the key subdirectories. This approach helps me grasp the core functions and the relationships among the modules within the package.\n\n\n" } ] } } ], "usageMetadata": { "trafficType": "PROVISIONED_THROUGHPUT" } }, { "candidates": [ { "content": { "role": "model", "parts": [ { "thought": true, "text": "**Summarizing Module Roles**\n\n\\n\\n\n\nI'm currently reading the `index.ts` files within the `core`, `mcp`, `tools`, and `config` directories. I'm focusing on the function exports to determine each module's function within the larger context. The key objective is to understand their roles in the system. I am also starting to think about the data flow among them.\n\n\n" } ] } } ], "usageMetadata": { "trafficType": "PROVISIONED_THROUGHPUT" } }, { "candidates": [ { "content": { "role": "model", "parts": [ { "thought": true, "text": "**Defining Initial Hypotheses**\n\n\\n\\n\n\nI'm consolidating my initial findings. I now understand the directory structure and the basic responsibilities of the modules based on the `index.ts` exports. My next step is to confirm the roles of each module, specifically `core`, `mcp`, and `tools`. Then I will start synthesizing the response in Chinese and begin generating a text-based flowchart.\n\n\n" } ] } } ], "usageMetadata": { "trafficType": "PROVISIONED_THROUGHPUT" } }, { "candidates": [ { "content": { "role": "model", "parts": [ { "thought": true, "text": "**Defining Execution Steps**\n\n\\n\\n\n\nI'm planning the execution phase, including the order of file reads and the analysis process for each. I will start by listing the directory's content, then proceed with analyzing key files like `index.ts` in core, mcp, and tools. The goal is to extract module exports and understand interdependencies. I'm structuring my actions using the identified tool usage.\n\n\n" } ] } } ], "usageMetadata": { "trafficType": "PROVISIONED_THROUGHPUT" } }, { "candidates": [ { "content": { "role": "model", "parts": [ { "thought": true, "text": "**Structuring Architectural Overview**\n\n\\n\\n\n\nI'm concentrating on a response structure for the detailed architectural overview. I'm dividing the technical architecture section by module (core, mcp, tools, config, etc.) for clarity. Also, I'm planning the end-to-end flow description. I will then use a numbered list to guide the user through the steps in a typical interaction. After this, I will build the text-based Mermaid flowchart for visualization.\n\n\n" } ] } } ], "usageMetadata": { "trafficType": "PROVISIONED_THROUGHPUT" } }, { "candidates": [ { "content": { "role": "model", "parts": [ { "thought": true, "text": "**Defining Core Modules**\n\n\\n\\n\n\nI'm focused on using the directory listing and examining `index.ts` files to define key modules. My goal is to understand the responsibilities of `core`, `mcp`, `tools`, and `config`. I am also planning to include a review of `services`, `telemetry`, and `utils`. The next phase is synthesizing my understanding of the data flow between the modules.\n\n\n" } ] } } ], "usageMetadata": { "trafficType": "PROVISIONED_THROUGHPUT" } }, { "candidates": [ { "content": { "role": "model", "parts": [ { "text": "好的,我们来深入分析一下 `@packages/core` 的代码。\n\n这个包是 Gemini CLI 的核心," } ] } } ], "usageMetadata": { "trafficType": "PROVISIONED_THROUGHPUT" } }, { "candidates": [ { "content": { "role": "model", "parts": [ { "text": "负责处理与 Google AI 模型的交互、工具执行、状态管理和配置等关键任务。可以把它看作是连接用户输入(来自 `packages/cli`)和模型能力(Google AI)的“大脑”。\n\n首先,我需要浏览一下 `@packages/core` 的目录结构来更好地" } ] } } ], "usageMetadata": { "trafficType": "PROVISIONED_THROUGHPUT" } }, { "candidates": [ { "content": { "role": "model", "parts": [{ "text": "理解其内部模块。\n\n" }] } } ], "usageMetadata": { "trafficType": "PROVISIONED_THROUGHPUT" } }, { "candidates": [ { "content": { "role": "model", "parts": [ { "functionCall": { "name": "list_directory", "args": { "path": "/Users/xupeng/mycode/gemini-cli/packages/core/src/" } } } ] }, "finishReason": "STOP" } ], "usageMetadata": { "promptTokenCount": 12968, "candidatesTokenCount": 120, "totalTokenCount": 15550, "trafficType": "PROVISIONED_THROUGHPUT", "promptTokensDetails": [{ "modality": "TEXT", "tokenCount": 12968 }], "candidatesTokensDetails": [ { "modality": "TEXT", "tokenCount": 120 } ], "thoughtsTokenCount": 2462 } } ] }
工具调用及结果返回
可以看到在上一段的模型回复中模型发回了一段特殊的回复,parts 中只包含了 functionCall 节点,这其实就是模型调用对应工具的的方式,这个回复中包含了需要调用的工具名称以及参数信息。Gemini CLI 在收到工具调用信息后会根据相关信息执行工具调用,然后将工具调用的结果再次返回给模型,具体的消息如下:
{ "role": "user", "parts": [ { "functionResponse": { "id": "list_directory-1754016459561-12bef4b2f24e9", "name": "list_directory", "response": { "output": "Directory listing for /Users/xxx/mycode/gemini-cli/packages/core/src/:\n[DIR] __mocks__\n[DIR] code_assist\n[DIR] config\n[DIR] core\n[DIR] mcp\n[DIR] services\n[DIR] telemetry\n[DIR] tools\n[DIR] utils\nindex.test.ts\nindex.ts" } } } ]}
总结
通过分析交互历史数据,我们可以看到 Gemini CLI 与用户之间的交互遵循一个清晰的流程:
- 上下文设置:CLI 首先向模型提供详细的上下文信息,包括系统环境、目录结构等用户请求:用户提出具体的问题或任务请求模型处理:模型分析请求并决定下一步行动工具调用:模型可能需要调用工具来获取更多信息或执行操作结果反馈:工具执行结果返回给模型,模型生成最终响应用户响应:最终响应返回给用户
附录
最后附上完整的消息记录,供参考
{ "type": "request", "content": [ { "role": "user", "parts": [ { "text": "This is the Gemini CLI. We are setting up the context for our chat.\n Today's date is 2025年8月1日星期五.\n My operating system is: darwin\n I'm currently working in the directory: /Users/xxx/mycode/gemini-cli\n Showing up to 200 items (files + folders). Folders or files indicated with ... contain more items not shown, were ignored, or the display limit (200 items) was reached.\n\n/Users/xxx/mycode/gemini-cli/\n├───.editorconfig\n├───.gitattributes\n├───.gitignore\n├───.npmrc\n├───.nvmrc\n├───.prettierrc.json\n├───CONTRIBUTING.md\n├───Dockerfile\n├───esbuild.config.js\n├───eslint.config.js\n├───GEMINI.md\n├───LICENSE\n├───Makefile\n├───package-lock.json\n├───package.json\n├───pnpm-lock.yaml\n├───README.md\n├───ROADMAP.md\n├───tsconfig.json\n├───.gcp/\n│ ├───Dockerfile.gemini-code-builder\n│ └───release-docker.yaml\n├───.gemini/\n├───.git/...\n├───.github/\n│ ├───CODEOWNERS\n│ ├───pull_request_template.md\n│ ├───actions/\n ...省略 " } ] }, { "role": "model", "parts": [{ "text": "Got it. Thanks for the context!" }] }, { "role": "user", "parts": [ { "text": "帮我分析一下@packages/core/ 中的代码,详细分析其技术架构、工作原理,并画出流程图" } ] }, { "role": "model", "parts": [ { "text": "好的,我们来深入分析一下 `@packages/core` 的代码。\n\n这个包是 Gemini CLI 的核心,负责处理与 Google AI 模型的交互、工具执行、状态管理和配置等关键任务。可以把它看作是连接用户输入(来自 `packages/cli`)和模型能力(Google AI)的“大脑”。\n\n首先,我需要浏览一下 `@packages/core` 的目录结构来更好地理解其内部模块。\n\n" } ] }, { "role": "model", "parts": [ { "functionCall": { "name": "list_directory", "args": { "path": "/Users/xxx/mycode/gemini-cli/packages/core/src/" } } } ] }, { "role": "user", "parts": [ { "functionResponse": { "id": "list_directory-1754016459561-12bef4b2f24e9", "name": "list_directory", "response": { "output": "Directory listing for /Users/xxx/mycode/gemini-cli/packages/core/src/:\n[DIR] __mocks__\n[DIR] code_assist\n[DIR] config\n[DIR] core\n[DIR] mcp\n[DIR] services\n[DIR] telemetry\n[DIR] tools\n[DIR] utils\nindex.test.ts\nindex.ts" } } } ] }, { "role": "model", "parts": [ { "text": "好的,我已经列出了 `@packages/core/src` 下的目录。现在我将阅读 `index.ts` 文件,以了解这个包对外暴露的 API 以及各个模块是如何集成的。这能让我对整个包的架构有一个宏观的认识。" } ] }, { "role": "model", "parts": [ { "functionCall": { "name": "read_file", "args": { "absolute_path": "/Users/xxx/mycode/gemini-cli/packages/core/src/index.ts" } } } ] }, { "role": "user", "parts": [ { "functionResponse": { "id": "read_file-1754016463341-b50c12f678538", "name": "read_file", "response": { "output": "/**\n * @license\n * Copyright 2025 Google LLC\n * SPDX-License-Identifier: Apache-2.0\n */\n\n// Export config\nexport * from './config/config.js';\n\n// Export Core Logic\nexport * from './core/client.js';\nexport * from './core/contentGenerator.js';\nexport * from './core/geminiChat.js';\nexport * from './core/logger.js';\nexport * from './core/prompts.js';\nexport * from './core/tokenLimits.js';\nexport * from './core/turn.js';\nexport * from './core/geminiRequest.js';\nexport * from './core/coreToolScheduler.js';\nexport * from './core/nonInteractiveToolExecutor.js';\n\n ...省略 " } } } ] }, { "role": "model", "parts": [ { "text": "好的,我已经分析了 `index.ts` 文件,对 `@packages/core` 模块的整体结构有了清晰的认识。这个包导出了大量功能,可以归纳为以下几个关键领域:\n\n1. **核心逻辑 (Core Logic):** 这似乎是包的核心,处理主要的聊天循环、与 Gemini 模型的交互以及工具执行。关键文件包括 `geminiChat.ts`、`coreToolScheduler.ts` 和 `geminiRequest.ts`。\n2. **工具 (Tools):** 大量代码用于定义和实现模型可以使用的各种工具。这包括文件系统操作(`read-file.js`、`write-file.js`、`ls.js`)、搜索(`grep.js`、`glob.js`)、Web 交互(`web-fetch.js`、`web-search.js`)等。`tool-registry.js` 可能在管理这些工具方面扮演着核心角色。\n3. **配置 (Configuration):** `config` 目录及其导出表明存在一个用于管理应用程序设置的系统。\n4. **服务 (Services):** 这包括用于文件发现(`fileDiscoveryService.js`)、Git 集成(`gitService.js`)和 IDE 上下文(`ideContext.js`)的服务,这些服务为模型提供上下文信息。\n5. **代码辅助 (Code Assist):** 这似乎是一个独立的功能集,与通过本地服务器和 OAuth 认证提供编码辅助有关。\n6. **MCP (Model-Context Protocol):** `mcp` 和 `MCPOAuth` 相关文件的存在表明实现了一种协议,供模型与其环境(可能与工具有关)进行交互。\n7. **工具集 (Utilities):** 用于各种任务的辅助函数集合,如路径操作、模式验证、错误处理等。\n8. **遥测 (Telemetry):** 用于收集使用数据和分析。\n\n现在,为了提供详细的技术架构和工作流程,我将按照以下计划进行:\n\n1. **分析核心聊天循环:** 我将首先检查 `geminiChat.ts`,以了解如何处理用户回合、如何调用模型以及如何调用工具。这将是工作流程的核心部分。\n2. **研究工具调度和执行:** 我将查看 `coreToolScheduler.ts` 和 `nonInteractiveToolExecutor.ts`,以了解如何决定使用工具以及如何执行工具的代码。\n3. **检查工具注册表:** 我将阅读 `tool-registry.js`,以了解工具是如何注册并提供给核心逻辑的。\n4. **综合信息并绘制流程图:** 根据收集到的信息,我将构建对架构的详细解释,并创建一个 Mermaid 流程图来可视化该过程。\n\n我将从阅读 `geminiChat.ts` 开始。" } ] }, { "role": "model", "parts": [ { "functionCall": { "name": "read_file", "args": { "absolute_path": "/Users/xxx/mycode/gemini-cli/packages/core/src/core/geminiChat.ts" } } } ] }, { "role": "user", "parts": [ { "functionResponse": { "id": "read_file-1754016476967-aa8fbc32d444e8", "name": "read_file", "response": { "output": "/**\n * @license\n * Copyright 2025 Google LLC\n * SPDX-License-Identifier: Apache-2.0\n */\n\n// DISCLAIMER: This is a copied version of https://github.com/googleapis/js-genai/blob/main/src/chats.ts with the intention of working around a key bug\n// where function responses are not treated as \"valid\" responses: https://b.corp.google.com/issues/420354090\n\nimport {\n GenerateContentResponse,\n Content,\n GenerateContentConfig,\n SendMessageParameters,\n createUserContent,\n Part,\n GenerateContentResponseUsageMetadata,\n Tool,\n} from '@google/genai';\n ...省略" } } } ] }, { "role": "model", "parts": [ { "text": "我分析了 `geminiChat.ts` 的代码,以下是我的发现:\n\n* **`GeminiChat` 类:** 这是管理与 Gemini 模型对话的主要类。它维护聊天记录,并提供向模型发送消息的方法。\n* **`sendMessage` 和 `sendMessageStream`:** 这是与模型交互的两个主要方法。`sendMessage` 发送单条消息并等待完整响应,而 `sendMessageStream` 返回一个异步生成器,该生成器在响应块可用时产生它们。\n* **历史记录管理:** 该类仔细管理对话历史记录,区分“综合”历史记录(所有回合)和“精选”历史记录(仅有效回合)。这对于确保模型接收到干净有效的上下文以生成响应非常重要。\n* **错误处理和重试:** 代码包含在出现瞬时错误(如 429 或 5xx 状态码)时使用退避重试 API 调用的逻辑。它还有一个机制,在出现持久性配额问题时回退到 Flash 模型。\n* **工具集成:** `setTools` 方法和 `sendMessage` 响应中对 `automaticFunctionCallingHistory` 的处理表明,该类负责将工具定义传递给模型,并处理模型使用这些工具的请求。\n* **日志/遥测:** 该类记录 API 请求、响应和错误,这对于调试和监控至关重要。\n\n既然我已经很好地理解了聊天循环,我将继续执行我的计划的下一步:分析工具调度和执行。我将从阅读 `coreToolScheduler.ts` 开始。" } ] }, { "role": "model", "parts": [ { "functionCall": { "name": "read_file", "args": { "absolute_path": "/Users/xxx/mycode/gemini-cli/packages/core/src/core/coreToolScheduler.ts" } } } ] }, { "role": "user", "parts": [ { "functionResponse": { "id": "read_file-1754016486197-275c5c5c12e9e", "name": "read_file", "response": { "output": "/**\n * @license\n * Copyright 2025 Google LLC\n * SPDX-License-Identifier: Apache-2.0\n */\n\nimport {\n ToolCallRequestInfo,\n ToolCallResponseInfo,\n ToolConfirmationOutcome,\n Tool,\n ToolCallConfirmationDetails,\n ToolResult,\n ToolResultDisplay,\n ToolRegistry,\n ApprovalMode,\n EditorType,\n Config,\n logToolCall,\n ToolCallEvent,\n ToolConfirmationPayload,\n} from '../index.js';\nimport { Part, PartListUnion } from '@google/genai';\nimport { getResponseTextFromParts } from '../utils/generateContentResponseUtilities.js';\n ...省略" } } } ] }, { "role": "model", "parts": [ { "text": "`coreToolScheduler.ts` 的分析证实了我的猜想,并提供了关于工具生命周期的更多细节。以下是主要发现:\n\n* **`CoreToolScheduler` 类:** 这个类是协调工具执行的核心。它接收来自模型的工具调用请求,并通过一个明确定义的生命周期来管理它们。\n* **工具调用生命周期:** 工具调用会经历多个状态:`validating`、`awaiting_approval`、`scheduled`、`executing`、`success`、`error` 和 `cancelled`。这个状态机确保了工具调用被正确处理,包括用户批准和错误处理。\n* **用户批准流程:** 调度器支持一个批准模式 (`approvalMode`),允许在执行工具之前需要用户确认。这是通过 `shouldConfirmExecute` 方法和 `ConfirmHandler` 实现的,这对于安全性至关重要。\n* **可修改的工具:** 代码中提到了“可修改的工具”,这表明一些工具(比如文件编辑工具)允许用户在执行前修改参数。这是通过与编辑器集成来实现的。\n* **执行和结果处理:** 一旦工具调用被批准并调度,调度器就会执行它。然后,它会将工具的输出(无论是成功还是错误)格式化为模型可以理解的 `functionResponse`。\n* **与 `ToolRegistry` 的集成:** 调度器依赖于 `ToolRegistry` 来查找和获取由其名称指定的实际工具实现。\n\n现在,我已经对聊天循环和工具调度有了很好的理解。最后一步是检查 `tool-registry.js`,以了解工具是如何被发现和注册的。" } ] }, { "role": "model", "parts": [ { "functionCall": { "name": "read_file", "args": { "absolute_path": "/Users/xxx/mycode/gemini-cli/packages/core/src/tools/tool-registry.ts" } } } ] }, { "role": "user", "parts": [ { "functionResponse": { "id": "read_file-1754016492794-b8739c79ea24e", "name": "read_file", "response": { "output": "/**\n * @license\n * Copyright 2025 Google LLC\n * SPDX-License-Identifier: Apache-2.0\n */\n\nimport { FunctionDeclaration, Schema, Type } from '@google/genai';\nimport { Tool, ToolResult, BaseTool, Icon } from './tools.js';\nimport { Config } from '../config/config.js';\nimport { spawn } from 'node:child_process';\nimport { StringDecoder } from 'node:string_decoder';\nimport { discoverMcpTools } from './mcp-client.js';\nimport { DiscoveredMCPTool } from './mcp-tool.js';\n ...省略" } } } ] } ]}