V2EX, July 21, 11:18
[Programmers] Help: importing data into Milvus crashes the database. Is my configuration the problem?

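This document shows a Docker Compose configuration for Milvus in standalone mode, covering the etcd, MinIO, and Milvus services. It also includes an excerpt of the Milvus runtime logs. The logs show etcd connection failures: the StreamingNode, QueryNode, MixCoord, Proxy, and DataNode components all disconnect from etcd, and errors such as channel not exist appear alongside. These problems can leave Milvus unstable or unable to run, and they call for further investigation of etcd's health and of the communication between Milvus components.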

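💡 **Milvus standalone deployment configuration**: The post provides a Docker Compose file for Milvus v2.6.0-rc1 that defines the containers for three core services: etcd, MinIO, and Milvus. etcd runs v3.5.18 with auto-compaction, backend quota, and snapshot-count parameters set. MinIO provides object storage and is configured with access keys and ports. The Milvus standalone service points at the etcd and MinIO addresses, mounts a configuration file and a data volume, and is allotted a large amount of memory and CPU, which suggests a high performance requirement.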
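The etcd quota and request-size settings are given as raw byte counts, which are hard to read at a glance. As a small sketch, the following converts the two values from the compose file into binary units; the helper name `to_binary_units` is hypothetical and only used for illustration.

```python
def to_binary_units(n_bytes):
    """Render a byte count using 1024-based units (KiB, MiB, GiB, TiB)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    value = float(n_bytes)
    for unit in units:
        # Stop at the first unit where the value drops below 1024,
        # or at the largest unit we know about.
        if value < 1024 or unit == units[-1]:
            return f"{value:g} {unit}"
        value /= 1024


# Values copied from the etcd service's environment section.
settings = {
    "ETCD_QUOTA_BACKEND_BYTES": 8589934592,
    "ETCD_MAX_REQUEST_BYTES": 33554432,
}

readable = {name: to_binary_units(value) for name, value in settings.items()}
for name, text in readable.items():
    print(f"{name} = {text}")
```

So the backend quota is 8 GiB and the maximum request size is 32 MiB.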

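⚠️ **etcd connection failures interrupt the service**: The logs show several key Milvus components (QueryNode, MixCoord, Proxy, and StreamingNode) failing to keep their etcd lease alive ('etcdserver: requested lease not found'), disconnecting, and then exiting the process. etcd's stability and availability are therefore the foundation Milvus runs on: any etcd problem quickly affects the stability of the whole Milvus deployment.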
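To make the 'requested lease not found' message more concrete, here is a toy model of lease keepalive. It is not etcd's implementation: `ToyLeaseServer`, the 10-second TTL, and the timings are all hypothetical. It only illustrates the general rule that a lease must be renewed before its TTL runs out, and that a keepalive arriving after expiry finds no lease to renew. The post does not establish why the keepalives were late or lost in this case.

```python
from dataclasses import dataclass


class LeaseNotFound(Exception):
    """Raised when a keepalive targets a lease that no longer exists."""


@dataclass
class Lease:
    lease_id: int
    ttl: float         # seconds the lease stays valid after each renewal
    expires_at: float  # absolute time at which the lease lapses


class ToyLeaseServer:
    """Hypothetical stand-in for a lease service; time is passed in explicitly."""

    def __init__(self):
        self._leases = {}
        self._next_id = 1

    def grant(self, ttl, now):
        lease = Lease(self._next_id, ttl, now + ttl)
        self._leases[lease.lease_id] = lease
        self._next_id += 1
        return lease.lease_id

    def keep_alive(self, lease_id, now):
        lease = self._leases.get(lease_id)
        if lease is not None and now >= lease.expires_at:
            # The lease lapsed before this keepalive arrived: drop it.
            del self._leases[lease_id]
            lease = None
        if lease is None:
            raise LeaseNotFound("requested lease not found")
        lease.expires_at = now + lease.ttl
        return lease.expires_at


server = ToyLeaseServer()
lease_id = server.grant(ttl=10.0, now=0.0)

# A keepalive inside the TTL window extends the lease.
renewed_until = server.keep_alive(lease_id, now=5.0)

# A keepalive after a long stall arrives too late.
try:
    server.keep_alive(lease_id, now=40.0)
    expired = False
except LeaseNotFound:
    expired = True

print(renewed_until, expired)
```

In the logs, each component reacts to this condition by logging "disconnected from etcd, process will exit".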

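🔥 **Streaming pipeline errors and DataNode failure**: The logs contain 'STREAMING_CODE_CHANNEL_FENCED' and 'STREAMING_CODE_CHANNEL_NOT_EXIST' errors, specifically in `timetick_sync_operator` and `handler_client_impl`. This indicates abnormal channel state at the streaming layer, possibly because the etcd connection problem caused service registrations or heartbeats to be lost. That in turn would disrupt communication and data synchronization between the DataNode and the StreamingNode, and the DataNode ultimately exited as well after disconnecting from etcd.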

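🚀 **Resource allocation and potential bottlenecks**: The host (4 TB of memory, 256 CPUs) and the Milvus container's limits (`mem_limit: 1024g`, `cpus: 32.0`) are ample. The connection and channel errors in the logs suggest that the root cause is not a resource shortage but a communication failure or misconfiguration between services, particularly in how etcd is managed and how the Milvus components coordinate.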
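As a quick sanity check on the resource claim, the container limits from the compose file in this post can be summed and compared with the host. This sketch assumes that "4T" means 4 TiB and that Docker reads the `g` suffix in `mem_limit` as GiB; it says nothing about actual usage at the time of the crash.

```python
# Limits copied from the compose file in this post (mem_limit in GiB, cpus as given).
limits = {
    "etcd": {"mem_gib": 16, "cpus": 4.0},
    "minio": {"mem_gib": 16, "cpus": 4.0},
    "standalone": {"mem_gib": 1024, "cpus": 32.0},
}

HOST_MEM_GIB = 4 * 1024  # assumption: "4T" of memory means 4 TiB
HOST_CPUS = 256

total_mem = sum(service["mem_gib"] for service in limits.values())
total_cpus = sum(service["cpus"] for service in limits.values())

mem_share = total_mem / HOST_MEM_GIB
cpu_share = total_cpus / HOST_CPUS

print(f"memory limits: {total_mem} GiB ({mem_share:.1%} of host)")
print(f"cpu limits: {total_cpus} ({cpu_share:.1%} of host)")
```

Under these assumptions the three containers are capped at roughly a quarter of host memory and about 16% of host CPUs combined, with etcd itself limited to 16 GiB and 4 CPUs.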

Data volume: 60 million

Milvus docker compose:

    services:
      etcd:
        container_name: milvus-etcd
        image: quay.io/coreos/etcd:v3.5.18
        environment:
          - ETCD_AUTO_COMPACTION_MODE=revision
          - ETCD_AUTO_COMPACTION_RETENTION=1000
          - ETCD_QUOTA_BACKEND_BYTES=8589934592
          - ETCD_SNAPSHOT_COUNT=50000
          - ETCD_MAX_REQUEST_BYTES=33554432
        volumes:
          - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/etcd:/etcd
        command: etcd -advertise-client-urls=http://etcd:2379 -listen-client-urls http://0.0.0.0:2379 --data-dir /etcd
        healthcheck:
          test: ["CMD", "etcdctl", "endpoint", "health"]
          interval: 30s
          timeout: 20s
          retries: 3
        ulimits:
          nofile:
            soft: 655360
            hard: 655360
        mem_limit: 16g
        cpus: 4.0
        logging:
          driver: "json-file"
          options:
            max-size: "100m"
            max-file: "3"
      minio:
        container_name: milvus-minio
        image: minio/minio:RELEASE.2024-05-28T17-19-04Z
        environment:
          MINIO_ACCESS_KEY: xxxxx
          MINIO_SECRET_KEY: xxxxx
        ports:
          - "9001:9001"
          - "9000:9000"
        volumes:
          - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/minio:/minio_data
        command: minio server /minio_data --console-address ":9001"
        healthcheck:
          test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
          interval: 30s
          timeout: 20s
          retries: 3
        ulimits:
          nofile:
            soft: 655360
            hard: 655360
        mem_limit: 16g
        cpus: 4.0
        logging:
          driver: "json-file"
          options:
            max-size: "100m"
            max-file: "3"
      standalone:
        container_name: milvus
        image: milvusdb/milvus:v2.6.0-rc1
        command: ["milvus", "run", "standalone"]
        security_opt:
        - seccomp:unconfined
        environment:
          ETCD_ENDPOINTS: etcd:2379
          MINIO_ADDRESS: minio:9000
          MQ_TYPE: woodpecker
        volumes:
          - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/_milvus:/var/lib/milvus
          - ./milvus.yaml:/milvus/configs/milvus.yaml
        healthcheck:
          test: ["CMD", "curl", "-f", "http://localhost:9091/healthz"]
          interval: 30s
          start_period: 90s
          timeout: 20s
          retries: 3
        ports:
          - "xxxx:19530"
          - "xxxx:9091"
        depends_on:
          - "etcd"
          - "minio"
        ulimits:
          nofile:
            soft: 655360
            hard: 655360
        mem_limit: 1024g
        cpus: 32.0
        logging:
          driver: "json-file"
          options:
            max-size: "100m"
            max-file: "3"
    networks:
      default:
        name: milvus

Machine specs: 4 TB of memory, 256 CPUs

milvus.yaml: https://raw.githubusercontent.com/milvus-io/milvus/v2.6.0-rc1/configs/milvus.yaml

Partial logs:

    milvus        | [2025/07/18 15:41:28.119 +00:00] [WARN] [timetick/timetick_sync_operator.go:85] ["send time tick sync message failed"] [module=streamingnode] [component=timetick-sync] [pchannel=by-dev-rootcoord-dml_8:rw@3] [error="append time tick msg to wal failed, timestamp: 459499972358307846, previous message counter: 8: code: STREAMING_CODE_CHANNEL_FENCED, cause: by-dev-rootcoord-dml_8:rw@3 fenced"]
    milvus        | [2025/07/18 15:41:28.119 +00:00] [WARN] [timetick/timetick_sync_operator.go:85] ["send time tick sync message failed"] [module=streamingnode] [component=timetick-sync] [pchannel=by-dev-rootcoord-dml_14:rw@3] [error="append time tick msg to wal failed, timestamp: 459499972358307848, previous message counter: 8: code: STREAMING_CODE_CHANNEL_FENCED, cause: by-dev-rootcoord-dml_14:rw@3 fenced"]
    milvus        | [2025/07/18 15:41:28.119 +00:00] [WARN] [timetick/timetick_sync_operator.go:85] ["send time tick sync message failed"] [module=streamingnode] [component=timetick-sync] [pchannel=by-dev-rootcoord-dml_7:rw@3] [error="append time tick msg to wal failed, timestamp: 459499972358307847, previous message counter: 8: code: STREAMING_CODE_CHANNEL_FENCED, cause: by-dev-rootcoord-dml_7:rw@3 fenced"]
    milvus        | [2025/07/18 15:41:28.120 +00:00] [WARN] [sessionutil/session_util.go:593] ["fail to retry keepAliveOnce"] [serverName=querynode] [LeaseID=7587888197442225626] [error="etcdserver: requested lease not found"]
    milvus        | [2025/07/18 15:41:28.121 +00:00] [ERROR] [querynodev2/server.go:188] ["Query Node disconnected from etcd, process will exit"] ["Server Id"=2] [stack="github.com/milvus-io/milvus/internal/querynodev2.(*QueryNode).Register.func1\n\t/workspace/source/internal/querynodev2/server.go:188"]
    milvus        | [2025/07/18 15:41:28.121 +00:00] [WARN] [sessionutil/session_util.go:593] ["fail to retry keepAliveOnce"] [serverName=mixcoord] [LeaseID=7587888197442225598] [error="etcdserver: requested lease not found"]
    milvus        | [2025/07/18 15:41:28.122 +00:00] [ERROR] [coordinator/mix_coord.go:107] ["MixCoord disconnected from etcd, process will exit"] [serverID=2] [stack="github.com/milvus-io/milvus/internal/coordinator.(*mixCoordImpl).Register.(*mixCoordImpl).Register.func1.func3\n\t/workspace/source/internal/coordinator/mix_coord.go:107"]
    milvus        | [2025/07/18 15:41:28.122 +00:00] [WARN] [sessionutil/session_util.go:593] ["fail to retry keepAliveOnce"] [serverName=proxy] [LeaseID=7587888197442225923] [error="etcdserver: requested lease not found"]
    milvus        | [2025/07/18 15:41:28.122 +00:00] [ERROR] [proxy/proxy.go:181] ["Proxy disconnected from etcd, process will exit"] ["Server Id"=2] [stack="github.com/milvus-io/milvus/internal/proxy.(*Proxy).Register.func1\n\t/workspace/source/internal/proxy/proxy.go:181"]
    milvus        | [2025/07/18 15:41:28.122 +00:00] [WARN] [handler/handler_client_impl.go:178] ["create handler failed"] [pchannel=by-dev-rootcoord-dml_10] [handler=producer] [assignment=by-dev-rootcoord-dml_10:rw@3>2@172.23.0.4:22222] [error="/milvus.proto.streaming.StreamingNodeHandlerService/Produce; streaming error: code = STREAMING_CODE_CHANNEL_NOT_EXIST, cause = by-dev-rootcoord-dml_10 not exist; rpc error: code = FailedPrecondition, desc = "]
    milvus        | [2025/07/18 15:41:28.123 +00:00] [INFO] [handler/handler_client_impl.go:183] ["report assignment error"] [pchannel=by-dev-rootcoord-dml_10] [handler=producer] [assignmentError="/milvus.proto.streaming.StreamingNodeHandlerService/Produce; streaming error: code = STREAMING_CODE_CHANNEL_NOT_EXIST, cause = by-dev-rootcoord-dml_10 not exist; rpc error: code = FailedPrecondition, desc = "] []
    milvus        | [2025/07/18 15:41:28.120 +00:00] [ERROR] [streamingnode/service.go:389] ["StreamingNode disconnected from etcd, process will exit"] ["Server Id"=2] [stack="github.com/milvus-io/milvus/internal/distributed/streamingnode.(*Server).registerSessionToETCD.func1\n\t/workspace/source/internal/distributed/streamingnode/service.go:389"]
    milvus        | [2025/07/18 15:41:28.120 +00:00] [ERROR] [datanode/data_node.go:200] ["Data Node disconnected from etcd, process will exit"] ["Server Id"=2] [stack="github.com/milvus-io/milvus/internal/datanode.(*DataNode).Register.func1\n\t/workspace/source/internal/datanode/data_node.go:200"]
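When sifting through a dump like this, it can help to pull out the timestamp, level, source location, and message of each entry and tally them. The following is a minimal sketch that assumes every entry follows the bracketed layout seen above; the regular expression and the names `ENTRY` and `parse_entries` are illustrative, not part of any Milvus tooling. The sample text is two entries copied from the logs in this post, deliberately left run together on one line.

```python
import re
from collections import Counter

# One entry: [timestamp] [LEVEL] [file.go:line] ["message"] followed by optional fields.
# \s+ between parts tolerates entries that were wrapped or run together.
ENTRY = re.compile(
    r'\[(?P<ts>\d{4}/\d{2}/\d{2}\s+\d{2}:\d{2}:\d{2}\.\d{3}\s+[+-]\d{2}:\d{2})\]\s+'
    r'\[(?P<level>[A-Z]+)\]\s+'
    r'\[(?P<source>[^\]]+)\]\s+'
    r'\["(?P<message>[^"]*)"\]'
)


def parse_entries(text):
    """Return one dict per log entry found anywhere in the text."""
    return [match.groupdict() for match in ENTRY.finditer(text)]


SAMPLE = (
    'milvus        | [2025/07/18 15:41:28.120 +00:00] [WARN] '
    '[sessionutil/session_util.go:593] ["fail to retry keepAliveOnce"] '
    '[serverName=querynode] [LeaseID=7587888197442225626] '
    '[error="etcdserver: requested lease not found"]'
    'milvus        | [2025/07/18 15:41:28.121 +00:00] [ERROR] '
    '[querynodev2/server.go:188] '
    '["Query Node disconnected from etcd, process will exit"] ["Server Id"=2]'
)

entries = parse_entries(SAMPLE)
levels = Counter(entry["level"] for entry in entries)
exits = [
    entry["message"]
    for entry in entries
    if "disconnected from etcd" in entry["message"]
]

print(dict(levels))
for message in exits:
    print(message)
```

Run over the full excerpt, the same filter would list every component that reported losing its etcd session, which makes it easier to see that the exits cluster within a few milliseconds of each other.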
