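[DeepSeek Chat] Deploy DeepSeek Locally in Five Minutes

Preface

The goal here is a chat that replies based on the conversation context; a simple screenshot of the result is shown below.

I. Introduction to Ollama and DeepSeek-R1

1. What is Ollama?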

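Ollama is an open-source tool for deploying large models locally. It supports one-command installation, management, and running of many large language models. Its core advantages are:

- A Docker-like model management experience (create, load, run)
- Support for Windows, macOS, and Linux
- Everything can be done from the command line
- A default API endpoint (port 11434) that makes it easy to integrate with other applications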

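2. Features of the DeepSeek-R1:7B model

As a prominent open-source model from China, DeepSeek-R1:7B performs well at code generation and Chinese language understanding:

- 7 billion parameters, balancing performance against resource usage
- Support for very long contexts of up to 128K tokens
- Specifically optimized for Chinese semantic understanding
- Suited to technical Q&A, document summarization, and similar scenarios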

II. Ollama Installation Guide (Mac & Windows)

🍎 Installation steps for Mac users

Method 1: Install with Homebrew (recommended)

# Install Homebrew (skip if already installed)
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Install Ollama via Homebrew
brew install ollama

# Start the server in the background
ollama serve &

# Verify the installation
ollama --version

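Method 2: Manual installation

Visit the download page on the official site: ollama.com/download/ma…

Download Ollama-darwin.zip and unzip it

Drag Ollama into the Applications folder

On first run, authorize it in the terminal: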

xattr -d com.apple.quarantine /Applications/Ollama.app

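⊞ Installation steps for Windows users

Download the installer from the official site: ollama.com/download

Double-click OllamaSetup.exe to run it

Complete the installation with the default settings (no need to configure environment variables manually)

Verify the installation: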

ollama --version

III. Deploying the DeepSeek-R1:7B Model

1. Install the model with one command

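# Run in a terminal (the command is the same on Mac and Windows)
ollama run deepseek-r1:7b

The first run downloads the model automatically (about 4.7 GB).

The terminal shows the download progress: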

pulling manifest ███████████████ 100%
pulling 96c415656d37... ███████████████ 4.7GB
...
writing manifest
success

2. Verify the model installation

ollama list

NAME              SIZE
deepseek-r1:7b    4.7GB

IV. Testing the Model and Ways to Interact

1. API test

curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:7b",
  "prompt": "你好，你是谁",
  "stream": false
}'

# Response structure
{
    "model": "deepseek-r1:7b",
    "created_at": "2025-07-10T02:47:51.018357Z",
    "response": "<think>\n我是DeepSeek-R1，一个由深度求索公司开发的智能助手，我会尽我所能为您提供帮助。\n</think>\n\n我是DeepSeek-R1，一个由深度求索公司开发的智能助手，我会尽我所能为您提供帮助。",
    "done": true,
    "done_reason": "stop",
    "context": [
        151644,
        108386,
        ……
    ],
    "total_duration": 3981124167,
    "load_duration": 2727482917,
    "prompt_eval_count": 7,
    "prompt_eval_duration": 185178833,
    "eval_count": 53,
    "eval_duration": 1064137792
}
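The same call can be made from plain Java. The following is only a minimal sketch, assuming Java 11+ and an Ollama server on the default port; the class name `OllamaGenerateSketch` and the helpers `buildBody` and `stripThinking` are made up for illustration. It also shows one way to drop the `<think>...</think>` block that DeepSeek-R1 puts at the start of the `response` field.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.regex.Pattern;

public class OllamaGenerateSketch {

    // Matches one <think>...</think> block, including line breaks inside it.
    private static final Pattern THINK_BLOCK = Pattern.compile("(?s)<think>.*?</think>");

    /** Hypothetical helper: removes the reasoning block(s) and trims the rest. */
    static String stripThinking(String response) {
        return THINK_BLOCK.matcher(response).replaceAll("").trim();
    }

    /**
     * Hypothetical helper: builds the same JSON body as the curl call.
     * It escapes only backslashes and double quotes, which is enough for
     * simple one-line prompts; use a JSON library for anything else.
     */
    static String buildBody(String model, String prompt) {
        String escaped = prompt.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"model\":\"" + model + "\",\"prompt\":\"" + escaped + "\",\"stream\":false}";
    }

    public static void main(String[] args) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create("http://localhost:11434/api/generate"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(buildBody("deepseek-r1:7b", "你好，你是谁")))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // Prints the raw JSON document; its "response" field still contains the <think> block.
        System.out.println(response.body());
    }
}
```

`stripThinking` is meant to be applied to the decoded `response` field after the JSON has been parsed, for example with Jackson as in the service in section V.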

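V. Chatting with the Model in Spring Boot

1. Add the dependency for streaming support

Today's AI chat interfaces display the answer piece by piece as it is generated, which is why this streaming dependency is needed.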

<!-- WebFlux (SSE streaming support) -->
<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-webflux</artifactId>
</dependency>

2. Core service method

package com.spring.ldj.ollama;

import cn.hutool.json.JSONUtil;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.commons.lang3.StringUtils;
import org.springframework.http.MediaType;
import org.springframework.stereotype.Service;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Flux;

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

@Service
public class OllamaChatService {

    private final WebClient webClient = WebClient.builder().build();
    private final ObjectMapper objectMapper = new ObjectMapper();

    // Sessions whose reply is still inside the <think> block.
    // A ConcurrentHashMap, because several sessions may stream at the same time.
    private final Map<String, Boolean> isThinking = new ConcurrentHashMap<>();

    // Whether to echo the model's reasoning to the client
    private static final boolean SHOW_THINKING = false;

    // Suited to single-turn answers
    // private static final String OLLAMA_URL = "http://localhost:11434/api/generate";
    // Suited to chat mode: replies are generated from the conversation context
    private static final String OLLAMA_URL = "http://localhost:11434/api/chat";

    public Flux<String> chat(String sessionId, String userInput) {
        // 1. Add the user message to the context
        ChatContextManager.addMessage(sessionId, "user", userInput);
        List<Map<String, String>> fullContext = ChatContextManager.getContext(sessionId);
        for (Map<String, String> context : fullContext) {
            System.out.println(JSONUtil.parse(context));
        }

        // 2. Build the Ollama request body
        Map<String, Object> request = new HashMap<>();
        request.put("model", "deepseek-r1:7b");
        request.put("messages", fullContext);
        request.put("stream", true); // enable streaming
        request.put("options", Map.of(
                "temperature", 0.7,  // controls randomness
                "num_ctx", 4096      // context window size
        ));

        StringBuilder sb = new StringBuilder(); // accumulates the streamed reply

        // The reply starts inside the <think> block
        isThinking.put(sessionId, true);

        return webClient.post()
                .uri(OLLAMA_URL)
                .contentType(MediaType.APPLICATION_JSON)
                .bodyValue(request)
                .retrieve()
                .bodyToFlux(String.class) // receive the text stream, one JSON object per line
                .map(chunk -> parseResponseChunk(chunk, sessionId)) // parse each chunk
                .doOnNext(content -> {
                    if (content.equals("[DONE]")) {
                        // At the end, save the complete reply (without the marker) to the context
                        ChatContextManager.addMessage(sessionId, "assistant", sb.toString());
                    } else if (StringUtils.isNotEmpty(content)) {
                        sb.append(content);
                    }
                });
    }

    private String parseResponseChunk(String json, String sessionId) {
        try {
            System.out.println("parseResponseChunk: " + json);
            JsonNode node = objectMapper.readTree(json);

            // Check the end flag first, so the marker is sent even if </think> never arrived
            if (node.path("done").asBoolean(false)) {
                isThinking.remove(sessionId);
                return "[DONE]";
            }

            String content = node.path("message").path("content").asText("");
            if (!SHOW_THINKING) {
                // Relies on </think> arriving as a chunk of its own
                if (content.equals("</think>")) {
                    isThinking.remove(sessionId);
                    return "";
                }
                if (isThinking.containsKey(sessionId)) {
                    return "";
                }
            }
            return content;
        } catch (JsonProcessingException e) {
            throw new RuntimeException("JSON parsing error", e);
        }
    }
}
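The service calls `ChatContextManager.addMessage(...)` and `ChatContextManager.getContext(...)`, but the post does not show that class. The following is a hypothetical in-memory sketch that matches those two calls; the cap of 20 messages and the extra `clear` method are assumptions, not part of the original project.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ChatContextManager {

    // Assumed cap on stored messages per session, so the prompt stays within num_ctx.
    private static final int MAX_MESSAGES = 20;

    // sessionId -> ordered message history
    private static final Map<String, List<Map<String, String>>> SESSIONS = new ConcurrentHashMap<>();

    /** Appends one message and drops the oldest ones once the cap is exceeded. */
    public static void addMessage(String sessionId, String role, String content) {
        List<Map<String, String>> history =
                SESSIONS.computeIfAbsent(sessionId, id -> new ArrayList<>());
        synchronized (history) {
            history.add(Map.of("role", role, "content", content));
            while (history.size() > MAX_MESSAGES) {
                history.remove(0);
            }
        }
    }

    /** Returns a snapshot copy, safe to serialize while other threads keep appending. */
    public static List<Map<String, String>> getContext(String sessionId) {
        List<Map<String, String>> history = SESSIONS.get(sessionId);
        if (history == null) {
            return List.of();
        }
        synchronized (history) {
            return new ArrayList<>(history);
        }
    }

    /** Forgets a session, for example when the user starts a new chat. */
    public static void clear(String sessionId) {
        SESSIONS.remove(sessionId);
    }
}
```

Because the history lives in memory, it is lost on restart and is not shared between application instances; a real deployment would probably need a persistent or distributed store.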

3. Controller endpoints

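// DateUtil is assumed to be Hutool's; CommonResult is the project's own response wrapper.
import cn.hutool.core.date.DateUtil;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.servlet.mvc.method.annotation.SseEmitter;

import java.io.IOException;
import java.util.Date;

@RestController
@RequestMapping("/api/ai")
public class AiController {

    @Autowired
    private OllamaChatService ollamaChatService;

    @GetMapping("/getSessionId")
    public CommonResult<String> getSessionId() {
        String response = DateUtil.format(new Date(), "yyyyMMddHHmmss");
        return CommonResult.ok(response);
    }

    // SSE streaming endpoint
    @GetMapping(value = "/ask")
    public SseEmitter ask1(@RequestParam String content, @RequestParam String sessionId) {
        SseEmitter emitter = new SseEmitter(30_000L); // 30-second timeout
        ollamaChatService.chat(sessionId, content)
                .subscribe(
                        chunk -> {
                            try {
                                emitter.send(SseEmitter.event().data(chunk)); // send one chunk
                            } catch (IOException e) {
                                throw new RuntimeException(e);
                            }
                        },
                        emitter::completeWithError, // error handling
                        emitter::complete           // end of stream
                );
        return emitter;
    }
}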
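The `getSessionId` endpoint above derives the ID from a timestamp with one-second resolution, so two clients that open the page within the same second would get the same ID and share one conversation context. One possible alternative, shown here only as a sketch with a made-up class name `SessionIds`, is a random UUID:

```java
import java.util.UUID;

public class SessionIds {

    /** Returns a random 32-character hex ID; collisions are practically impossible. */
    static String newSessionId() {
        return UUID.randomUUID().toString().replace("-", "");
    }

    public static void main(String[] args) {
        System.out.println(newSessionId());
    }
}
```

This swaps the original timestamp scheme for a different technique, so the IDs no longer show when a session was created.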

4. Frontend

<template>
  <div class="chat-container">
    <div class="chat-history" ref="history">
      <!-- Message containers get an alignment class -->
      <div
        v-for="msg in messages"
        :key="msg.id"
        :class="['message', msg.type]"
      >
        <div class="message-bubble">
          {{ msg.content }}
        </div>
      </div>
      <div v-if="isLoading" class="loading">Answering...</div>
    </div>

    <div class="input-area">
      <textarea
        v-model="inputContent"
        @keyup.enter="fetchStreamResponse"
        placeholder="Type your question..."
      ></textarea>
      <button
        @click="fetchStreamResponse"
        :disabled="isLoading"
      >
        {{ isLoading ? 'Sending...' : 'Send' }}
      </button>
    </div>
  </div>
</template>

<script>
export default {
  data() {
    return {
      inputContent: '',
      aiResponse: "",
      messages: [],
      isLoading: false,
      chatHistory: [],
      sessionId: '',
      error: ''
    }
  },
  methods: {
    async fetchStreamResponse() {
      // Ignore empty input and repeated submits while a reply is streaming
      if (!this.inputContent.trim() || this.isLoading) {
        return;
      }

      // Add the user message
      this.messages.push({
        id: Date.now(),
        type: "user",
        content: this.inputContent
      });

      // Initialize the AI message
      const aiMsgId = Date.now() + 1;
      this.messages.push({
        id: aiMsgId,
        type: "ai",
        content: "" // initially empty
      });

      // Open the SSE connection; the path must match the controller's /ask mapping
      console.log('Opening SSE connection', this.inputContent);
      const eventSource = new EventSource(
        `/ldj001/api/ai/ask?sessionId=${this.sessionId}&content=${encodeURIComponent(this.inputContent)}`,
        { withCredentials: true } // send credentials on cross-origin requests
      );
      this.inputContent = '';
      this.isLoading = true;
      this.aiResponse = ""; // reset the response text

      eventSource.onmessage = (event) => {
        console.log(event);
        const chunk = event.data;

        // End of stream
        if (chunk === "[DONE]") {
          eventSource.close(); // close the connection ourselves
          this.isLoading = false;
          return; // skip further processing
        }

        this.aiResponse += chunk;

        // Update the AI message content
        const aiMsgIndex = this.messages.findIndex(msg => msg.id === aiMsgId);
        if (aiMsgIndex !== -1) {
          this.messages[aiMsgIndex].content = this.aiResponse;
        }
        this.scrollToBottom(); // scroll to the bottom automatically
      };

      eventSource.onerror = (error) => {
        console.error("Stream error:", error);
        eventSource.close();
        this.isLoading = false; // re-enable the send button after a failure
      };
    },
    scrollToBottom() {
      this.$nextTick(() => { // run after the DOM has updated
        const container = this.$refs.history;
        container.scrollTop = container.scrollHeight;
      });
    }
  },
  async created() {
    try {
      this.isLoading = true
      const response = await fetch('/ldj001/api/ai/getSessionId', {
        method: 'GET',
        headers: {
          'Content-Type': 'application/json'
        }
      })
      if (!response.ok) {
        throw new Error(`Request failed: ${response.status}`)
      }
      const rawText = await response.text() // read the raw text first
      console.log('Raw response:', rawText) // inspect the actual content
      const data = rawText ? JSON.parse(rawText) : null
      this.sessionId = data.content
    } catch (err) {
      console.error('Failed to get sessionId:', err)
      this.error = err.message || 'Failed to get sessionId'
    } finally {
      this.isLoading = false
    }
  }
}
</script>

<style scoped>
/* Basic layout */
.chat-container {
  display: flex;
  flex-direction: column;
  height: 100vh;
}
.chat-history {
  flex: 1;
  overflow-y: auto;
  padding: 20px;
  display: flex;
  flex-direction: column;
}
/* Common message styles */
.message {
  margin-bottom: 15px;
  max-width: 80%; /* limit the maximum width */
  align-self: flex-start; /* left-aligned by default */
}
.message-bubble {
  padding: 12px 16px;
  border-radius: 18px;
  display: inline-block;
  word-wrap: break-word;
  white-space: pre-line; /* preserve line breaks */
}
/* User messages are right-aligned */
.message.user {
  align-self: flex-end; /* right-aligned */
}
.message.user .message-bubble {
  background-color: #dcf8c6; /* user bubble color */
  border-bottom-right-radius: 4px; /* bubble tail effect */
}
/* AI messages are left-aligned */
.message.ai .message-bubble {
  background-color: #f0f0f0; /* AI bubble color */
  border-bottom-left-radius: 4px; /* bubble tail effect */
}
/* Input area */
.input-area {
  padding: 15px;
  border-top: 1px solid #eee;
  display: flex;
  gap: 10px;
}
textarea {
  flex: 1;
  padding: 12px;
  border: 1px solid #ddd;
  border-radius: 20px;
  resize: none;
  height: 50px;
}
button {
  padding: 0 20px;
  border: none;
  border-radius: 20px;
  background: #0084ff;
  color: white;
  cursor: pointer;
}
</style>
