MarkTechPost@AI · February 3
Creating a Medical Question-Answering Chatbot Using Open-Source BioMistral LLM, LangChain, Chroma’s Vector Storage, and RAG: A Step-by-Step Guide

This article walks through building a PDF-based medical question-answering chatbot with the open-source BioMistral LLM, LangChain, Chroma vector storage, and RAG. It explains how to process PDF documents, split them into text chunks, encode the chunks with Hugging Face embeddings, and store them in a Chroma vector database for efficient retrieval. Finally, a RAG system integrates the retrieved context into the chatbot's responses, ensuring answers are clear and authoritative. This approach makes it possible to rapidly process large volumes of medical PDFs and deliver context-rich, accurate, easy-to-understand insights.

🗂️ Load PDF documents with LangChain's PyPDFDirectoryLoader and split them into smaller text chunks with RecursiveCharacterTextSplitter for easier processing.

🧠 Convert the text chunks into numerical vectors with Hugging Face embeddings, capturing deep semantic relationships, and store the vectors in a Chroma vector database for efficient retrieval.

🔗 Use a Retrieval-Augmented Generation (RAG) system to integrate the retrieved context into the chatbot's responses, ensuring answers are accurate and relevant.

🤖 Initialize the BioMistral-7B model with LlamaCpp, configuring parameters such as temperature, maximum token count, and top_p to control text generation.

💬 Build a RAG chain that combines the retriever, a custom prompt, the LLM, and an output parser, so a user's question triggers retrieval of relevant information from the PDF documents and generation of a clear, readable answer.

In this tutorial, we’ll build a powerful, PDF-based question-answering chatbot tailored for medical or health-related content. We’ll leverage the open-source BioMistral LLM and LangChain’s flexible data orchestration capabilities to process PDF documents into manageable text chunks. We’ll then encode these chunks using Hugging Face embeddings, capturing deep semantic relationships and storing them in a Chroma vector database for high-efficiency retrieval. Finally, by employing a Retrieval-Augmented Generation (RAG) system, we’ll integrate the retrieved context directly into our chatbot’s responses, ensuring clear, authoritative answers for users. This approach allows us to rapidly sift through large volumes of medical PDFs, providing context-rich, accurate, and easy-to-understand insights.

Setting up tools

!pip install langchain sentence-transformers chromadb llama-cpp-python langchain_community pypdf

from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain.text_splitter import CharacterTextSplitter, RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS, Chroma
from langchain_community.llms import LlamaCpp
from langchain.chains import RetrievalQA, LLMChain
import pathlib
import textwrap
from IPython.display import display, Markdown

def to_markdown(text):
    # Render text as a Markdown blockquote, turning bullets into list items.
    text = text.replace('•', '  *')
    return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))

from google.colab import drive
drive.mount('/content/drive')

First, we install and configure Python packages for document processing, embedding generation, local LLMs, and advanced retrieval-based workflows with LlamaCpp. We leverage langchain_community for PDF loading and text splitting, set up RetrievalQA and LLMChain for question answering, and include a to_markdown utility plus Google Drive mounting.

Setting up API key access

from google.colab import userdata
# Or use os.getenv('HUGGINGFACEHUB_API_TOKEN') to fetch an environment variable.
import os
from getpass import getpass

HF_API_KEY = userdata.get("HF_API_KEY")
os.environ["HF_API_KEY"] = HF_API_KEY

Here, we securely fetch the Hugging Face API key from Colab's userdata store and set it as an environment variable. Alternatively, you can read the HUGGINGFACEHUB_API_TOKEN environment variable to avoid exposing sensitive credentials directly in your code; a sketch of that fallback follows.
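A minimal sketch of the environment-variable fallback, assuming the token was exported as HUGGINGFACEHUB_API_TOKEN beforehand (the prompt string is illustrative):

import os
from getpass import getpass

# Fallback sketch: read the token from the environment; prompt interactively
# only if it is missing, so the key never appears in the notebook source.
hf_token = os.getenv("HUGGINGFACEHUB_API_TOKEN")
if hf_token is None:
    hf_token = getpass("Enter your Hugging Face token: ")
os.environ["HUGGINGFACEHUB_API_TOKEN"] = hf_token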

Loading and Extracting PDFs from a Directory

loader = PyPDFDirectoryLoader('/content/drive/My Drive/Data')
docs = loader.load()

We use PyPDFDirectoryLoader to scan the specified folder for PDFs, extract their text into a document list, and lay the groundwork for tasks like question answering, summarization, or keyword extraction.
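A quick sanity check (illustrative, not part of the original tutorial) confirms what loader.load() returned:

print(len(docs))                   # number of page-level documents loaded
print(docs[0].metadata)            # e.g. {'source': '.../file.pdf', 'page': 0}
print(docs[0].page_content[:200])  # first 200 characters of extracted text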

Splitting Loaded Text Documents into Manageable Chunks

text_splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=50)
chunks = text_splitter.split_documents(docs)

In this code snippet, RecursiveCharacterTextSplitter is applied to break down each document in docs into smaller, more manageable segments.
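To see the effect of the split, a short illustrative check; the 50-character chunk_overlap means consecutive chunks from the same page share some text:

print(f"{len(docs)} pages -> {len(chunks)} chunks")
print(chunks[0].page_content)
print(chunks[1].page_content)  # its opening should overlap the end of chunks[0]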

Initializing Hugging Face Embeddings

embeddings = HuggingFaceEmbeddings(model_name="BAAI/bge-base-en-v1.5")

Using HuggingFaceEmbeddings, we create an embedding object backed by the BAAI/bge-base-en-v1.5 model, which converts text into numerical vectors that capture semantic meaning.
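As a quick illustration, we can embed a sample sentence and inspect the result; bge-base-en-v1.5 produces 768-dimensional vectors:

# Illustrative: embed one sentence and inspect the resulting vector.
sample_vector = embeddings.embed_query("What are the symptoms of heart disease?")
print(len(sample_vector))  # 768 dimensions for bge-base-en-v1.5
print(sample_vector[:5])   # first few components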

Building a Vector Store and Running a Similarity Search

vectorstore = Chroma.from_documents(chunks, embeddings)

query = "who is at risk of heart disease"
search = vectorstore.similarity_search(query)
to_markdown(search[0].page_content)

We first build a Chroma vector store (Chroma.from_documents) from the text chunks and the specified embedding model. Next, we create a query asking, “who is at risk of heart disease,” and perform a similarity search against the stored embeddings. The top result (search[0].page_content) is then converted to Markdown for clearer display.

Creating a Retriever and Fetching Relevant Documents

retriever = vectorstore.as_retriever(search_kwargs={'k': 5})
retriever.get_relevant_documents(query)

We convert the Chroma vector store into a retriever (vectorstore.as_retriever) that efficiently fetches the most relevant documents for a given query. 
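An illustrative way to inspect what the retriever returns for our earlier query:

# Illustrative: print each retrieved chunk with its source file.
retrieved_docs = retriever.get_relevant_documents(query)
for i, doc in enumerate(retrieved_docs):
    source = doc.metadata.get("source", "unknown")
    print(f"[{i}] {source}: {doc.page_content[:100]}...")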

Initializing the BioMistral-7B Model with LlamaCpp

llm = LlamaCpp(
    model_path="/content/drive/MyDrive/Model/BioMistral-7B.Q4_K_M.gguf",
    temperature=0.3,
    max_tokens=2048,
    top_p=1
)

We set up an open-source local BioMistral LLM using LlamaCpp, pointing to a pre-downloaded model file. We also configure generation parameters such as temperature, max_tokens, and top_p, which control randomness, the maximum tokens generated, and the nucleus sampling strategy.
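Before wiring the model into a chain, a quick standalone test (illustrative) verifies that the GGUF file loads and the sampling settings behave:

# Illustrative: call the model directly, without retrieval.
raw_answer = llm.invoke("In one sentence, what is hypertension?")
print(raw_answer)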

Setting Up a Retrieval-Augmented Generation (RAG) Chain with a Custom Prompt

from langchain.schema.runnable import RunnablePassthrough
from langchain.schema.output_parser import StrOutputParser
from langchain.prompts import ChatPromptTemplate

# The {context} placeholder receives the documents fetched by the retriever.
template = """<|context|>
You are an AI assistant that follows instruction extremely well.
Please be truthful and give direct answers.
{context}</s>
<|user|>
{query}</s>
<|assistant|>"""

prompt = ChatPromptTemplate.from_template(template)

rag_chain = (
    {'context': retriever, 'query': RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

Using the above, we set up a RAG pipeline using the LangChain framework. It creates a custom prompt with instructions and placeholders, incorporates a retriever for context, and leverages a language model for generating answers. The flow is defined as a series of operations (RunnablePassthrough for direct query handling, the ChatPromptTemplate for prompt construction, the LLM for response generation, and finally, the StrOutputParser to produce a clean text string).
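One optional refinement, not part of the original chain: as written, the retriever's list of Document objects is interpolated into the prompt as-is, so its Python repr ends up in the context. A hedged sketch that joins the retrieved chunks into plain text first:

# Optional refinement (an assumption, not from the original tutorial):
# flatten the retrieved Documents into one plain-text context string.
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {'context': retriever | format_docs, 'query': RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)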

Invoking the RAG Chain to Answer a Health-Related Query

response = rag_chain.invoke("Why should I care about my heart health?")
to_markdown(response)

Now, we call the previously constructed RAG chain with a user’s query. It passes the query to the retriever, retrieves relevant context from the document collection, and feeds that context into the LLM to generate a concise, accurate answer.
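The chain can be reused for any number of follow-up questions; for example (with illustrative sample queries):

# Illustrative: reuse the chain for additional questions.
for question in ["What lifestyle changes reduce heart disease risk?",
                 "What are common symptoms of a heart attack?"]:
    print(question)
    print(rag_chain.invoke(question))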

In conclusion, by integrating BioMistral via LlamaCpp and taking advantage of LangChain’s flexibility, we are able to build a context-aware medical RAG chatbot. From chunk-based indexing to seamless RAG pipelines, it streamlines the process of mining large volumes of PDF data for relevant insights. Users receive clear and easily readable answers because final responses are formatted in Markdown. This design can be extended or tailored for various domains, ensuring scalability and precision in knowledge retrieval across diverse documents.


