NVIDIA Developer · February 16
Build a Generative AI Medical Device Training Assistant with NVIDIA NIM Microservices

 

This article describes how to use generative AI and NVIDIA NIM microservices to build a medical device training assistant that addresses the training and troubleshooting challenges clinicians and patients face when adopting new or upgraded medical devices. The solution uses retrieval-augmented generation (RAG), combining large language models (LLMs) with speech AI models so that users can quickly pull the information they need from complex Instructions for Use (IFU) manuals through natural language or voice. Deployed efficiently with NIM microservices, the assistant delivers accurate, hands-free answers in real time, accelerating device adoption, reducing misuse, and ultimately improving the quality of care.

🚀 **RAG-powered retrieval**: Retrieval-augmented generation (RAG) combined with a large language model (LLM) enables efficient search and retrieval over the device's Instructions for Use (IFU), so users can ask questions in natural language and receive easy-to-understand guidance.

🗣️ **Speech AI support**: Integrated speech AI models, including automatic speech recognition (ASR) and text-to-speech (TTS), let users interact with the assistant by voice, enabling hands-free operation in sterile environments such as the operating room.

⚙️ **NVIDIA NIM microservices**: GPU-optimized NVIDIA NIM inference microservices, including Llama3 70B Instruct, NV-EmbedQA-e5-v5, NV-RerankQA-Mistral-4b-v3, and Riva ASR/TTS, provide high-performance model inference at the lowest total cost of ownership and simplify deployment.

🛠️ **RAG pipeline in two stages**: The RAG pipeline consists of document ingestion followed by retrieval and generation. Ingestion uploads the IFU into a knowledge base; at query time, relevant passages are retrieved from the knowledge base and the LLM generates the answer (a minimal code sketch follows this list).

📊 **Evaluation and optimization**: The pipeline is evaluated on a custom question dataset using RAGAS metrics, which score both the retriever and the generator and provide the data needed to guide further optimization of the RAG pipeline.
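The RAG pipeline summarized above can be sketched end to end against the hosted API Catalog endpoints. The following is a minimal illustration, not the repository's implementation: the LangChain NVIDIA connectors, the in-memory FAISS store, the chunking parameters, the placeholder file name `ifu.pdf`, and the sample question are all assumptions made for the sketch, and an NVIDIA_API_KEY environment variable is assumed to be set.

```python
# Minimal RAG sketch over an IFU PDF, assuming the hosted API Catalog endpoints and
# pip install langchain langchain-community langchain-nvidia-ai-endpoints faiss-cpu pypdf.
# The file name "ifu.pdf" and the question are placeholders; NVIDIA_API_KEY must be set.
from langchain_nvidia_ai_endpoints import ChatNVIDIA, NVIDIAEmbeddings, NVIDIARerank
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain.retrievers import ContextualCompressionRetriever

# 1. Ingestion: chunk the IFU and embed the chunks into a vector store.
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=150).split_documents(
    PyPDFLoader("ifu.pdf").load()
)
store = FAISS.from_documents(chunks, NVIDIAEmbeddings(model="nvidia/nv-embedqa-e5-v5"))

# 2. Retrieval: fetch candidate chunks, then rerank them before generation.
retriever = ContextualCompressionRetriever(
    base_compressor=NVIDIARerank(model="nvidia/nv-rerankqa-mistral-4b-v3", top_n=4),
    base_retriever=store.as_retriever(search_kwargs={"k": 20}),
)

# 3. Generation: answer the question grounded only in the retrieved passages.
question = "How do I prime the tubing set before starting an infusion?"  # hypothetical query
context = "\n\n".join(doc.page_content for doc in retriever.invoke(question))
llm = ChatNVIDIA(model="meta/llama3-70b-instruct", temperature=0.2)
answer = llm.invoke(
    f"Answer using only the IFU excerpts below.\n\nExcerpts:\n{context}\n\nQuestion: {question}"
)
print(answer.content)
```

The three stages map onto the NIM microservices described in the article below: NV-EmbedQA-e5-v5 for embedding, NV-RerankQA-Mistral-4b-v3 for reranking, and Llama3 70B Instruct for generation. The reference workflow runs a dedicated vector database container and a web UI rather than this in-memory store.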

Innovation in medical devices continues to accelerate, with a record number authorized by the FDA every year. When these new or updated devices are introduced to clinicians and patients, they require training to use them properly and safely. Once in use, clinicians or patients may need help troubleshooting issues. Medical devices are often accompanied by lengthy and technically complex Instructions for Use (IFU) manuals, which describe the correct use of the device. It can be difficult to find the right information quickly, and training on a new device is a time-consuming task. Medical device representatives often provide support training, but may not be present to answer all questions in real time. These issues can cause delays in using medical devices and adopting newer technologies, and in some cases, lead to incorrect usage.

**Using generative AI for troubleshooting medical devices**

Retrieval-augmented generation (RAG) uses deep learning models, including large language models (LLMs), for efficient search and retrieval of information using natural language. Using RAG, users can receive easy-to-understand instructions for specific questions in a large text corpus, such as an IFU. Speech AI models, such as automatic speech recognition (ASR) and text-to-speech (TTS) models, enable users to communicate with these advanced generative AI workflows using their voice, which is important in sterile environments like the operating room. NVIDIA NIM inference microservices are GPU-optimized, highly performant containers for these models that provide the lowest total cost of ownership and the best inference optimization for the latest models. By integrating RAG and speech AI with the efficiency and simplicity of deploying NIM microservices, companies developing advanced medical devices can provide clinicians with accurate, hands-free answers in real time.

Figure 1. The chatbot user interface of the medical device training assistant

**A medical device training assistant built with NIM microservices**

In this tutorial, we build a RAG pipeline with optional speech capabilities to answer questions about a medical device using its IFU. The code used is available on GitHub.

We use the following NIM microservices in our RAG pipeline. You have the flexibility to swap the components in the pipeline for other NIM microservices and different models:

- Llama3 70B Instruct (meta/llama3-70b-instruct): A large language model that generates the answer to the user's question based on the retrieved text.
- NV-EmbedQA-e5-v5 (nvidia/nv-embedqa-e5-v5): An embedding model that embeds the text chunks from the IFU and the queries from the user.
- NV-RerankQA-Mistral-4b-v3 (nvidia/nv-rerankqa-mistral-4b-v3): A reranking model that reranks the retrieved text chunks before the text generation step by the LLM.
- Riva ASR: An automatic speech recognition model that transcribes the user's spoken query into text for the pipeline.
- Riva TTS: A text-to-speech model that outputs audio of the response from the LLM.

RAG has two steps: document ingestion, then retrieval and generation of answers. These steps and the associated NIM microservices are shown in the reference architecture diagram in Figure 2.

Figure 2. The reference architecture, showing document ingestion and retrieval

**Using NVIDIA NIM**

You can access NIM microservices by signing up for free API credits on the API Catalog at build.nvidia.com or by deploying them on your own compute infrastructure. In this tutorial, we use the API Catalog endpoints.
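The API Catalog endpoints are OpenAI-compatible, so a single call to the Llama3 70B Instruct NIM can be made with the standard openai client. The following is a minimal sketch, assuming an NVIDIA_API_KEY obtained from build.nvidia.com; the sample question is a placeholder, not part of the tutorial.

```python
# Minimal sketch: call the Llama3 70B Instruct NIM through the API Catalog's
# OpenAI-compatible endpoint. Assumes NVIDIA_API_KEY is set (from build.nvidia.com);
# the question below is a hypothetical example, not from the tutorial.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key=os.environ["NVIDIA_API_KEY"],
)

completion = client.chat.completions.create(
    model="meta/llama3-70b-instruct",
    messages=[{"role": "user", "content": "Summarize the pre-use checks for an infusion pump."}],
    temperature=0.2,
    max_tokens=512,
)
print(completion.choices[0].message.content)
```

When the same model is instead self-hosted as a NIM container, the interface stays OpenAI-compatible and typically only the base URL changes.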
More information on using NIM microservices, finding your API key, and other prerequisites can be found on GitHub. Follow these steps to build a RAG pipeline with optional speech for answering questions about a medical device using its IFU.

**Build and start the containers**

See the Docker Compose files we've created to launch the containers with the NIM microservices and the vector database. Detailed instructions and code can be accessed on GitHub.

**Ingest the device manual**

In your browser, upload your IFU in the “Knowledge Base” tab, as shown in Figure 3.

Figure 3. The document ingestion page of the medical device training assistant

**Retrieve and generate answers**

Navigate to the “Converse” tab to begin the conversation with the IFU (Figure 1). Make sure to click “Use Knowledge Base” so the IFU is used as a knowledge resource. To converse by voice, click the microphone next to the text input area and the Riva ASR model will transcribe your question. To receive speech as output, click “Enable TTS output”. More information about using and troubleshooting the UI is in the GitHub documentation.

**Evaluate on a custom dataset**

Evaluate the performance of the RAG pipeline using a custom dataset of questions and automated RAGAS metrics. RAGAS metrics score both the retriever and the generator and are a common way to evaluate RAG pipelines in an automated fashion. Instructions on how to use the evaluation script are on GitHub.

**Getting started**

To get started with this workflow, visit the GenerativeAIExamples GitHub repository, which contains all of the code used in this tutorial as well as extensive documentation. For more information on NIM microservices, see the official NIM documentation and ask questions on the NVIDIA Developer NIM Forum.
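For the speech path described above, the Riva ASR and TTS microservices can also be driven programmatically rather than through the UI. The following is a rough sketch, assuming a Riva server reachable at localhost:50051 and the nvidia-riva-client Python package; the audio file names, sample rate, voice name, and example answer text are placeholders rather than values from the tutorial.

```python
# Rough sketch of the speech path, assuming a Riva server at localhost:50051 and
# pip install nvidia-riva-client. File names, the voice name, and the example answer
# are placeholders; adapt them to your own Riva deployment.
import riva.client

auth = riva.client.Auth(uri="localhost:50051")

# ASR: transcribe a recorded spoken question into text for the RAG pipeline.
asr = riva.client.ASRService(auth)
with open("question.wav", "rb") as f:
    audio_bytes = f.read()
asr_config = riva.client.RecognitionConfig(
    encoding=riva.client.AudioEncoding.LINEAR_PCM,
    sample_rate_hertz=16000,
    language_code="en-US",
    max_alternatives=1,
    enable_automatic_punctuation=True,
)
response = asr.offline_recognize(audio_bytes, asr_config)
transcript = response.results[0].alternatives[0].transcript
print("Transcribed question:", transcript)

# TTS: synthesize the LLM's answer back to audio for hands-free use.
tts = riva.client.SpeechSynthesisService(auth)
answer_text = "Flush the line with saline before connecting the set."  # placeholder answer
synthesized = tts.synthesize(
    answer_text,
    voice_name="English-US.Female-1",  # use a voice installed on your Riva server
    language_code="en-US",
    sample_rate_hz=44100,
)
with open("answer.wav", "wb") as f:
    f.write(synthesized.audio)
```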

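Returning to the evaluation step, a RAGAS run over a custom question set can look roughly like this. It is a hedged sketch rather than the tutorial's evaluation script: the dataset column names follow recent RAGAS releases and can differ between versions, the single example row is invented for illustration, and using NIM endpoints as the judge LLM and embeddings is an assumption.

```python
# Hedged RAGAS sketch: score retriever and generator quality on a tiny hand-written
# sample (pip install ragas datasets langchain-nvidia-ai-endpoints). Column names may
# vary across RAGAS versions; the example row below is invented for illustration.
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import answer_relevancy, context_precision, context_recall, faithfulness
from langchain_nvidia_ai_endpoints import ChatNVIDIA, NVIDIAEmbeddings

eval_data = Dataset.from_dict({
    "question": ["How should the device be cleaned between patients?"],
    "answer": ["Wipe all external surfaces with an approved disinfectant and let them air dry."],
    "contexts": [["Clean external surfaces with a hospital-approved disinfectant wipe after each use."]],
    "ground_truth": ["Wipe external surfaces with an approved disinfectant after each use."],
})

scores = evaluate(
    eval_data,
    metrics=[faithfulness, answer_relevancy, context_precision, context_recall],
    llm=ChatNVIDIA(model="meta/llama3-70b-instruct"),              # judge LLM (assumption)
    embeddings=NVIDIAEmbeddings(model="nvidia/nv-embedqa-e5-v5"),  # judge embeddings (assumption)
)
print(scores)  # per-metric averages between 0 and 1
```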