未知数据源 2024年10月02日
Building a Truly Open OpenAI API Server with Open Models Locally
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

FastChat是一个开源项目,它提供了一个与OpenAI API 兼容的本地 API 服务器,允许开发者将基于 OpenAI API 的应用程序轻松迁移到开源模型,而无需修改代码。本文通过 LangChain 的示例,展示了如何将开源模型集成到本地 API 服务器,并分析了 Vicuna-13B、MPT-Chat-7B 和 OpenAI 在 LangChain 任务中的表现。

😄 **数据隐私和成本节约:** FastChat 的本地 API 服务器确保数据和交互都保留在本地机器上,避免数据泄露,并节省 API 使用成本。

🚀 **可定制性:** 本地部署允许开发者根据特定需求调整模型参数、设置甚至模型架构,提高模型输出质量和相关性。

📊 **模型比较:** 文中通过 LangChain 的问答和销售员代理任务,比较了 Vicuna-13B、MPT-Chat-7B 和 OpenAI 的性能。在问答任务中,OpenAI 在文本检索方面表现出色,而在销售员代理任务中,Vicuna 表现最好。

💡 **结论:** 对于简单的任务,开源模型可以提供与 OpenAI 模型相当的性能,并且在数据隐私和成本方面具有优势。对于复杂的任务,开源模型仍然存在差距,但其性能正在不断提升。

🤝 **感谢:** OpenAI-compatible API 服务器主要由 Shuo Yang、Siyuan Zhuang 和 Xia Han 贡献。

Many applications have been built on closed-source OpenAI APIs, but now you can effortlessly port them to use open-source alternatives without modifying the code. FastChat's OpenAI-compatible API server enables this seamless transition.In this blog post, we show how you can do this and use LangChain as an example.Demo: LangChain with Vicuna-13BHere, we present two demos of using LangChain with Vicuna-13B, a state-of-the-art open model.Question answering over docsEnliven your documents, and communicate with them through a single command line (doc).Code understandingClone the llama repository and then understand the code with a single command line, bringing your code to life (doc).The demos above are implemented directly with default LangChain code.They don't require you to adapt specifically for Vicuna. Any tool implemented with the OpenAI API can be seamlessly migrated to the open models through FastChat.Why Local API Server?Data Privacy: When using FastChat's OpenAI-compatible API server and LangChain, all the data and interactions remain on your local machine. This means you have full control over your data, and it never leaves your local environment unless you decide to share it. This local setup ensures that sensitive data isn't exposed to third-party services, reducing the risk of data breaches and ensuring compliance with data privacy regulations.Cost Saving: Traditional cloud-based API services often charge based on the number of requests or the tokens used. These costs can add up quickly, especially for researchers, organizations and companies. By running models locally, you can fully harness the power of large AI models without the worry of accumulating costs from API.Customizability: With a local setup, you have the freedom to adapt the AI model to suit your specific needs. You can experiment with different parameters, settings, or even adjust the model architecture itself. More importantly, it allows you the opportunity to fine-tune the model for certain specific behaviors. This capability gives you control not only over how the model operates but also over the quality and relevance of the output.Local OpenAI API Server with FastChatFastChat API server can interface with apps based on the OpenAI API through the OpenAI API protocol. This means that the open models can be used as a replacement without any need for code modification.The figure below shows the overall architecture.How to integrate a local model into FastChat API server? All you need to do is giving the model an OpenAI model name when launching it. See LangChain Support for details.The API server is compatible with both curl and OpenAI python package. It supports chat completions, completions, embeddings, and more.Comparing Vicuna-13B, MPT-Chat-7B, and OpenAI for using LangChainWe have conducted some preliminary testing on the open models performing LangChain tasks. These initial tests are relatively simple, including text-based question answering tasks and salesman agent performance tasks.Question Answering over DocsText-based question answering assesses the model's natural language understanding and generation abilities, and its grasp of common knowledge. We selected the transcript from the 2022 State of the Union address by President Biden as the document for querying. Six questions were posed to the model, each of which had its answer directly found within the text of the document.In terms of understanding the queries, all three models were successful. However, when it came to text retrieval ability, OpenAI demonstrated a clear advantage over Vicuna. This could very likely be attributed to the higher quality of OpenAI's embeddings, making it easier for the model to locate related contents.Salesman Agent PerformanceTo further evaluate the models' interaction capabilities, we implemented an approach by having the models take on the role of a salesman through LangChain. We posed several questions and invited GPT-4 to rate the quality of the responses provided by the different models.This test offers insights into the quality of text generation and the ability to portray a convincing agent role, aspects that are of utmost importance within LangChain. The 'salesman' scenario is a robust way to understand how effectively a model can engage in complex dialogue, showcasing its ability to respond appropriately and convincingly in a specific role. The scoring criteria here also reflects the emphasis on quality, both in terms of coherence and the ability to effectively deliver on the task of playing the role of a 'salesman'.Sales AgentWe executed SalesGPT tasks with open models and gpt-3.5-turbo. Below is the initialization code for SalesGPT.GPT4 evaluationWe posed three questions to the salesman and then let GPT-4 grade and evaluate them.Vicuna:Answer 1: 9/10 - Comprehensive and clear, emphasizing the company's mission and values.Answer 2: 9/10 - Good explanation of the unique selling proposition, but could be more explicit in differentiating from competitors.Answer 3: 10/10 - Provides detailed product information, including environmental friendliness and hypoallergenic properties.Total Score: 28/30GPT-3.5-turbo:Answer 1: 8/10 - Concise, but does not expand on the company's mission and values.Answer 2: 8/10 - Repeats previous information, does not detail the differences from competitors.Answer 3: 10/10 - Provides detailed product information, focusing on environmental friendliness and hypoallergenic properties.Total Score: 26/30MPT:Answer 1: 8/10 - Clear and succinct, but does not delve into the company's mission and values.Answer 2: 8/10 - Lacks clarity on company specifics and fails to differentiate from competitors.Answer 3: 9/10 - Provides detailed product information, but not as explicit on the environmental friendliness and hypoallergenic properties as the other two.Total Score: 25/30The Salesman test provided interesting insights into the conversational and agent capabilities of the three models: Vicuna, GPT-3.5-turbo, and MPT. Vicuna model, performed exceptionally well, earning a total score of 28 out of 30.In this particular task, the open models and GPT-3.5-turbo didn't show significant differences, suggesting that open models can serve as a viable alternative to GPT-3.5-turbo.In conclusion, it's important to note that for complex tasks, there is still a gap between open models and OpenAI models. For simpler tasks, open models can already do well. For privacy considerations and cost savings, simpler tasks can be accomplished by deploying the open model locally with FastChat.AcknowledgmentThe OpenAI-compatible API server is primarily contributed by Shuo Yang, Siyuan Zhuang, and Xia Han.

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

FastChat 开源模型 OpenAI API LangChain Vicuna-13B MPT-Chat-7B 数据隐私 成本节约 可定制性
相关文章