MarkTechPost@AI · March 13, 05:59
Building an Interactive Bilingual (Arabic and English) Chat Interface with Open Source Meraj-Mini by Arcee AI: Leveraging GPU Acceleration, PyTorch, Transformers, Accelerate, BitsAndBytes, and Gradio

This article shows how to build a bilingual chat assistant on Google Colab using Arcee's Meraj-Mini model, sharing practical experience in deploying a state-of-the-art AI solution within the constraints of free cloud resources and covering the tools involved.

Build the chat assistant with a set of tools, including Arcee's Meraj-Mini model

Configure 4-bit quantization settings and load the model and tokenizer

Create a text-generation pipeline suited to chat interactions

Define functions for the conversation logic and build the chat interface

In this tutorial, we implement a bilingual (Arabic and English) chat assistant powered by Arcee’s Meraj-Mini model, deployed on Google Colab with a T4 GPU. The walkthrough showcases the capabilities of open-source language models while providing practical, hands-on experience in deploying state-of-the-art AI solutions within the constraints of free cloud resources. We’ll utilise a powerful stack of tools, including:

- Arcee’s Meraj-Mini model
- Transformers library for model loading and tokenization
- Accelerate and bitsandbytes for efficient quantization
- PyTorch for deep learning computations
- Gradio for creating an interactive web interface
# Enable GPU acceleration
!nvidia-smi --query-gpu=name,memory.total --format=csv

# Install dependencies
!pip install -qU transformers accelerate bitsandbytes
!pip install -q gradio

First, we confirm GPU availability by querying the GPU's name and total memory with the nvidia-smi command. We then install and upgrade the key Python libraries (transformers, accelerate, bitsandbytes, and gradio) that support the model loading, quantization, and interactive interface used throughout this tutorial.
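Before loading the model, it can help to confirm from Python that the GPU is actually visible. The following is a minimal sanity check we added (not part of the original post):

import torch

# Fail early if the Colab runtime has no GPU attached
assert torch.cuda.is_available(), "No GPU detected - switch the Colab runtime to GPU"
print(torch.cuda.get_device_name(0))  # e.g. "Tesla T4"
print(f"{torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB total VRAM")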

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True
)

model = AutoModelForCausalLM.from_pretrained(
    "arcee-ai/Meraj-Mini",
    quantization_config=quant_config,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("arcee-ai/Meraj-Mini")

Then we configure 4-bit quantization with BitsAndBytesConfig for memory-efficient loading, load the “arcee-ai/Meraj-Mini” causal language model along with its tokenizer from Hugging Face, and let device_map="auto" distribute the weights across the available hardware.
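As an optional check of our own (not from the original post), we can verify that quantization paid off by inspecting the model's memory footprint and the placement chosen by device_map="auto":

# Roughly how much memory the 4-bit model occupies, and where its layers live
print(f"Model footprint: {model.get_memory_footprint() / 1e9:.2f} GB")
print(model.hf_device_map)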

chat_pipeline = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.1,
    do_sample=True
)

Here we create a text generation pipeline tailored for chat interactions using Hugging Face’s pipeline function. It configures maximum new tokens, temperature, top_p, and repetition penalty to balance diversity and coherence during text generation.
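As a quick smoke test (our addition), we can call the pipeline directly with a prompt in the ChatML-style format that the helper functions below will construct; since do_sample=True, the output varies between runs:

# One-off generation outside the chat helpers; the prompt mirrors format_chat's layout
test_prompt = "<|im_start|>user\nمرحبا! Can you reply in both Arabic and English?<|im_end|>\n<|im_start|>assistant\n"
print(chat_pipeline(test_prompt)[0]["generated_text"])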

def format_chat(messages):
    # Build a ChatML-style prompt from the message history
    prompt = ""
    for msg in messages:
        prompt += f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n"
    prompt += "<|im_start|>assistant\n"
    return prompt

def generate_response(user_input, history=None):
    # Avoid a mutable default argument; start a fresh history when none is given
    if history is None:
        history = []
    history.append({"role": "user", "content": user_input})
    formatted_prompt = format_chat(history)
    output = chat_pipeline(formatted_prompt)[0]['generated_text']
    # Keep only the text generated after the final assistant marker
    assistant_response = output.split("<|im_start|>assistant\n")[-1].split("<|im_end|>")[0]
    history.append({"role": "assistant", "content": assistant_response})
    return assistant_response, history

We define two functions to facilitate a conversational interface. The first function formats a chat history into a structured prompt with custom delimiters, while the second appends a new user message, generates a response using the text-generation pipeline, and updates the conversation history accordingly.
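Here is a short illustrative round trip with these helpers (our example; the model's answers will vary), showing how the history list carries role/content dicts across turns in either language:

# Two-turn bilingual exchange reusing the same history list
history = []
reply, history = generate_response("What is 4-bit quantization?", history)
print(reply)
reply, history = generate_response("اشرح ذلك بالعربية من فضلك", history)  # "Explain that in Arabic, please"
print(reply)

If Meraj-Mini's tokenizer ships a chat template, tokenizer.apply_chat_template(history, tokenize=False, add_generation_prompt=True) could likely replace format_chat, though we have not verified that here.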

import gradio as gr

with gr.Blocks() as demo:
    chatbot = gr.Chatbot()
    msg = gr.Textbox(label="Message")
    clear = gr.Button("Clear History")

    def respond(message, chat_history):
        # gr.Chatbot keeps history as (user, assistant) tuples; convert it to
        # the role/content dicts that generate_response expects
        dict_history = []
        for user_msg, bot_msg in chat_history:
            dict_history.append({"role": "user", "content": user_msg})
            dict_history.append({"role": "assistant", "content": bot_msg})
        response, _ = generate_response(message, dict_history)
        # Clear the textbox and append the new exchange to the chat window
        return "", chat_history + [(message, response)]

    msg.submit(respond, [msg, chatbot], [msg, chatbot])
    clear.click(lambda: None, None, chatbot, queue=False)

demo.launch(share=True)

Finally, we build a web-based chatbot interface using Gradio. It creates UI elements for the chat history, a message input box, and a clear-history button, and defines a respond function that feeds the message and prior turns to the text-generation pipeline and appends the new exchange to the conversation. The demo is then launched with sharing enabled for public access.
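A few launch variants can be useful in Colab; these are assumptions on our part about Gradio's options, not settings used in the post:

# Hypothetical alternatives to demo.launch(share=True)
# demo.launch()                       # local URL only, no public share link
# demo.launch(auth=("user", "pass"))  # gate the shared demo behind a simple login
# demo.launch(debug=True)             # keep the cell attached and surface errors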


Here is the Colab Notebook.

