MarkTechPost@AI, February 10
Tutorial to Fine-Tuning Mistral 7B with QLoRA Using Axolotl for Efficient LLM Training

 

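This tutorial demonstrates how to fine-tune Mistral 7B with Axolotl and QLoRA, showing how to customize the model for new tasks with limited GPU resources. We install Axolotl, create a small example dataset, configure the LoRA-specific hyperparameters, run the fine-tuning process, and test the resulting model. The tutorial covers environment preparation, dataset creation, QLoRA configuration, fine-tuning, and model testing, and aims to help users train and adapt large language models efficiently in resource-constrained environments.

🛠️ **Environment Setup and Axolotl Installation**: The tutorial first explains how to prepare the environment: checking GPU availability, installing Git LFS to handle large model files, and cloning and installing Axolotl from GitHub. This lays the groundwork for the fine-tuning steps that follow.

📝 **Dataset and QLoRA Configuration**: The tutorial shows how to create a JSONL dataset of instruction-response pairs and build a YAML configuration file that points to the Mistral 7B base model, sets the QLoRA parameters for memory-efficient fine-tuning, and defines the training hyperparameters.

🚀 **Fine-Tuning with Axolotl**: Axolotl downloads the Mistral 7B weights and then starts QLoRA-based fine-tuning, quantizing the model to 4-bit precision to reduce GPU memory usage.

✅ **Testing and Validating the Fine-Tuned Model**: The tutorial closes by loading the base Mistral 7B model, applying the newly trained LoRA weights, and generating a response to an example prompt about the differences between classical and quantum computing, to check that the fine-tuned model runs and produces meaningful output.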

In this tutorial, we demonstrate the workflow for fine-tuning Mistral 7B using QLoRA with Axolotl, showing how to manage limited GPU resources while customizing the model for new tasks. We’ll install Axolotl, create a small example dataset, configure the LoRA-specific hyperparameters, run the fine-tuning process, and test the resulting model’s performance.

Step 1: Prepare the Environment and Install Axolotl

# 1. Check GPU availability
!nvidia-smi

# 2. Install git-lfs (for handling large model files)
!sudo apt-get -y install git-lfs
!git lfs install

# 3. Clone Axolotl and install from source
!git clone https://github.com/OpenAccess-AI-Collective/axolotl.git
%cd axolotl
!pip install -e .

# (Optional) If you need a specific PyTorch version, install it BEFORE Axolotl:
# !pip install torch==2.0.1+cu118 --extra-index-url https://download.pytorch.org/whl/cu118

# Return to /content directory
%cd /content

First, we check which GPU is available and how much memory it has. We then install Git LFS so that large model files (like Mistral 7B) can be handled properly. After that, we clone the Axolotl repository from GitHub and install it in “editable” mode (pip install -e .), so the package is usable from any directory and local changes to its source take effect without reinstalling. An optional, commented-out line lets you install a specific PyTorch version first if needed. Finally, we navigate back to the /content directory so that subsequent files and paths stay in one place.
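If you would rather read the GPU name and memory from Python than scan the nvidia-smi table by eye, the following is a minimal sketch. The helper names parse_gpu_info and query_gpus are our own (hypothetical), and the sketch assumes nvidia-smi is on the PATH; when it is not, query_gpus simply returns an empty list.

```python
import shutil
import subprocess


def parse_gpu_info(csv_text):
    """Parse lines like 'Tesla T4, 15360' into (name, memory_mib) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        try:
            # Split from the right so the memory field is always the last one.
            name, mem = line.rsplit(",", 1)
            gpus.append((name.strip(), int(mem.strip())))
        except ValueError:
            # Skip lines that do not look like 'name, integer'.
            continue
    return gpus


def query_gpus():
    """Return [(name, total_memory_mib), ...], or [] if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_info(result.stdout)


if __name__ == "__main__":
    found = query_gpus()
    if not found:
        print("No NVIDIA GPU detected (or nvidia-smi is not installed).")
    for name, mem_mib in found:
        print(f"{name}: {mem_mib / 1024:.1f} GiB total memory")
```

Keeping the parsing separate from the subprocess call makes the logic easy to check on a machine without a GPU.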

Step 2: Create a Tiny Sample Dataset and QLoRA Config for Mistral 7B

import os

# Create a small JSONL dataset
os.makedirs("data", exist_ok=True)
with open("data/sample_instructions.jsonl", "w") as f:
    f.write('{"instruction": "Explain quantum computing in simple terms.", "input": "", "output": "Quantum computing uses qubits..."}\n')
    f.write('{"instruction": "What is the capital of France?", "input": "", "output": "The capital of France is Paris."}\n')

# Write a QLoRA config for Mistral 7B
config_text = """\
base_model: mistralai/mistral-7b-v0.1
tokenizer: mistralai/mistral-7b-v0.1

# We'll use QLoRA to minimize memory usage
train_type: qlora
bits: 4
double_quant: true
quant_type: nf4

lora_r: 8
lora_alpha: 16
lora_dropout: 0.05
target_modules:
  - q_proj
  - k_proj
  - v_proj

data:
  datasets:
    - path: /content/data/sample_instructions.jsonl
  val_set_size: 0
  max_seq_length: 512
  cutoff_len: 512

training_arguments:
  output_dir: /content/mistral-7b-qlora-output
  num_train_epochs: 1
  per_device_train_batch_size: 1
  gradient_accumulation_steps: 4
  learning_rate: 0.0002
  fp16: true
  logging_steps: 10
  save_strategy: "epoch"
  evaluation_strategy: "no"

wandb:
  enabled: false
"""

with open("qlora_mistral_7b.yml", "w") as f:
    f.write(config_text)

print("Dataset and QLoRA config created.")

Here, we build a minimal JSONL dataset with two instruction-response pairs, giving us a toy example to train on. We then construct a YAML configuration that points to the Mistral 7B base model, sets up QLoRA parameters for memory-efficient fine-tuning (4-bit NF4 quantization with double quantization), and defines training hyperparameters such as batch size, learning rate, and sequence length. We also specify the LoRA settings (rank 8, alpha 16, dropout 0.05, applied to the q_proj, k_proj, and v_proj modules), and finally save this configuration as qlora_mistral_7b.yml.
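Axolotl's accepted configuration keys vary by release, and some of the key names used above may not match what your installed version expects. As a cross-check, here is a hedged sketch of the same settings written with the key names that appear in Axolotl's published example configs, as far as we are aware (for instance adapter, load_in_4bit, sequence_len, micro_batch_size, num_epochs). Treat every key as an assumption to verify against the documentation for your version. The type: alpaca entry assumes the instruction/input/output records are meant to be read as Alpaca-style data, and write_config and top_level_keys are hypothetical helpers.

```python
# Hypothetical alternative config. Key names are assumed from Axolotl's
# example YAML files; values mirror the tutorial's settings.
AXOLOTL_STYLE_CONFIG = """\
base_model: mistralai/Mistral-7B-v0.1

load_in_4bit: true
adapter: qlora

lora_r: 8
lora_alpha: 16
lora_dropout: 0.05
lora_target_modules:
  - q_proj
  - k_proj
  - v_proj

datasets:
  - path: /content/data/sample_instructions.jsonl
    ds_type: json
    type: alpaca
val_set_size: 0
sequence_len: 512

output_dir: /content/mistral-7b-qlora-output
num_epochs: 1
micro_batch_size: 1
gradient_accumulation_steps: 4
learning_rate: 0.0002
fp16: true
logging_steps: 10
"""


def write_config(path, text=AXOLOTL_STYLE_CONFIG):
    """Write the config text to `path`; returns the number of characters written."""
    with open(path, "w", encoding="utf-8") as f:
        return f.write(text)


def top_level_keys(text):
    """List top-level YAML keys with a simple line scan (no YAML parser needed)."""
    keys = []
    for line in text.splitlines():
        is_top_level = line and not line[0].isspace() and not line.startswith("#")
        if is_top_level and ":" in line:
            keys.append(line.split(":", 1)[0])
    return keys


if __name__ == "__main__":
    print(top_level_keys(AXOLOTL_STYLE_CONFIG))
```

To try it, write the text to a separate file (any name of your choosing) with write_config and point the training command at that file. In the Axolotl releases we are aware of, training is launched with accelerate launch -m axolotl.cli.train followed by the config path; check the project README for the entry point that matches your install.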

Step 3: Fine-Tune with Axolotl

# This will download Mistral 7B (~13 GB) and start fine-tuning with QLoRA.
# If you encounter OOM (Out Of Memory) errors, reduce max_seq_length or LoRA rank.
!axolotl --config /content/qlora_mistral_7b.yml

Here, Axolotl downloads the Mistral 7B weights (about 13 GB, per the comment in the command above) and then starts QLoRA fine-tuning. The base model is quantized to 4-bit precision, which reduces GPU memory usage, and only the small LoRA adapter weights are trained. The training logs show progress step by step, including the training loss.
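To make the memory argument concrete, the sketch below does the back-of-the-envelope arithmetic for this setup. The architecture numbers (hidden size 4096, 32 layers, 32 attention heads, 8 key/value heads) are taken from the published Mistral 7B v0.1 model config, and the total of roughly 7.24 billion parameters is an approximation; treat both as assumptions to confirm. The estimate covers weights only and ignores activations, optimizer state, and quantization overhead.

```python
# Mistral 7B v0.1 architecture values (assumed from the published model
# config; confirm against config.json on the Hub).
HIDDEN_SIZE = 4096
NUM_LAYERS = 32
NUM_HEADS = 32
NUM_KV_HEADS = 8
HEAD_DIM = HIDDEN_SIZE // NUM_HEADS   # 128
KV_DIM = NUM_KV_HEADS * HEAD_DIM      # 1024, because of grouped-query attention

APPROX_BASE_PARAMS = 7.24e9  # approximate total parameter count


def lora_pair_params(in_features, out_features, rank):
    """LoRA adds two matrices per adapted layer: A (rank x in) and B (out x rank)."""
    return rank * (in_features + out_features)


def count_lora_params(rank):
    """Trainable parameters when adapting q_proj, k_proj and v_proj in every layer."""
    per_layer = (
        lora_pair_params(HIDDEN_SIZE, HIDDEN_SIZE, rank)  # q_proj
        + lora_pair_params(HIDDEN_SIZE, KV_DIM, rank)     # k_proj
        + lora_pair_params(HIDDEN_SIZE, KV_DIM, rank)     # v_proj
    )
    return per_layer * NUM_LAYERS


def weight_memory_gib(num_params, bits_per_param):
    """Memory for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


if __name__ == "__main__":
    trainable = count_lora_params(rank=8)
    print(f"LoRA trainable parameters (r=8): {trainable:,}")
    print(f"Share of base model: {100 * trainable / APPROX_BASE_PARAMS:.3f}%")
    print(f"fp16 weights:  {weight_memory_gib(APPROX_BASE_PARAMS, 16):.1f} GiB")
    print(f"4-bit weights: {weight_memory_gib(APPROX_BASE_PARAMS, 4):.1f} GiB")
    # Hyperparameters from the tutorial config:
    print(f"Effective batch size: {1 * 4}")      # per-device batch x accumulation steps
    print(f"LoRA scaling (alpha / r): {16 / 8}")
```

Under these assumptions, rank 8 on the three attention projections gives about 4.7 million trainable parameters, and the base weights shrink from roughly 13.5 GiB in fp16 to roughly 3.4 GiB at 4 bits. The real 4-bit footprint is somewhat higher, since quantization constants are stored alongside the weights and some layers stay unquantized.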

Step 4: Test the Fine-Tuned Model

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base Mistral 7B model
base_model_path = "mistralai/mistral-7b-v0.1"   # First establish access using your user account on HF then run this part
output_dir = "/content/mistral-7b-qlora-output"

print("\nLoading base model and tokenizer...")
tokenizer = AutoTokenizer.from_pretrained(
    base_model_path,
    trust_remote_code=True
)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_path,
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True
)

print("\nLoading QLoRA adapter...")
model = PeftModel.from_pretrained(
    base_model,
    output_dir,
    device_map="auto",
    torch_dtype=torch.float16
)
model.eval()

# Example prompt
prompt = "What are the main differences between classical and quantum computing?"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

print("\nGenerating response...")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=128)

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("\n=== Model Output ===")
print(response)

Finally, we load the base Mistral 7B model again in float16 and apply the newly trained LoRA adapter weights from the output directory. We write a short prompt about the differences between classical and quantum computing, tokenize it, and generate up to 128 new tokens with the fine-tuned model. This confirms that the adapter loads correctly and that we can run inference on the updated model; with only two training examples, the output is unlikely to differ much from the base model's.
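One detail the test leaves open is prompt format: the training records use instruction/input/output fields, while the test prompt is passed as raw text. If the dataset was rendered through an Alpaca-style template during training (an assumption; the tutorial's config does not say), inference prompts would usually be formatted the same way. The sketch below builds such a prompt and trims the echo of the prompt from the decoded output. The helpers build_alpaca_prompt and extract_response are hypothetical, and the template wording follows the widely used Stanford Alpaca format.

```python
ALPACA_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
ALPACA_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def build_alpaca_prompt(instruction, input_text=""):
    """Format an instruction (and optional input) with the Alpaca template."""
    if input_text.strip():
        return ALPACA_WITH_INPUT.format(instruction=instruction, input=input_text)
    return ALPACA_NO_INPUT.format(instruction=instruction)


def extract_response(generated_text):
    """Return the text after the last '### Response:' marker.

    If the marker is absent, the whole text is returned (stripped).
    """
    _, _, tail = generated_text.rpartition("### Response:")
    return tail.strip()


if __name__ == "__main__":
    prompt = build_alpaca_prompt(
        "What are the main differences between classical and quantum computing?"
    )
    print(prompt)
```

In the test script, the string returned by build_alpaca_prompt would take the place of the raw prompt passed to the tokenizer, and extract_response would be applied to the decoded output, which otherwise repeats the prompt before the answer.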

[Figure: Snapshot of supported models with Axolotl]

In conclusion, the above steps have shown you how to prepare the environment, set up a small dataset, configure LoRA-specific hyperparameters, and run a QLoRA fine-tuning session on Mistral 7B with Axolotl. This approach showcases a parameter-efficient training process suitable for resource-limited environments. You can now expand the dataset, modify hyperparameters, or experiment with different open-source LLMs to further refine and optimize your fine-tuning pipeline.
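When expanding the dataset, writing JSON lines by hand as in Step 2 becomes error-prone. The following is a minimal sketch of a safer approach using the standard json module; the helper names are hypothetical, and the required keys simply mirror the instruction/input/output fields used in the sample dataset.

```python
import json

REQUIRED_KEYS = ("instruction", "input", "output")


def validate_record(record):
    """Raise ValueError if a record lacks a required key or has empty text."""
    missing = [key for key in REQUIRED_KEYS if key not in record]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    for key in ("instruction", "output"):
        if not isinstance(record[key], str) or not record[key].strip():
            raise ValueError(f"'{key}' must be a non-empty string")
    return record


def to_jsonl(records):
    """Serialize records to JSONL text, one validated JSON object per line."""
    return "".join(
        json.dumps(validate_record(r), ensure_ascii=False) + "\n" for r in records
    )


def from_jsonl(text):
    """Parse JSONL text back into a list of validated records."""
    return [validate_record(json.loads(line))
            for line in text.splitlines() if line.strip()]


def write_jsonl(records, path):
    """Validate everything first, then write; returns the number of records."""
    records = list(records)
    text = to_jsonl(records)  # raises before the file is touched
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    return len(records)
```

Because json.dumps handles quoting and escaping, answers that contain quotation marks or newlines no longer break the file, and a malformed record is reported before anything is written.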



The post Tutorial to Fine-Tuning Mistral 7B with QLoRA Using Axolotl for Efficient LLM Training appeared first on MarkTechPost.
