MarkTechPost@AI February 5
Fine-Tuning Llama 3.2 3B Instruct for Python Code: A Comprehensive Guide with Unsloth

This article walks through how to fine-tune the Llama 3.2 3B Instruct model with the Unsloth library so that it performs better on Python code tasks. By combining LoRA with efficient 4-bit quantization, memory usage drops significantly while model quality is preserved. The tutorial covers the full workflow: installing the required dependencies, loading a Python code dataset, initializing the model, configuring LoRA, setting up the training loop, and saving the fine-tuned model. Unsloth optimizes memory usage, LoRA enables flexible model adaptation, and Hugging Face tooling simplifies dataset handling and training, yielding an efficient model that can both generate and understand Python code.

⚙️ Install the required dependencies, including libraries such as Unsloth, Transformers, and xFormers, and run diagnostic commands to verify that xFormers and BitsAndBytes installed successfully.

📚 Load the Python code dataset: set the sequence length to 2048 tokens, load the custom Python code dataset from Hugging Face, and make sure the dataset is stored under your username so it can be accessed correctly.

🎛️ Load the Llama 3.2 3B Instruct model in 4-bit format with the Unsloth library to reduce memory usage, and set the maximum sequence length to 2048 to handle longer text inputs.

🚀 Apply LoRA (Low-Rank Adaptation) to the 4-bit model, specifying the rank (r), alpha (lora_alpha), and dropout settings. Setting use_gradient_checkpointing = "unsloth" uses memory more efficiently and allows training with longer context lengths.

In this tutorial, we’ll walk through how to set up and perform fine-tuning on the Llama 3.2 3B Instruct model using a specially curated Python code dataset. By the end of this guide, you’ll have a better understanding of how to customize large language models for code-related tasks and practical insight into the tools and configurations needed to leverage Unsloth for fine-tuning.

Installing Required Dependencies

!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install "git+https://github.com/huggingface/transformers.git"
!pip install -U trl
!pip install --no-deps trl peft accelerate bitsandbytes
!pip install torch torchvision torchaudio triton
!pip install xformers
!python -m xformers.info
!python -m bitsandbytes

These commands install and update all the necessary libraries—such as Unsloth, Transformers, and xFormers—needed for fine-tuning the Llama 3.2 3B Instruct model on Python code. Finally, we run diagnostic commands to verify the successful installation of xFormers and BitsAndBytes.
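
Beyond the two diagnostic commands above, a short Python check (an optional addition, safe to run in the same Colab runtime) confirms the installed versions and that PyTorch can see the GPU:

import torch
import transformers

# Quick environment check: library versions and GPU availability
print("torch:", torch.__version__)
print("transformers:", transformers.__version__)
print("CUDA available:", torch.cuda.is_available())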

Essential Imports

from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
import torch
from datasets import load_dataset

We import classes and functions from Unsloth, TRL, and Transformers for model training and fine-tuning, along with Hugging Face’s load_dataset, which we will use to load the Python code dataset and prepare the training samples.

Loading the Python Code Dataset

max_seq_length = 2048
dataset = load_dataset("user/Llama-3.2-Python-Alpaca-143k", split="train")   # Save the dataset on your user profile on HF, then load the dataset under your user id

We set the sequence length to 2048 tokens for the fine-tuned model and load a custom Python code dataset from Hugging Face. Ensure you have the dataset stored under your username for proper access.
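
As a quick sanity check (an addition beyond the original walkthrough), you can inspect the dataset’s size and confirm the "text" column that the trainer will consume later:

# Preview the loaded dataset: row count, columns, and the first training example
print(dataset)
print(dataset.column_names)       # should include "text", the field used by the trainer
print(dataset[0]["text"][:500])   # first 500 characters of the first sample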

Initializing the Llama 3.2 3B Model

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Llama-3.2-3B-Instruct-bnb-4bit",
    max_seq_length = max_seq_length,
    dtype = None,
    load_in_4bit = True
)

We load the Llama 3.2 3B Instruct model in 4-bit format using the Unsloth library, which reduces memory usage. To handle longer text inputs, we also set the maximum sequence length to 2048.
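
If you want to verify the savings from 4-bit loading, the loaded model exposes Hugging Face’s standard get_memory_footprint helper; the snippet below is an optional check rather than part of the original notebook:

# Report the in-memory size of the 4-bit quantized model in gigabytes
print(f"Model memory footprint: {model.get_memory_footprint() / 1024**3:.2f} GB")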

Configuring LoRA with Unsloth

model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 16,
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none",    # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
    random_state = 3407,
    use_rslora = False,  # We support rank stabilized LoRA
    loftq_config = None, # And LoftQ
    max_seq_length = max_seq_length
)

We apply LoRA (Low-Rank Adaptation) to our 4-bit loaded model, specifying the rank (r), alpha (lora_alpha), and dropout settings. Setting use_gradient_checkpointing = "unsloth" enables more efficient memory usage and allows training with longer context lengths. Additional LoRA options like use_rslora and loftq_config are available for more advanced fine-tuning techniques but are disabled here for simplicity. Finally, we set the maximum sequence length to match our earlier configuration.
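
Because get_peft_model returns a PEFT-wrapped model, you should be able to confirm how small the trainable footprint is; this is a minimal check under that assumption:

# Only the LoRA adapter weights should be marked trainable, a small fraction of the 3B parameters
model.print_trainable_parameters()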

Mounting Google Drive

from google.colab import drive
drive.mount("/content/drive")

We import the Google Colab drive module and mount Google Drive so that the training run can write checkpoints and logs to it from within the Colab environment.

Setting Up and Running the Training Loop

trainer = SFTTrainer(
    model = model,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    tokenizer = tokenizer,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 10,
        # num_train_epochs = 1, # Set this for 1 full training run.
        max_steps = 60,
        learning_rate = 2e-4,
        fp16 = not torch.cuda.is_bf16_supported(),
        bf16 = torch.cuda.is_bf16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "/content/drive/My Drive/Llama-3.2-3B-Instruct-bnb-4bit"
    ),
)
trainer.train()

We create an instance of SFTTrainer with our loaded model, tokenizer, and Python code dataset, specifying the text field for training. The TrainingArguments define key hyperparameters such as batch size, learning rate, maximum training steps, and hardware-specific settings like fp16 or bf16. In this example, we set the output directory to Google Drive to conveniently store checkpoints and logs. Finally, we invoke the trainer.train() method to begin the fine-tuning process.
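
Note that with per_device_train_batch_size = 2 and gradient_accumulation_steps = 4, each optimizer step sees an effective batch of 8 sequences. To see how much GPU memory the run actually consumed, a plain PyTorch check (an optional addition) can be run right after trainer.train():

# Peak GPU memory reserved during training, compared against the device total
peak_gb = torch.cuda.max_memory_reserved() / 1024**3
total_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
print(f"Peak reserved GPU memory: {peak_gb:.2f} GB of {total_gb:.2f} GB")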

Saving the Fine-Tuned Model

model.save_pretrained("lora_model")  # Local saving
tokenizer.save_pretrained("lora_model")

We save the LoRA-trained model and its tokenizer to a local folder named lora_model. This allows you to load and use the fine-tuned model later without repeating the training process.
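
As a sketch of how you might reuse the saved adapters later (the prompt here is illustrative, and for_inference is Unsloth’s fast-inference helper as used in its example notebooks), you can point FastLanguageModel.from_pretrained at the lora_model folder and generate:

from unsloth import FastLanguageModel

# Reload the LoRA adapters saved above from the local "lora_model" folder
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "lora_model",
    max_seq_length = 2048,
    dtype = None,
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's faster inference mode

prompt = "Write a Python function that checks whether a number is prime."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))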

In conclusion, throughout this tutorial, we demonstrated how to fine-tune the Llama 3.2 3B Instruct model on a Python code dataset using the Unsloth library, LoRA, and efficient 4-bit quantization. By leveraging the provided scripts, you can train a smaller, memory-efficient model that excels at both generating and understanding Python code. In the process, we showcased the integration of Unsloth for optimized memory usage, LoRA for flexible model adaptation, and Hugging Face tools for dataset handling and training. This setup enables you to build and customize language models tailored to specific code-related tasks, improving accuracy and resource efficiency.


Download the Colab Notebook here. All credit for this research goes to the researchers of this project.
