MarkTechPost@AI February 5
Fine-Tuning Llama 3.2 3B Instruct for Python Code: A Comprehensive Guide with Unsloth

This article walks through how to fine-tune the Llama 3.2 3B Instruct model with the Unsloth library so that it performs better on Python code tasks. By combining LoRA with efficient 4-bit quantization, memory usage drops significantly while model quality is preserved. The tutorial covers the full workflow: installing the required dependencies, loading a Python code dataset, initializing the model, configuring LoRA, setting up the training loop, and saving the fine-tuned model. Unsloth optimizes memory usage, LoRA enables flexible model adaptation, and Hugging Face tooling simplifies dataset handling and training, yielding an efficient model that can both generate and understand Python code.

⚙️ Install the required dependencies, including libraries such as Unsloth, Transformers, and xFormers, and run diagnostic commands to verify that xFormers and BitsAndBytes installed successfully.

📚 Load the Python code dataset: set the sequence length to 2048 tokens, load the custom Python code dataset from Hugging Face, and make sure the dataset is stored under your username so it can be accessed correctly.

🎛️ Load the Llama 3.2 3B Instruct model in 4-bit format with the Unsloth library to reduce memory usage, and set the maximum sequence length to 2048 to handle longer text inputs.

🚀 Apply LoRA (Low-Rank Adaptation) to the 4-bit model, specifying the rank (r), alpha (lora_alpha), and dropout settings. Setting use_gradient_checkpointing = "unsloth" uses memory more efficiently and allows training with longer context lengths.

In this tutorial, we’ll walk through how to set up and perform fine-tuning on the Llama 3.2 3B Instruct model using a specially curated Python code dataset. By the end of this guide, you’ll have a better understanding of how to customize large language models for code-related tasks and practical insight into the tools and configurations needed to leverage Unsloth for fine-tuning.

Installing Required Dependencies

!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install "git+https://github.com/huggingface/transformers.git"
!pip install -U trl
!pip install --no-deps trl peft accelerate bitsandbytes
!pip install torch torchvision torchaudio triton
!pip install xformers
!python -m xformers.info
!python -m bitsandbytes

These commands install and update all the necessary libraries—such as Unsloth, Transformers, and xFormers—needed for fine-tuning the Llama 3.2 3B Instruct model on Python code. Finally, we run diagnostic commands to verify the successful installation of xFormers and BitsAndBytes.
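
Beyond the two diagnostic commands above, a short Python check (an optional addition, safe to run in the same Colab runtime) confirms the installed versions and that PyTorch can see the GPU:

import torch
import transformers

# Quick environment check: library versions and GPU availability
print("torch:", torch.__version__)
print("transformers:", transformers.__version__)
print("CUDA available:", torch.cuda.is_available())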

Essential Imports

from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
import torch
from datasets import load_dataset

We import classes and functions from Unsloth, TRL, and Transformers for model training and fine-tuning, along with Hugging Face’s load_dataset, which we will use to load the Python code dataset and prepare the training samples.

Loading the Python Code Dataset

max_seq_length = 2048
dataset = load_dataset("user/Llama-3.2-Python-Alpaca-143k", split="train")   # Save the dataset on your user profile on HF, then load the dataset under your user id

We set the sequence length to 2048 tokens for the fine-tuned model and load a custom Python code dataset from Hugging Face. Ensure you have the dataset stored under your username for proper access.
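
As a quick sanity check (an addition beyond the original walkthrough), you can inspect the dataset’s size and confirm the "text" column that the trainer will consume later:

# Preview the loaded dataset: row count, columns, and the first training example
print(dataset)
print(dataset.column_names)       # should include "text", the field used by the trainer
print(dataset[0]["text"][:500])   # first 500 characters of the first sample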

Initializing the Llama 3.2 3B Model

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Llama-3.2-3B-Instruct-bnb-4bit",
    max_seq_length = max_seq_length,
    dtype = None,
    load_in_4bit = True
)

We load the Llama 3.2 3B Instruct model in 4-bit format using the Unsloth library, which reduces memory usage. To handle longer text inputs, we also set the maximum sequence length to 2048.
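
If you want to verify the savings from 4-bit loading, the loaded model exposes Hugging Face’s standard get_memory_footprint helper; the snippet below is an optional check rather than part of the original notebook:

# Report the in-memory size of the 4-bit quantized model in gigabytes
print(f"Model memory footprint: {model.get_memory_footprint() / 1024**3:.2f} GB")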

Configuring LoRA with Unsloth

model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 16,
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none",    # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
    random_state = 3407,
    use_rslora = False,  # We support rank stabilized LoRA
    loftq_config = None, # And LoftQ
    max_seq_length = max_seq_length
)

We apply LoRA (Low-Rank Adaptation) to our 4-bit loaded model, specifying the rank (r), alpha (lora_alpha), and dropout settings. Setting use_gradient_checkpointing = "unsloth" enables more efficient memory usage and allows training with longer context lengths. Additional LoRA options like use_rslora and loftq_config are available for more advanced fine-tuning techniques but are disabled here for simplicity. Finally, we set the maximum sequence length to match our earlier configuration.
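
Because get_peft_model returns a PEFT-wrapped model, you should be able to confirm how small the trainable footprint is; this is a minimal check under that assumption:

# Only the LoRA adapter weights should be marked trainable, a small fraction of the 3B parameters
model.print_trainable_parameters()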

Mounting Google Drive

from google.colab import drive
drive.mount("/content/drive")

We import the Google Colab drive module and mount Google Drive so that the training run can write checkpoints and logs to it from within the Colab environment.

Setting Up and Running the Training Loop

trainer = SFTTrainer(
    model = model,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    tokenizer = tokenizer,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 10,
        # num_train_epochs = 1, # Set this for 1 full training run.
        max_steps = 60,
        learning_rate = 2e-4,
        fp16 = not torch.cuda.is_bf16_supported(),
        bf16 = torch.cuda.is_bf16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "/content/drive/My Drive/Llama-3.2-3B-Instruct-bnb-4bit"
    ),
)
trainer.train()

We create an instance of SFTTrainer with our loaded model, tokenizer, and Python code dataset, specifying the text field for training. The TrainingArguments define key hyperparameters such as batch size, learning rate, maximum training steps, and hardware-specific settings like fp16 or bf16. In this example, we set the output directory to Google Drive to conveniently store checkpoints and logs. Finally, we invoke the trainer.train() method to begin the fine-tuning process.
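
Note that with per_device_train_batch_size = 2 and gradient_accumulation_steps = 4, each optimizer step sees an effective batch of 8 sequences. To see how much GPU memory the run actually consumed, a plain PyTorch check (an optional addition) can be run right after trainer.train():

# Peak GPU memory reserved during training, compared against the device total
peak_gb = torch.cuda.max_memory_reserved() / 1024**3
total_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
print(f"Peak reserved GPU memory: {peak_gb:.2f} GB of {total_gb:.2f} GB")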

Saving the Fine-Tuned Model

model.save_pretrained("lora_model")  # Local saving
tokenizer.save_pretrained("lora_model")

We save the LoRA-trained model and its tokenizer to a local folder named lora_model. This allows you to load and use the fine-tuned model later without repeating the training process.
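
As a sketch of how you might reuse the saved adapters later (the prompt here is illustrative, and for_inference is Unsloth’s fast-inference helper as used in its example notebooks), you can point FastLanguageModel.from_pretrained at the lora_model folder and generate:

from unsloth import FastLanguageModel

# Reload the LoRA adapters saved above from the local "lora_model" folder
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "lora_model",
    max_seq_length = 2048,
    dtype = None,
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's faster inference mode

prompt = "Write a Python function that checks whether a number is prime."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))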

In conclusion, throughout this tutorial, we demonstrated how to fine-tune the Llama 3.2 3B Instruct model on a Python code dataset using the Unsloth library, LoRA, and efficient 4-bit quantization. By leveraging the provided scripts, you can train a smaller, memory-efficient model that excels at both generating and understanding Python code. In the process, we showcased the integration of Unsloth for optimized memory usage, LoRA for flexible model adaptation, and Hugging Face tools for dataset handling and training. This setup enables you to build and customize language models tailored to specific code-related tasks, improving accuracy and resource efficiency.


Download the Colab Notebook here. All credit for this research goes to the researchers of this project.
