In this blog post, we explore the process of fine-tuning the Mistral 7B model using Hugging Face AutoTrain to enhance the generation of Midjourney prompts.
We begin by setting up the necessary working environment, ensuring all tools and dependencies are ready for seamless operation. Next, we guide you through loading the dataset, a crucial step in preparing for effective fine-tuning.
We then delve into the core of the process: fine-tuning the large language model (LLM) with AutoTrain, highlighting key techniques and configurations.
Following this, we discuss how to use the fine-tuned model for inference, enabling practical applications of your enhanced model. Finally, we cover how to load the PEFT model efficiently using the model upload feature, ensuring smooth integration into your workflows. This comprehensive guide aims to equip you with the skills needed to optimize LLMs for specific creative tasks.
Table of Contents:
Setting Up Working Environment
Load the Dataset
Fine-tune LLM with AutoTrain
Use the Fine-Tuned LLM for Inference
My New E-Book: Prompt Engineering Best Practices for Instruction-Tuned LLM
I am happy to announce that I have published a new ebook, Prompt Engineering Best Practices for Instruction-Tuned LLM. It is a comprehensive guide designed to equip readers with the essential knowledge and tools to master the fine-tuning and prompt engineering of large language models (LLMs). The book covers everything from foundational concepts to advanced applications, making it an invaluable resource for anyone interested in leveraging the full potential of instruction-tuned models.
1. Setting Up Working Environment
We will start by installing two essential Python libraries: pandas and autotrain-advanced. The pandas library is commonly used for data manipulation and analysis, making it easier to structure and preprocess datasets.
Meanwhile, autotrain-advanced is a Hugging Face package that automates much of the fine-tuning process for large language models (LLMs), streamlining tasks like hyperparameter tuning and dataset handling.
The -q flag ensures the installation runs quietly, keeping the output clean and focused. Together, these tools set the foundation for fine-tuning Mistral 7B efficiently.
!pip install pandas autotrain-advanced -q
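If you want to confirm that both packages installed correctly, a quick version check (an optional addition of my own, not part of the original walkthrough) looks like this:
from importlib.metadata import version
# Print the installed versions of both packages to confirm the install succeeded
print("pandas:", version("pandas"))
print("autotrain-advanced:", version("autotrain-advanced"))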
Next, we will configure AutoTrain and ensure that your environment is using the latest version of PyTorch.
!autotrain setup --update-torch
Finally, you will need to log in to Hugging Face to be able to load and train the model. You can follow the steps below to do so:
1. Go to the Hugging Face access tokens page
2. Create a write token and copy it to your clipboard
3. Run the code below and enter your token
from huggingface_hub import notebook_login
notebook_login()
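To confirm the login succeeded, you can query your account info with whoami from huggingface_hub (an optional check I am adding here; it is not required for training):
from huggingface_hub import whoami
# Prints the username associated with the token you just entered
print(whoami()["name"])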
Now that our working environment is ready, we can start the training process by loading the data.
2. Load the Dataset
We will use the dataset from the finetune-llama-2 GitHub repository, so we will start by cloning it; the repository contains useful scripts and configurations for fine-tuning models like Mistral 7B. Running !git clone downloads the entire repository into your local environment.
Next, the %cd finetune-llama-2 command changes the directory into the cloned repository, allowing you to interact with its files. The %mv train.csv ../train.csv command moves a dataset file (train.csv) from the repository to the parent directory, where it can be more easily accessed for training.
Finally, %cd .. navigates back to the parent directory. These steps help you set up and organize the dataset required for fine-tuning, making sure the data is ready for use in the next stages.
!git clone https://github.com/joshbickett/finetune-llama-2.git
%cd finetune-llama-2
%mv train.csv ../train.csv
%cd ..
Now that we have the data, we can read and display it using pandas.
import pandas as pd
df = pd.read_csv("train.csv")
df
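Before moving on, it can help to check the dataset's size and look for missing values. This is a small optional inspection of my own, assuming (as the examples below confirm) that the prompts live in a column named text:
# Number of rows/columns and a check for empty entries in the 'text' column
print(df.shape)
print(df['text'].isna().sum(), "missing values in the 'text' column")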
To get a better idea of the data, let's look at the first two examples. We can see that each example instructs the model to write a prompt for generating images with the Midjourney image generation tool.
df['text'][1]
###Human:
Generate a midjourney prompt for A robot on a first date###Assistant:
A robot, with a bouquet of USB cables, nervously adjusting its antennas, at a romantic restaurant that serves electricity shots.
df['text'][2]
###Human:
Generate a midjourney prompt for A snail at a speed contest###Assistant:
A snail, with a mini rocket booster, confidently lining up at the start line, with a crowd of cheering insects.
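For illustration, here is a small hypothetical helper (not part of the cloned repository) that builds a new example in the same ###Human:/###Assistant: template, in case you want to extend the dataset with rows of your own:
# Hypothetical helper: format a subject/prompt pair in the dataset's template
def make_example(subject: str, prompt: str) -> str:
    return (
        f"###Human:\nGenerate a midjourney prompt for {subject}"
        f"###Assistant:\n{prompt}"
    )

print(make_example(
    "A cat at a chess tournament",
    "A cat, wearing tiny reading glasses, pondering its next move on a giant chessboard.",
))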
Since the data comes already prepared from another fine-tuning project, there is no need to invest time in preprocessing, and we can jump directly to the fine-tuning step.
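That said, a quick sanity check that every row actually follows the expected template is cheap. This is a sketch of my own, assuming the markers shown in the examples above:
# Verify that every row contains both the ###Human: and ###Assistant: markers
ok = df['text'].str.contains("###Human:") & df['text'].str.contains("###Assistant:")
print(f"{ok.sum()} of {len(df)} rows match the expected format")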
3. Fine-tune LLM with AutoTrain
AutoTrain Advanced (or simply AutoTrain), developed by Hugging Face, is a robust no-code platform designed to simplify the process of training state-of-the-art models across multiple domains: natural language processing (NLP), computer vision (CV), and even tabular data analysis.
This tool leverages the powerful frameworks created by various teams at Hugging Face, making advanced machine learning and artificial intelligence accessible to a broader audience without requiring deep technical expertise.
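To give a concrete feel for how little setup is involved, here is a sketch of what a typical training invocation can look like. The exact flag names and defaults vary between autotrain-advanced versions, and the hyperparameters below are illustrative placeholders rather than the values used in this post, so run autotrain llm --help to confirm the options available in your install:
# Illustrative AutoTrain command: fine-tune Mistral 7B with PEFT (LoRA) and
# 4-bit quantization on the train.csv in the current directory.
!autotrain llm --train \
  --project_name mistral-7b-midjourney \
  --model mistralai/Mistral-7B-v0.1 \
  --data_path . \
  --use_peft \
  --use_int4 \
  --learning_rate 2e-4 \
  --train_batch_size 4 \
  --num_train_epochs 3 \
  --trainer sft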