Fine Tuning Llama
Using Release.ai to fine-tune Llama-based models.
Training and tuning in-house models is a great way to improve performance and results from LLMs without resorting to techniques that slow down inference and increase its cost, such as RAG and context stuffing.
Creating a fine tuning application
From the Applications page, select **Create Application**, then **Create from Template** in the drop-down. Select **Runnable**, as this is a job that will run and complete. For this example we'll select **Axolotl**, as it's the simplest to get working. Click **Finish**.
Configuring your fine tuning application
To allow your job to save models to S3, you must link the application template to your S3 bucket. First you'll need to find your bucket name, which is the name of your cluster, plus its Release ID, plus `static-builds`, separated with dashes.
To get the cluster's Release ID, click on the cluster you want to use under **Configuration** > **Clusters** and copy the last part of the URL. For our example cluster called `release-ai` with an ID of `0ezcy7w`, the bucket name would be `release-ai-0ezcy7w-static-builds`.
At the top level of your application template, define the `s3_volumes` key.
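A minimal sketch of that top-level entry, using the bucket name from the example above; the nested field name (`bucket`) is an assumption, so check it against your template's schema:

```yaml
# Top level of the application template.
# A sketch: `s3_volumes` is the key described above; the `bucket`
# field name is an assumption and may differ in your schema.
s3_volumes:
- bucket: release-ai-0ezcy7w-static-builds
```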
Then, under the `axolotl` service, add the S3 mount to the `volumes` array.
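A sketch of that service-level mount, again using the example bucket; the `type`, `bucket`, and `mount_path` field names are assumptions and the mount path is a hypothetical placeholder, so verify both against your template's schema:

```yaml
services:
- name: axolotl
  # ... existing service configuration ...
  volumes:
  # A sketch of the S3 mount; field names are assumptions.
  - type: s3
    bucket: release-ai-0ezcy7w-static-builds
    mount_path: /mnt/training-output  # hypothetical path inside the container
```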
Running the job
Now click the **Run Job** button. The model location or name can be either a HuggingFace model ID or an S3 location of a model. For this example we'll use `meta-llama/Meta-Llama-3-8B`. The tuning dataset location should be an S3 path to a .json file containing tuning data. It should have the format shown below.
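A minimal sketch of that format, assuming Axolotl's alpaca-style instruction layout (the same shape as the HuggingFace dataset referenced below); verify the field names against your chosen Axolotl dataset type:

```json
[
  {
    "instruction": "The prompt or question posed to the model.",
    "input": "Optional supporting context; may be an empty string.",
    "output": "The response the model should learn to produce."
  }
]
```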
For illustration, a hypothetical record in this format might look like:
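```json
{
  "instruction": "Summarize the main benefit of fine-tuning an in-house model.",
  "input": "",
  "output": "Fine-tuning improves quality on your domain without adding inference-time cost the way RAG or context stuffing can."
}
```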
You can also use datasets from HuggingFace, such as `mhenrichsen/alpaca_2k_test`.
For this example we'll leave the model type as `llama3` and click the **Run Job** button.
Once the job has finished, there will be a folder called `axolotl-out-YYYY-MM-DD` created in your S3 bucket.
Now you can take this folder and load it into your favorite inference server to start sending it queries.
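As a sketch of that hand-off, assuming the AWS CLI and vLLM are installed locally (the date suffix and local path below are placeholders, and vLLM is just one of many inference servers you could use):

```bash
# Copy the fine-tuned model out of the bucket from the example above.
# The date suffix is a placeholder; use the folder your job actually created.
aws s3 sync s3://release-ai-0ezcy7w-static-builds/axolotl-out-2024-01-01 ./fine-tuned-llama

# Serve the weights with vLLM's OpenAI-compatible server.
vllm serve ./fine-tuned-llama
```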