Fine Tuning LlamaX

Using Release.ai to fine-tune Llama-based models.

Training and tuning in-house models is a great way to improve the performance and results you get from LLMs without resorting to techniques that slow down inference and increase its cost, such as RAG and context-stuffing.

Creating a fine-tuning application

From the Applications page, select Create Application, then Create from Template in the drop-down. Select Runnable, since this is a job that will run to completion. For this example we'll select Axolotl, as it's the simplest to get working. Click Finish.

Configuring your fine-tuning application

To allow your job to save models to S3, you must link the application template to your S3 bucket. First, you'll need to find your bucket name, which is the name of your cluster, plus its Release ID, plus static-builds, separated by dashes.

To get the cluster's Release ID, click the cluster you want to use under Configuration > Clusters and copy the last part of the URL. For our example cluster named release-ai with an ID of 0ezcy7w, the bucket name would be release-ai-0ezcy7w-static-builds.

At the top level of your application template, define s3_volumes:

s3_volumes:
- bucket: release-ai-0ezcy7w-static-builds

Then, under the axolotl service, add the S3 mount to the volumes array:

  - type: s3
    bucket: release-ai-0ezcy7w-static-builds
    mount_path: "/bucket"
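
Putting the two pieces together, the relevant parts of the Application Template end up looking roughly like the sketch below. This is only a sketch: it assumes the Axolotl template generated a service named axolotl, and the omitted keys (image, resources, and so on) will already be present in your template.

s3_volumes:
- bucket: release-ai-0ezcy7w-static-builds

services:
- name: axolotl
  # ... image, resources, and other service settings already in your template ...
  volumes:
  # Mount the bucket into the container so the job can write the tuned model to it
  - type: s3
    bucket: release-ai-0ezcy7w-static-builds
    mount_path: "/bucket"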

Running the job

Now click the Run Job button. Model location or name can be either a HuggingFace model ID or an S3 location of a model. For this example we'll use meta-llama/Meta-Llama-3-8B. Tuning dataset location should be an S3 path to a .json file containing tuning data, with one JSON object per line in the following format:

{"input": "...","output": "..."}

A few entries from our tuning data look like this:

{"input": "How are Release environments built?", "output": "Release environments are built from an Application Template using default environment variables. The Application Template defines high-level concepts like the Kubernetes cluster, domain, hostnames, and resources allocated to your services."}
{"input": "How do developers interact with Release?","output": "Developers generally interact with Release by creating a pull request in their VCS (version control system), which automatically creates an environment with the code changes in the PR. Webhooks are calls made from your VCS back to Release when certain events happen."}
{"input": "How does Release eliminate time spent managing environments?","output": "With Release, all of the management and tooling required to build a flexible environment ecosystem comes out of the box. You can create an application template using default environment variables."}

You can also use datasets from HuggingFace, such as mhenrichsen/alpaca_2k_test. For this example we'll leave the model type as llama3 and click the Run Job button.

Once the job has finished, there will be a folder called axolotl-out-YYYY-MM-DD created in your S3 bucket.
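
If you want to check for or download the tuned model, one option is the AWS CLI. This is just an example; it assumes your local credentials can read the bucket and uses the example bucket name from this guide:

# List the contents of the bucket to find the output folder
aws s3 ls s3://release-ai-0ezcy7w-static-builds/

# Download a finished run locally (substitute the actual folder name)
aws s3 cp s3://release-ai-0ezcy7w-static-builds/axolotl-out-YYYY-MM-DD/ ./axolotl-out-YYYY-MM-DD/ --recursive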

Now you can take this folder and load it into your favorite inference server to start sending it queries.
