Template Configuration Basics
This page provides an overview of a Release AI template configuration, focusing on how to use parameters as user inputs, select GPU resources, and run specific jobs such as pulling models in Ollama and running data ingestion for vector databases.
Configuration Overview
The configuration file defines various services, resources, and workflows necessary to set up and manage an AI environment. Here’s a breakdown of the key components:
Hostnames
The hostnames section specifies the domain naming convention for different services based on the environment ID (`env_id`) and the domain (`domain`). Note that hostnames require a service with node ports.
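For illustration, a minimal hostnames sketch, assuming a service named `ollama` and `${env_id}`/`${domain}` template variables:

```yaml
# Hypothetical entry: maps the "ollama" service to a hostname
# generated from the environment ID and domain.
hostnames:
- ollama: ollama-${env_id}.${domain}
```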
Default Resources
Specifies the default CPU and memory limits and requests, as well as the number of replicas for scalability. These are applied to all services and jobs defined in your template. You can override these values on a per-service or per-job basis.
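A hedged sketch of what a default resources block might look like; the limits, requests, and replica count shown are placeholder values:

```yaml
# Defaults applied to every service and job unless overridden.
resources:
  cpu:
    limits: 1000m   # maximum CPU a container may use
    requests: 100m  # CPU reserved at scheduling time
  memory:
    limits: 4Gi
    requests: 1Gi
  replicas: 1       # default replica count for scalability
```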
Ingress
Configures the ingress controller to manage incoming HTTP requests for uploading documents.
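As a sketch, one common reason to configure ingress for document uploads is to raise the request body size limit; the `proxy_body_size` option below is an assumption, not a confirmed field name:

```yaml
# Hypothetical ingress setting: allow larger upload payloads
# than the controller's default body-size limit.
ingress:
  proxy_body_size: 100m
```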
Shared Volumes
Defines shared storage volumes for persistent data storage.
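A minimal sketch of a shared volume definition; the name, size, and type shown are illustrative:

```yaml
# Hypothetical shared volume for persistent data that multiple
# services or jobs can mount.
shared_volumes:
- name: shared-data
  size: 10Gi
  type: persistent
```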
Parameters
Parameters are user inputs that can be configured dynamically during deployment. This flexibility ensures the environment can be tailored to specific requirements without modifying the core configuration file. For instance, the `llm` and `embedding_model` parameters can be set to different values depending on the desired models and embeddings.
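A hedged sketch of parameter definitions; the field layout and the default values are assumptions, though the `llm` and `embedding_model` names match the parameters referenced above:

```yaml
# Hypothetical parameter definitions with example defaults that a
# user can override at deployment time.
parameters:
- name: llm
  value: llama3
- name: embedding_model
  value: nomic-embed-text
```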
Services
Defines the different services that make up the AI stack, including their configurations, environment variables, and resource requests. The example Ollama service configuration includes a persistent volume and mount point to store downloaded models. This ensures that models are retained even if the service is restarted.
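A minimal sketch of such an Ollama service; the image, port, and mount path follow Ollama's published defaults, while the surrounding field names are assumptions:

```yaml
# Hypothetical Ollama service with a persistent volume so downloaded
# models survive restarts.
services:
- name: ollama
  image: ollama/ollama       # official Ollama image
  ports:
  - type: node_port          # hostnames require a node-port service
    target_port: '11434'     # Ollama's default listen port
    port: '11434'
  volumes:
  - type: persistent
    name: ollama-models
    mount_path: /root/.ollama  # where Ollama stores pulled models
```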
GPU Resources
To select GPU resources, use the `node_selector` field to specify nodes with GPU capabilities:
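A hedged sketch; the exact node_selector shape and the GPU label depend on your cluster, so the `nvidia.com/gpu` key and value below are assumptions:

```yaml
# Hypothetical GPU scheduling: pin the service to nodes carrying a
# GPU label. Label keys and values vary by cluster and provider.
services:
- name: ollama
  node_selector:
    key: nvidia.com/gpu
    value: 'true'
```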
Running Jobs
Jobs are defined to perform specific tasks, such as pulling models or ingesting data. An example job configuration for pulling models with Ollama:
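A minimal sketch, assuming the job can run Ollama's CLI and that the `llm` parameter is interpolated with `${llm}`; the `from_services` reuse shown here is an assumption:

```yaml
# Hypothetical job that pulls the model named by the llm parameter.
# In practice the job may need OLLAMA_HOST pointed at the running
# ollama service so the pull reaches the server.
jobs:
- name: pull-model
  from_services: ollama      # reuse the service's image and config
  command:
  - ollama
  - pull
  - ${llm}
```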
Workflows
Workflows orchestrate the sequence of tasks to be executed, ensuring dependencies are respected. The `setup`, `patch`, and `teardown` workflows manage the environment lifecycle. You can also define custom workflows, which can be kicked off via the CLI or manually via the user interface.
Example setup workflow:
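A minimal sketch, assuming a step-based `parallelize` syntax and the hypothetical `pull-model` job from above:

```yaml
workflows:
- name: setup
  parallelize:
  - step: start-services
    tasks:
    - services.all           # bring up all defined services first
  - step: pull-models
    tasks:
    - jobs.pull-model        # then pull models once Ollama is up
```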
Example custom workflow:
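Again a hedged sketch; the workflow name and the `data-ingestion` job it triggers are hypothetical:

```yaml
# Custom workflow that can be kicked off via the CLI or the UI,
# e.g. to (re)ingest documents into the vector database.
workflows:
- name: ingest-data
  parallelize:
  - step: ingest
    tasks:
    - jobs.data-ingestion
```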