Using GPU Resources
Kubernetes, the backbone of Release AI, provides mechanisms to allocate GPUs to your applications. This is crucial for running compute-intensive tasks such as machine learning model training, deep learning inference, and high-performance data processing.
Example Configurations for GPU Usage
Selecting GPUs for Your Service
To start, you need to specify that your service requires GPUs. This can be done by setting the `nvidia.com/gpu` limits. Below is an example configuration:
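A minimal sketch of what this might look like in an application template is shown below. The image name and the exact key nesting are illustrative assumptions based on the field names described here; consult your template schema for the authoritative layout.

```yaml
services:
- name: mlapp
  # Illustrative CUDA-enabled image; substitute your own build.
  image: nvcr.io/nvidia/cuda:12.2.0-runtime-ubuntu22.04
  # Run four instances of the service.
  replicas: 4
  # Request one NVIDIA GPU per instance.
  nvidia_com_gpu:
    limits: 1
  # Assumes NVIDIA GPU Feature Discovery labels are present on GPU nodes.
  node_selector:
    nvidia.com/gpu.present: "true"
```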
In this example:
- The `mlapp` service is configured to run with an image that leverages CUDA.
- The `replicas` field is set to `4`, meaning four instances of this service will run.
- The `nvidia_com_gpu.limits` is set to `1`, specifying that each instance will use one GPU.
- The `node_selector` ensures that the service is scheduled on nodes with GPU resources.
Selecting an Instance Type Known to Have GPUs
Choosing the right instance type ensures your application can access the GPU resources it needs. Here’s how you can specify an instance type with GPUs:
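One way this selection might look is sketched below, assuming your GPU nodes carry the NVIDIA GPU Feature Discovery product label. Note that newer Kubernetes versions use `node.kubernetes.io/instance-type` in place of the deprecated `beta.kubernetes.io/instance-type` label.

```yaml
node_selector:
  # Label applied by NVIDIA GPU Feature Discovery on GPU nodes.
  nvidia.com/gpu.product: A10G
  # AWS g5.12xlarge instances ship with NVIDIA A10G GPUs.
  beta.kubernetes.io/instance-type: g5.12xlarge
```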
In this example:
- The `nvidia.com/gpu.product` key is set to `A10G`, specifying that the nodes should have NVIDIA A10G GPUs.
- The `beta.kubernetes.io/instance-type` key is set to `g5.12xlarge`, which is a known instance type that includes GPU resources.
Complete Configuration Example
Combining the GPU selection and instance type configuration, here is the complete setup:
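Putting both pieces together, a sketch of the full service definition might look like the following. The replica count and image are illustrative, and the key nesting is inferred from the field names above; verify them against your application template schema.

```yaml
services:
- name: mlapp
  # Illustrative CUDA-optimized image.
  image: nvcr.io/nvidia/cuda:12.2.0-runtime-ubuntu22.04
  # Number of instances to run (illustrative value).
  replicas: 2
  # Request two NVIDIA GPUs per instance.
  nvidia_com_gpu:
    limits: 2
  # Restrict scheduling to nodes with A10G GPUs on g5.12xlarge instances.
  node_selector:
    nvidia.com/gpu.product: A10G
    beta.kubernetes.io/instance-type: g5.12xlarge
```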
In this example:
- The `services` section defines the application services in your deployment.
- The `mlapp` service runs a Docker image optimized for CUDA operations.
- The `replicas` field specifies the number of instances to run.
- The `nvidia_com_gpu.limits` ensures each instance gets `2` GPUs.
- The `node_selector` filters the nodes to only those with the specified GPU (`A10G`) and instance type (`g5.12xlarge`).