Workspaces

Overview

Release Workspaces give multiple containers within an environment or runnable job access to a common filesystem. This filesystem can contain both regular shared files, which your containers can create and modify, and mounts of external resources such as cloud storage buckets (S3, GCS, etc.) or Git repositories.

Defining Workspaces

Single Auto-Attached Workspace

In most cases, all you need is a single workspace per environment that is automatically attached to all of your containers.

The recommended basic configuration is:

workspaces:
  - name: default
    auto_attach: true
    path: /workspace

When auto_attach: true is set, the workspace is automatically attached to every job and service defined in your Application Template and is accessible at /workspace within the container's filesystem.
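
As a sketch of what auto-attachment means in practice, the job and service below declare no workspaces key of their own, yet both should see the same files under /workspace. The names seed-data and api, and their images, are hypothetical placeholders:

workspaces:
  - name: default
    auto_attach: true
    path: /workspace

jobs:
  # Hypothetical job; no workspaces key is needed because of auto_attach
  - name: seed-data
    image: seed-data

services:
  # Hypothetical service; reads what seed-data wrote under /workspace
  - name: api
    image: api-server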

Multiple Workspaces (Advanced)

In more advanced cases you may want to have multiple workspaces and control which jobs and services have access to which workspaces.

Here's an example configuration:

workspaces:
  - name: frontend
    path: /assets
  - name: backend
    path: /assets

services:
  - name: frontend
    image: frontend-server
    workspaces:
      - frontend

  - name: backend
    image: backend-server
    workspaces:
      - backend

jobs:
  - name: build-frontend-assets
    image: build-frontend
    workspaces:
      - frontend

  - name: build-backend-assets
    image: build-backend
    workspaces:
      - backend

In this example, the frontend server and build job share one /assets directory, and the backend server and build job share another. Because the two directories belong to different workspaces, they are completely isolated filesystems.
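
The workspaces key on a job or service is a list, so a container can presumably name more than one workspace. The sketch below assumes this is supported and gives each workspace a distinct path so that the two do not collide. The names assets, cache, and worker are hypothetical:

workspaces:
  - name: assets
    path: /assets
  - name: cache
    path: /cache

services:
  # Assumption: one container may attach several workspaces at once
  - name: worker
    image: worker
    workspaces:
      - assets
      - cache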

Use Cases

Sharing files between jobs

A common use case is a pipeline of jobs in which each job works on files produced by earlier jobs. Workspaces make this simple by providing a known location for reading and writing files.

Example:

workspaces:
  - name: default
    auto_attach: true
    path: /workspace

jobs:
  # Downloads a model and writes it to /workspace/model
  - name: download
    image: download-model

  # Finetunes the model at /workspace/model and writes it to /workspace/model-finetuned
  - name: finetune
    image: finetune-model

services:
  # Runs an inference server for the finetuned model from /workspace/model-finetuned
  - name: serve
    image: serve-model

workflows:
  - name: setup
    parallelize:
      - step: download
        tasks:
          - jobs.download
      - step: finetune
        tasks:
          - jobs.finetune
      - step: serve
        tasks:
          - services.serve

Mounting Git repositories

To make the contents of a Git repository available to one or more of your containers, use Workspace Mounts.

Example:

workspaces:
  - name: default
    auto_attach: true
    path: /workspace
    mounts:
      - path: my-repo
        source_url: https://github.com/org/repo#branch

services:
  # Runs a notebook with the code checked out at /workspace/my-repo
  - name: notebook
    image: notebook

Note that writes to a Git repository mount are allowed but are visible only locally. The container must commit and push its changes to make them accessible remotely.

To dynamically mount a Git repository based on user input, see Parameters.

Mounting cloud storage buckets

To make the contents of a cloud storage bucket available to one or more of your containers, use Workspace Mounts.

Example:

workspaces:
  - name: default
    auto_attach: true
    path: /workspace
    mounts:
      - path: my-training-data
        source_url: s3://my-bucket/my/prefix/training-data

jobs:
  # Runs a training job with the training data available at
  # /workspace/my-training-data
  - name: training
    image: training
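
Because mounts is a list, a single workspace can presumably carry several mounts at once. The following sketch assumes that and combines the Git and S3 examples from this page so that code and data sit side by side:

workspaces:
  - name: default
    auto_attach: true
    path: /workspace
    mounts:
      # Code, expected at /workspace/my-repo
      - path: my-repo
        source_url: https://github.com/org/repo#branch
      # Data, expected at /workspace/my-training-data
      - path: my-training-data
        source_url: s3://my-bucket/my/prefix/training-data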

To set S3 mount options, you can add query string parameters to your source_url. The query string parameters correspond directly to the Mountpoint S3 configuration flags.

Explicitly setting the bucket region

workspaces:
  - name: default
    auto_attach: true
    path: /workspace
    mounts:
      - path: my-training-data
        source_url: s3://my-bucket/my/prefix/training-data?region=us-west-2
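
Several options can be joined with & in the same query string. As a sketch, assuming that value-less Mountpoint flags such as --read-only are passed as a bare parameter name, a read-only mount in an explicit region might look like this:

workspaces:
  - name: default
    auto_attach: true
    path: /workspace
    mounts:
      - path: my-training-data
        # Assumption: read-only maps to Mountpoint's --read-only flag
        source_url: s3://my-bucket/my/prefix/training-data?region=us-west-2&read-only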

Mounting a public S3 bucket

To mount a public S3 bucket, disable request signing and set the bucket region. For example, to mount this public dataset of Southern California Earthquake Data you'd specify the source_url as:

workspaces:
  - name: default
    auto_attach: true
    path: /workspace
    mounts:
      - source_url: s3://scedc-pds?no-sign-request&region=us-west-2
        path: /sc-earthquake-data

To dynamically mount a cloud storage bucket or file based on user input or a file upload, see Parameters.
