Container-based data

Container-based data is the practice of building a container image with the data pre-baked into it, generally loaded into the same database technology that you use in production.

There are many ways to model this, but most databases with publicly available images (like MySQL and Postgres) document a directory in the image where you can add files to have them imported automatically on startup.

An example for MySQL would be a Dockerfile that looks like this:

```dockerfile
FROM mysql
COPY backup.sql /docker-entrypoint-initdb.d/backup.sql
```

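This container, built in Release, will automatically load `backup.sql` into the database the first time it starts up. The official image only runs these import files when the data directory is empty, so a restart with existing data does not reload the backup.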

A more sophisticated model for larger datasets is to add a custom startup script to the image with Docker's `COPY` instruction. The script runs `aws s3 cp` to download a large backup and then restores it into the database.
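As a rough illustration of such a startup script, here is a minimal sketch. It assumes the AWS CLI is installed in the image and has credentials available, that the script is copied into `/docker-entrypoint-initdb.d/`, and that the standard `MYSQL_ROOT_PASSWORD` and `MYSQL_DATABASE` variables are set. The function name `restore_from_s3` and the `BACKUP_S3_URI` variable are hypothetical names chosen for this example, not part of MySQL, Docker, or Release.

```shell
#!/bin/sh
# Hypothetical init script, e.g. /docker-entrypoint-initdb.d/restore.sh.
# Assumes the AWS CLI is installed in the image and credentials are available.
set -e

# Download a SQL dump from S3 and load it into the target database.
#   $1: S3 URI of the dump (for example s3://example-bucket/backup.sql)
#   $2: name of the database to restore into
restore_from_s3() {
  dump_file="$(mktemp)"

  # Fetch the backup from S3 into a temporary file.
  aws s3 cp "$1" "$dump_file"

  # Load the dump using the root password configured for the container.
  mysql -uroot -p"${MYSQL_ROOT_PASSWORD}" "$2" < "$dump_file"

  # Remove the temporary file once the restore has finished.
  rm -f "$dump_file"
}

# Only restore when a backup location is configured.
# BACKUP_S3_URI is an assumed variable name for this sketch.
if [ -n "${BACKUP_S3_URI:-}" ]; then
  restore_from_s3 "$BACKUP_S3_URI" "${MYSQL_DATABASE:?MYSQL_DATABASE must be set}"
fi
```

With this approach the image itself stays small, since the backup is fetched at startup rather than copied in at build time. The trade-off is that the container needs network access and S3 credentials when it first starts.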

We generally recommend [Instant Datasets](../reference-guide/instant-datasets.md) over baking data into a container, because they make it easier to keep the data up to date. With a container-based approach, you will have to regenerate your `.sql` file and commit it to the repository on a regular basis.
