Dockerfiles
A Dockerfile is a text document that contains a set of instructions used to build a Docker image. Docker images are the blueprints for containers. The Dockerfile acts as a recipe, specifying everything needed to assemble the image: the base operating system, software dependencies, environment variables, network configurations, and the application code itself.
By automating the image creation process, a Dockerfile ensures consistency across different environments, from development to production.
Here is a Dockerfile that creates an image based on postgres:17 and includes an initialization script:
FROM postgres:17
# Copy the initialization script into the container
COPY student.sql /docker-entrypoint-initdb.d/student.sql
The initialization script student.sql sits right next to the Dockerfile, and its contents could look like this:
CREATE TABLE IF NOT EXISTS student (
id SERIAL PRIMARY KEY,
name VARCHAR(50)
);
INSERT INTO student (id, name)
VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Charlie'),
(4, 'David'), (5, 'Elisabeth'), (6, 'Ferdinand')
ON CONFLICT (id) DO NOTHING;
We can then build an image from this Dockerfile with the following command:
docker build --tag mypostgres:latest .
Once the image has been built, you can list it using the following command:
docker images
The automatic execution of SQL files inside /docker-entrypoint-initdb.d/ on first startup is a feature of this specific PostgreSQL image, set up by its maintainers. The scripts only run when the container starts with an empty data directory.
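The first-startup behaviour can be pictured with a small shell sketch. It is not the image's actual entrypoint script, only a simplified illustration that assumes the decision boils down to whether the data directory already contains a non-empty PG_VERSION file. The helper name needs_init and the scratch directory are hypothetical.

```shell
#!/bin/sh
# Simplified illustration only; the real entrypoint script does much more.
# needs_init is a hypothetical helper: it succeeds when the data directory
# has no non-empty PG_VERSION file, i.e. no database has been created yet.
needs_init() {
    [ ! -s "$1/PG_VERSION" ]
}

# A scratch directory stands in for the postgres-data volume
pgdata="$(mktemp -d)"

# First start: the directory is empty, so init scripts would run
if needs_init "$pgdata"; then
    first_start="run init scripts"
else
    first_start="skip init scripts"
fi

# Simulate a volume that already holds a database from an earlier run
echo "17" > "$pgdata/PG_VERSION"

# Second start: the directory is no longer empty, so init scripts are skipped
if needs_init "$pgdata"; then
    second_start="run init scripts"
else
    second_start="skip init scripts"
fi

echo "first start:  $first_start"
echo "second start: $second_start"
rm -rf "$pgdata"
```

This is why a reused volume never re-runs the SQL files: from the second start onwards the data directory is no longer empty.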
Make sure you have a clean slate before creating new containers with shared volumes. You can reuse the same volume if the version of PostgreSQL stays the same.
| Command | Purpose |
|---|---|
| docker rm --force demo-postgres | Stop and remove a container |
| docker volume rm postgres-data | Remove an unused volume |
Now that the image is built, we can run the container. If it does not exist yet, start by creating the volume:
docker volume create postgres-data
Then, run the container as we did previously:
docker run --name demo-postgres \
--env POSTGRES_PASSWORD=mypassword \
--env POSTGRES_USER=demouser \
--env POSTGRES_DB=demodb \
--publish 5432:5432 \
--volume "postgres-data:/var/lib/postgresql/data" \
--detach \
mypostgres:latest
Here is a list of the most relevant instructions in a Dockerfile:
- FROM: Specifies the base image for the Docker image.
- RUN: Executes commands in a new layer. Used to install software or make system changes.
- COPY: Copies files from the host machine into the Docker image.
- ADD: Similar to COPY, but can also extract local compressed tar archives.
- WORKDIR: Sets the working directory for subsequent instructions.
- CMD: Specifies the default command to run when the container starts.
- EXPOSE: Documents which ports the container will use. This is not a security feature.
- ENV: Sets environment variables inside the container.
- LABEL: Adds metadata to the image. Optional, but useful for documentation.
- USER: Switches to the given user, so the instructions that follow are executed as that user.
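As a rough illustration of how several of these instructions combine, the following shell snippet writes a small Dockerfile into a scratch directory. The file contents are a sketch under assumptions, not a tested image: the label text and the choice of POSTGRES_DB as a default are made up for illustration, and student.sql is assumed to sit next to the Dockerfile at build time.

```shell
#!/bin/sh
# Write an illustrative Dockerfile into a scratch directory.
# Nothing is built here; the file contents are a sketch, not a tested image.
builddir="$(mktemp -d)"

cat > "$builddir/Dockerfile" <<'EOF'
# Base image to build upon
FROM postgres:17
# Metadata (the description text is a made-up example)
LABEL description="PostgreSQL with a student table"
# Default database name; can still be overridden with --env at run time
ENV POSTGRES_DB=demodb
# Later relative paths resolve against this directory
WORKDIR /docker-entrypoint-initdb.d
# Copy the script from the build context into the working directory
COPY student.sql .
# Document the port PostgreSQL listens on
EXPOSE 5432
# Default command; this matches what the base image already uses
CMD ["postgres"]
EOF

# Count the instruction lines that were written (comments are ignored)
instruction_count="$(grep -c -E '^(FROM|LABEL|ENV|WORKDIR|COPY|EXPOSE|CMD) ' "$builddir/Dockerfile")"
echo "Wrote $builddir/Dockerfile with $instruction_count instructions"
```

Because WORKDIR precedes COPY, the relative destination `.` resolves to /docker-entrypoint-initdb.d, so the script ends up in the same place as in the first example.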
A more elaborate Dockerfile to build our own custom image could look like this:
FROM postgres:17
# Extend PostgreSQL to support storing, indexing and querying geographic data
RUN apt-get update && \
apt-get install --yes --no-install-recommends postgresql-17-postgis-3 && \
apt-get clean
# Copy and extract the initialization scripts into the container
ADD init.tar.gz /docker-entrypoint-initdb.d/
# Document the port PostgreSQL is listening to inside the container
EXPOSE 5432
# Document the maintainer via metadata
LABEL maintainer="[email protected]"
Each RUN instruction creates a new layer (i.e., a snapshot of the image's filesystem changes). Fewer layers generally mean a smaller image and faster builds, so consolidating related operations into a single RUN instruction is considered a best practice that also helps Docker's layer cache work efficiently. It keeps cleanup effective as well: files deleted in a later layer still take up space in the earlier layer that created them.
Our init.tar.gz archive will consist of two files:
- A city.sql script that will install the postgis extension, create the city table and load some data into it.
- A student.sql script that will create the student table and load some data into it.
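Once such an archive exists, its contents can be checked by listing it. The sketch below works in a scratch directory with placeholder files standing in for the real scripts, so it does not touch any existing files; the placeholder contents are assumptions made purely for illustration.

```shell
#!/bin/sh
# Work in a scratch directory so existing files are not overwritten
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

# Placeholder files standing in for the real SQL scripts
echo "-- placeholder for city.sql" > city.sql
echo "-- placeholder for student.sql" > student.sql

# Build the gzip-compressed archive
tar --create --gzip --file init.tar.gz city.sql student.sql

# List the archive members to confirm what ADD would extract
archive_listing="$(tar --list --file init.tar.gz)"
echo "$archive_listing"
```

Both file names should appear in the listing, with no leading directory, so that ADD places them directly inside /docker-entrypoint-initdb.d/.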
Our student.sql file could look the same as before:
-- student.sql
CREATE TABLE IF NOT EXISTS student (
id SERIAL PRIMARY KEY,
name VARCHAR(50)
);
INSERT INTO student (id, name)
VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Charlie'),
(4, 'David'), (5, 'Elisabeth'), (6, 'Ferdinand')
ON CONFLICT (id) DO NOTHING;
Our new city.sql will install and make use of the PostGIS extension we added support for:
-- city.sql
CREATE EXTENSION IF NOT EXISTS postgis;
CREATE TABLE IF NOT EXISTS city (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
location GEOMETRY(POINT, 4326) NOT NULL
);
INSERT INTO city (name, location)
VALUES
('Barcelona', ST_GeomFromText('POINT(2.17340 41.38879)', 4326)),
('València', ST_GeomFromText('POINT(-0.37739 39.46975)', 4326)),
('Sevilla', ST_GeomFromText('POINT(-5.98376 37.38296)', 4326)),
('Palma', ST_GeomFromText('POINT(2.65024 39.56975)', 4326)),
('Maó', ST_GeomFromText('POINT(1.49675 39.65029)', 4326)),
('Ciutadella', ST_GeomFromText('POINT(4.00649 39.99439)', 4326)),
('Eivissa', ST_GeomFromText('POINT(1.44216 38.90629)', 4326))
ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name;
Because we are demonstrating the ADD instruction, we need to archive the student.sql and city.sql files:
tar --create --gzip --file init.tar.gz city.sql student.sql
We are now ready to use this Dockerfile and its associated archive to build our custom image:
docker build --tag mypostgres:latest .
Make sure you are not reusing an existing volume for this new PostgreSQL container. Otherwise, your SQL files will not be executed, because the initialization scripts only run when the data directory is empty.
Then, run the container:
docker run --name demo-postgres \
--env POSTGRES_PASSWORD=mypassword \
--env POSTGRES_USER=demouser \
--env POSTGRES_DB=demodb \
--publish 5432:5432 \
--volume "postgres-data:/var/lib/postgresql/data" \
--detach \
mypostgres:latest
Once the container is running, we can run psql interactively inside it:
docker exec -it demo-postgres psql -U demouser -d demodb
Then run the following query to find the closest available city to Granada (latitude: 37.17702, longitude: -3.59825):
SELECT name, ST_Distance(location, ST_SetSRID(ST_MakePoint(-3.59825, 37.17702), 4326)) AS distance
FROM city
ORDER BY location <-> ST_SetSRID(ST_MakePoint(-3.59825, 37.17702), 4326) ASC
LIMIT 1;
We can quit the interactive shell by pressing Ctrl + D or typing \q.
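Note that for GEOMETRY values in SRID 4326, ST_Distance returns a planar distance in degrees rather than in meters. As a sanity check on the query above, the following sketch recomputes that same planar distance outside the database with awk, using the coordinates copied from city.sql. It is only a stand-in for what PostGIS computes, not a replacement for it.

```shell
#!/bin/sh
# Coordinates copied from city.sql, one "name longitude latitude" per line.
# qx and qy hold the longitude and latitude of Granada.
nearest="$(awk -v qx=-3.59825 -v qy=37.17702 '
    {
        # Planar (Euclidean) distance in degrees between the city and Granada
        dx = $2 - qx
        dy = $3 - qy
        d = sqrt(dx * dx + dy * dy)
        # Keep the smallest distance seen so far
        if (NR == 1 || d < best) { best = d; name = $1 }
    }
    END { printf "%s %.4f", name, best }
' <<'EOF'
Barcelona 2.17340 41.38879
València -0.37739 39.46975
Sevilla -5.98376 37.38296
Palma 2.65024 39.56975
Maó 1.49675 39.65029
Ciutadella 4.00649 39.99439
Eivissa 1.44216 38.90629
EOF
)"
echo "$nearest"
```

With these coordinates the nearest city is Sevilla, which is what the SQL query should return as well. If a distance in meters is needed, one option is to cast both arguments of ST_Distance to the geography type.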