Exercise 3: Creating your own docker image

Objective

Learn how to create a docker image that you can use later

Access to the tutorial material

You can create the dockerfiles in this exercise by cutting and pasting from the screen, or, if you’ve cloned the repository, you will find them already in the tsi-cc/ResOps/scripts/docker/ subdirectory:

# Clone this documentation if you haven't already done so
> git clone https://gitlab.ebi.ac.uk/TSI/tsi-ccdoc
> cd tsi-ccdoc/tsi-cc/ResOps/scripts/docker/

Creating an image, step by step

You can modify an image interactively, as we saw in the first exercise, but that’s no sane way to build an image for re-use later. Much better is to use a Dockerfile to build it for you. Take a look at Dockerfile.01, which should have these contents:

#
# Comments start with an octothorpe, as you might expect
#
# Specify the 'base image'
FROM ubuntu:latest

#
# Naming the maintainer is good practice
LABEL Author="Your Name" Email="your@email.address"

#
# The 'LABEL' directive takes arbitrary key=value pairs
LABEL Description="This is my personal flavor of Ubuntu" Vendor="Your Name" Version="1.0"

#
# Now tell ubuntu to update itself
RUN apt-get update -y

You can have multiple RUN commands, though you should read the Best practices section first for advice on that.

You tell docker to build an image with that dockerfile by using the docker build command. We’ll give it a tag with the --tag option, and we tell it which dockerfile to use with the --file option.

We also have to give it a context to build from, so we give it the current directory (.). If we add files, their paths are taken relative to that context. The context can also be a URL; see https://docs.docker.com/engine/reference/commandline/build/ for full details.

# N.B. This assumes you have the USER environment variable set in your environment
> docker build --tag $USER:ubuntu --file Dockerfile.01 .
Sending build context to Docker daemon 72.19 kB
Step 1 : FROM ubuntu:latest
 ---> 4ca3a192ff2a
Step 2 : MAINTAINER Your Name "your@email.address"
 ---> Running in 051314cdc3ec
 ---> ea59cb99c816
Removing intermediate container 051314cdc3ec
Step 3 : LABEL Description "This is my personal flavor of Ubuntu" Vendor "Your Name" Version "1.0"
 ---> Running in 099b516c4bdf
 ---> 241e336f1ef1
Removing intermediate container 099b516c4bdf
Step 4 : RUN apt-get update -y
 ---> Running in 5ec72101d67b
Get:1 http://archive.ubuntu.com/ubuntu xenial InRelease [247 kB]
Get:2 http://archive.ubuntu.com/ubuntu xenial-updates InRelease [102 kB]
Get:3 http://archive.ubuntu.com/ubuntu xenial-security InRelease [102 kB]
Get:4 http://archive.ubuntu.com/ubuntu xenial/main Sources [1103 kB]
Get:5 http://archive.ubuntu.com/ubuntu xenial/restricted Sources [5179 B]
Get:6 http://archive.ubuntu.com/ubuntu xenial/universe Sources [9802 kB]
Get:7 http://archive.ubuntu.com/ubuntu xenial/main amd64 Packages [1558 kB]
Get:8 http://archive.ubuntu.com/ubuntu xenial/restricted amd64 Packages [14.1 kB]
Get:9 http://archive.ubuntu.com/ubuntu xenial/universe amd64 Packages [9827 kB]
Get:10 http://archive.ubuntu.com/ubuntu xenial-updates/main Sources [261 kB]
Get:11 http://archive.ubuntu.com/ubuntu xenial-updates/restricted Sources [1872 B]
Get:12 http://archive.ubuntu.com/ubuntu xenial-updates/universe Sources [137 kB]
Get:13 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages [548 kB]
Get:14 http://archive.ubuntu.com/ubuntu xenial-updates/restricted amd64 Packages [11.7 kB]
Get:15 http://archive.ubuntu.com/ubuntu xenial-updates/universe amd64 Packages [459 kB]
Get:16 http://archive.ubuntu.com/ubuntu xenial-security/main Sources [60.7 kB]
Get:17 http://archive.ubuntu.com/ubuntu xenial-security/restricted Sources [1872 B]
Get:18 http://archive.ubuntu.com/ubuntu xenial-security/universe Sources [15.8 kB]
Get:19 http://archive.ubuntu.com/ubuntu xenial-security/main amd64 Packages [225 kB]
Get:20 http://archive.ubuntu.com/ubuntu xenial-security/restricted amd64 Packages [11.7 kB]
Get:21 http://archive.ubuntu.com/ubuntu xenial-security/universe amd64 Packages [76.9 kB]
Fetched 24.6 MB in 14s (1721 kB/s)
Reading package lists...
 ---> 312bd6b10add
Removing intermediate container 5ec72101d67b
Successfully built 312bd6b10add
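
If USER is unset, or contains upper-case letters, docker rejects the --tag argument, because repository names must be lower case and cannot be empty. A minimal sketch of a helper that guards against both cases; the function name image_tag is hypothetical and not part of the tutorial scripts:

```shell
# Hypothetical helper: print "<user>:<name>" for use with 'docker build --tag'.
# Falls back to 'id -un' when USER is unset or empty, and lower-cases the
# user name, because Docker repository names must be lower case.
image_tag() {
  user=$(printf '%s' "${USER:-$(id -un)}" | tr '[:upper:]' '[:lower:]')
  printf '%s:%s\n' "$user" "$1"
}
```

It could then be used as: docker build --tag "$(image_tag ubuntu)" --file Dockerfile.01 .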

Now, you can see your image with the docker images command:

> docker images
REPOSITORY          TAG                 IMAGE ID            CREATED              SIZE
wildish             ubuntu              312bd6b10add        About a minute ago   167.6 MB
ubuntu              latest              4ca3a192ff2a        25 hours ago         128.2 MB

Our new image is there, and it’s about 40 MB bigger than the image we started from, because of the package lists that apt-get update downloaded.
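
To see which build step accounts for the extra size, docker history lists the layers of an image together with the instruction that created each one. A small sketch, assuming the image built above; the function name layer_sizes is made up for this example:

```shell
# Hypothetical helper: show the size of each layer of an image and the
# instruction that created it, using docker history's --format option.
layer_sizes() {
  docker history --format '{{.Size}} {{.CreatedBy}}' "$1"
}
```

Calling layer_sizes $USER:ubuntu should show one line per layer, with the RUN apt-get update -y layer among them.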

We can now run that image and check that it really is up to date by running the update again; there should be nothing new to fetch:

> docker run -t -i $USER:ubuntu /bin/bash
root@4989d23e6e8b:/# apt-get update -y
Hit:1 http://archive.ubuntu.com/ubuntu xenial InRelease
Hit:2 http://archive.ubuntu.com/ubuntu xenial-updates InRelease
Hit:3 http://archive.ubuntu.com/ubuntu xenial-security InRelease
Reading package lists... Done
root@4989d23e6e8b:/# exit

As expected, there’s nothing new to apply.

Inspecting an image to find out how it was built

A brief aside: if you want to find out how an image was built, you can use the docker inspect command. It gives full details as a JSON document, more than you’d normally want to know, but we can at least use it to get back the Author and the other LABEL values we added:

> docker inspect $USER:ubuntu | grep --after-context=6 Labels
            "Labels": {
                "Author": "Your Name",
                "Description": "This is my personal flavor of Ubuntu",
                "Email": "your@email.address",
                "Vendor": "Your Name",
                "Version": "1.0"
            }
--
            "Labels": {
                "Author": "Your Name",
                "Description": "This is my personal flavor of Ubuntu",
                "Email": "your@email.address",
                "Vendor": "Your Name",
                "Version": "1.0"
            }

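Why do the labels we specified appear twice? On Docker versions that show this, the JSON for an image contains two configuration sections, Config and ContainerConfig, and each carries its own copy of the Labels, so grep matches both.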
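
Rather than grepping the JSON, you can pass a Go template to the --format option of docker inspect and pull out a single value. A minimal sketch, assuming the labels are read from .Config.Labels, which is where docker inspect reports them for an image; image_label is a hypothetical name:

```shell
# Hypothetical helper: print the value of one label of an image.
#   $1 = image name, e.g. "$USER:ubuntu"
#   $2 = label key,  e.g. Author
image_label() {
  docker inspect --format "{{ index .Config.Labels \"$2\" }}" "$1"
}
```

For example, image_label $USER:ubuntu Author prints only the Author label, and prints it once.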

Adding our own programs to the image

There’s a sample Perl script, hello-user.pl, in your working directory. Please take the time to make sure you understand how it works before proceeding.

Let’s tell docker to add that script to the image, so we can run it as an application.

We’ll use Dockerfile.02, which has the following content:

FROM ubuntu:latest
LABEL Author="Your Name" Email="your@email.address"
LABEL Description="This is my personal flavor of Ubuntu" Vendor="Your Name" Version="1.0"

RUN apt-get update -y

#
# Set an environment variable in the container
ENV MY_NAME Tony

#
# Add our perl script
ADD hello-user.pl /app/hello.pl

You can see that we’ve set an environment variable in our image (MY_NAME) and we’ve added our script as /app/hello.pl. You can have as many ENV and ADD sections as you like, though as with the RUN section, it’s worth learning about the best practices before adding too many.
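
For a plain local file such as our script, the official best-practices guide recommends COPY rather than ADD, and the Docker documentation prefers the key=value form of ENV. A hedged sketch of an equivalent Dockerfile with those two changes, written out from the shell; the file name Dockerfile.02.copy is hypothetical and is not in the repository:

```shell
# Write a variant of Dockerfile.02 to a new, hypothetical file name.
cat > Dockerfile.02.copy <<'EOF'
FROM ubuntu:latest
LABEL Author="Your Name" Email="your@email.address"

RUN apt-get update -y

# key=value form of ENV
ENV MY_NAME=Tony

# COPY only copies files from the build context;
# ADD can also fetch URLs and unpack local tar archives
COPY hello-user.pl /app/hello.pl
EOF
```

You would build it in the same way as before, passing --file Dockerfile.02.copy to docker build.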

Now build the image:

> docker build --tag $USER:ubuntu --file Dockerfile.02 .
Sending build context to Docker daemon  95.74kB
Step 1/6 : FROM ubuntu:latest
 ---> 7698f282e524
Step 2/6 : LABEL Author="Your Name" Email="your@email.address"
 ---> Using cache
 ---> 4da140dc87fa
Step 3/6 : LABEL Description="This is my personal flavor of Ubuntu" Vendor="Your Name" Version="1.0"
 ---> Using cache
 ---> a6f0cc9d1234
Step 4/6 : RUN apt-get update -y
 ---> Using cache
 ---> 2f162cdbcc1e
Step 5/6 : ENV MY_NAME Tony
 ---> Running in b166b73c2eb0
Removing intermediate container b166b73c2eb0
 ---> 4d2ba043c256
Step 6/6 : ADD hello-user.pl /app/hello.pl
 ---> d83241a70a07
Successfully built d83241a70a07
Successfully tagged wildish:ubuntu

Note steps 2 through 4, which report Using cache: the cache saved time building the image, because we didn’t have to build the entire image from scratch and apply the updates again.

We’ve re-used the tag ($USER:ubuntu), so this version will replace the old one. That’s not a good idea if the image is already in use in production, of course!
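
One way to keep the previous version available is to give it a second tag before you rebuild. A small sketch using docker tag; the function name keep_previous and the date-based suffix are assumptions made for this example, not a convention from the tutorial:

```shell
# Hypothetical helper: add a dated tag to an existing image, so that it stays
# reachable after the original tag is moved to a newer build.
# The "-YYYYMMDD" suffix is just one possible naming scheme.
keep_previous() {
  docker tag "$1" "$1-$(date +%Y%m%d)"
}
```

After keep_previous $USER:ubuntu, rebuilding with --tag $USER:ubuntu moves that tag to the new image, while the old image keeps its dated tag.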

Now let’s run the app in the image:

> docker run -t -i --rm $USER:ubuntu /app/hello.pl
Hello Tony

What happens if we update our script? Will docker be smart enough to pick up the changes? Yes, up to a point.

Let’s start by copying a new version of the script in place, and re-build the image:

> cp hello-user-with-args.pl hello-user.pl 
> docker build --tag $USER:ubuntu --file Dockerfile.02 .
Sending build context to Docker daemon 81.92 kB
Step 1 : FROM ubuntu:latest
 ---> 4ca3a192ff2a
Step 2 : MAINTAINER Your Name "your@email.address"
 ---> Using cache
 ---> ea59cb99c816
Step 3 : LABEL Description "This is my personal flavor of Ubuntu" Vendor "Your Name" Version "1.0"
 ---> Using cache
 ---> 241e336f1ef1
Step 4 : RUN apt-get update -y
 ---> Using cache
 ---> 312bd6b10add
Step 5 : ENV MY_NAME Tony
 ---> Using cache
 ---> 0857feeb7bb0
Step 6 : ADD hello-user.pl /app/hello.pl
 ---> ae442bdee840
Removing intermediate container 5fe5c7d58e0d
Successfully built ae442bdee840

Step 6 didn’t use the cache, because docker noticed the script had been updated: for ADD, the cache check compares a checksum of the file, whereas for RUN it only compares the text of the command. So if the script itself hadn’t changed, but modules or libraries that it uses had changed, docker wouldn’t be able to pick that up on its own. Put differently, the build process can’t ‘see through’ commands like apt-get update -y to know that there are changes since it was last run.

In case you want to, you can force a re-build from the start by telling docker not to use the cache:

> docker build --no-cache --tag $USER:ubuntu --file Dockerfile.02 .
[...]
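
Note that --no-cache re-runs every instruction, but the base image named in FROM is still taken from your local image store if a copy is there. Adding --pull asks docker to check for a newer base image as well. A sketch combining the two; rebuild_fresh is a hypothetical name:

```shell
# Hypothetical helper: rebuild an image from scratch and refresh its base image.
#   $1 = tag for the new image, $2 = Dockerfile to use
# --no-cache ignores cached layers; --pull makes docker try to pull a newer
# copy of the image named in FROM before building.
rebuild_fresh() {
  docker build --pull --no-cache --tag "$1" --file "$2" .
}
```

For example: rebuild_fresh $USER:ubuntu Dockerfile.02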

Passing arguments to an application in an image

Can we change who it says hello to? Yes, we can! We can set environment variables in the container before the application runs by using the --env flag with docker run:

> docker run -t -i --rm --env MY_NAME=Whoever $USER:ubuntu /app/hello.pl
Hello Whoever

The new version uses the environment variable MY_NAME by default, as before, but also allows you to override it with a command-line argument. To do that, simply append the argument to the end of the docker run command:

> docker run --rm -ti $USER:ubuntu /app/hello.pl someone
Hello someone
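
The precedence described here (command-line argument first, then MY_NAME) can be sketched in a few lines of shell. This is a stand-in written for illustration only, not the contents of hello-user-with-args.pl, and the function name hello is made up:

```shell
# Illustrative stand-in for the greeting logic described above:
# use the first argument if one is given, otherwise fall back to MY_NAME.
hello() {
  name="${1:-${MY_NAME:-}}"
  printf 'Hello %s\n' "$name"
}
```

An argument wins over the environment variable, which is why the --env setting is only a default.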

Running an application by default

Finally, let’s try getting our application to run by default, so we don’t have to remember the path to it whenever we want to run it. Dockerfile.03 shows how to do that:

FROM ubuntu:latest
LABEL Author="Your Name" Email="your@email.address"

#
# The 'LABEL' directive takes arbitrary key=value pairs
LABEL Description="This is my personal flavor of Ubuntu" Vendor="Your Name" Version="1.0"

#
# Now tell ubuntu to update itself
RUN apt-get update -y

#
# Set an environment variable in the container
ENV MY_NAME Tony
ADD hello-user.pl /app/hello.pl

#
# Specify the command to run!
CMD /app/hello.pl

So, build it, then run it:

> docker build --tag $USER:ubuntu --file Dockerfile.03 .
[...]
> docker run --rm -ti $USER:ubuntu
Hello Tony
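
With CMD, anything you put after the image name on the docker run line replaces the default command, so you can no longer pass arguments to the script on their own. If you want both a default program and the ability to pass arguments, ENTRYPOINT in exec form is the usual alternative. A sketch under that assumption; Dockerfile.03.entrypoint is a hypothetical file name, not part of the repository:

```shell
# Write a variant of Dockerfile.03 that uses ENTRYPOINT instead of CMD.
cat > Dockerfile.03.entrypoint <<'EOF'
FROM ubuntu:latest
LABEL Author="Your Name" Email="your@email.address"

RUN apt-get update -y

ENV MY_NAME=Tony
COPY hello-user.pl /app/hello.pl

# Exec form: arguments given after the image name on 'docker run'
# are appended to this command instead of replacing it
ENTRYPOINT ["/app/hello.pl"]
EOF
```

After building with --file Dockerfile.03.entrypoint, a command such as docker run --rm -ti $USER:ubuntu someone passes someone to the script as its first argument.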

Optimizing builds

We saw that docker build --no-cache ... solves the problem of docker not knowing if something was updated, but doing everything from scratch can be expensive. One solution is to build intermediate images, and move the more stable parts into the earlier images. Take a look at Dockerfile.04.base and Dockerfile.04.app; they’re just Dockerfile.03 split into two parts:

> cat Dockerfile.04.base 
FROM ubuntu:latest
LABEL Author="Your Name" Email="your@email.address"

RUN apt-get update -y

> cat Dockerfile.04.app 
FROM wildish:ubuntu

ENV MY_NAME Tony
ADD hello-user.pl /app/hello.pl

CMD /app/hello.pl

Dockerfile.04.base builds an updated ubuntu image, while Dockerfile.04.app uses that image as its base. As long as we tag the base image as $USER:ubuntu, and refer to it correctly in the FROM statement for the app, the app build will find it. The shell’s USER variable is not expanded inside the FROM statement, so the user name is hard-coded there. Change it to your own user name before building the image.

Note also that Dockerfile.04.app doesn’t have any LABEL instructions, which means it will inherit the labels from the base image.
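
The hard-coded user name in the FROM line can be avoided with a build argument: Docker 17.05 and later accept an ARG instruction before FROM, and its value can be supplied with --build-arg. A hedged sketch; the file name Dockerfile.04.app.arg, the argument name BASE_IMAGE and the function build_app are all hypothetical:

```shell
# Write a variant of Dockerfile.04.app whose base image is a build argument.
cat > Dockerfile.04.app.arg <<'EOF'
# BASE_IMAGE can be overridden with --build-arg; the default is the
# hard-coded name used in the tutorial
ARG BASE_IMAGE=wildish:ubuntu
FROM ${BASE_IMAGE}

ENV MY_NAME=Tony
COPY hello-user.pl /app/hello.pl

CMD ["/app/hello.pl"]
EOF

# Build the app on top of the current user's base image.
build_app() {
  docker build --build-arg BASE_IMAGE="$USER:ubuntu" \
    --tag "$USER:hello" --file Dockerfile.04.app.arg .
}
```

Running build_app after building the base image gives the same result as editing the FROM line by hand.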

Now we can build our app in two stages:

> docker build --tag $USER:ubuntu --file Dockerfile.04.base .
Sending build context to Docker daemon 86.53 kB
Step 1 : FROM ubuntu:latest
 ---> 4ca3a192ff2a
Step 2 : MAINTAINER Your Name "your@email.address"
 ---> Using cache
 ---> 223050aea37e
Step 3 : LABEL Description "This is my personal flavor of Ubuntu" Vendor "Your Name" Version "1.0"
 ---> Using cache
 ---> c03ba3b7afd5
Step 4 : RUN apt-get update -y
 ---> Using cache
 ---> 00269c0edb02
Successfully built 00269c0edb02

> docker build --tag $USER:hello --file Dockerfile.04.app .
Sending build context to Docker daemon 86.53 kB
Step 1 : FROM wildish:ubuntu
 ---> 00269c0edb02
Step 2 : ENV MY_NAME Tony
 ---> Using cache
 ---> 0fa5ba428fe0
Step 3 : ADD hello-user.pl /app/hello.pl
 ---> Using cache
 ---> 704d3c5941c6
Step 4 : CMD /app/hello.pl
 ---> Using cache
 ---> ce37fdc3bd4e
Successfully built ce37fdc3bd4e

> docker run --rm -ti $USER:hello
Hello Tony

If we force a rebuild of the app, it’s very quick now, because it doesn’t have to update the base ubuntu operating system.

Docker now supports multi-stage builds, which formalise this multi-step build approach within a single Dockerfile. See https://docs.docker.com/develop/develop-images/multistage-build/ for more details, as well as exercise 5.
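
As a rough preview, the base/app split above could be written as two named stages in one Dockerfile, with the --target option of docker build selecting the stage to stop at. This is only a sketch of the idea, not the approach taken in exercise 5; the file name Dockerfile.multistage.example, the stage names and the function build_stage are hypothetical:

```shell
# Write a two-stage Dockerfile: a stable base stage and a volatile app stage.
cat > Dockerfile.multistage.example <<'EOF'
# Stage 1: the stable part
FROM ubuntu:latest AS base
RUN apt-get update -y

# Stage 2: the volatile part, built on top of stage 1
FROM base AS app
ENV MY_NAME=Tony
COPY hello-user.pl /app/hello.pl
CMD ["/app/hello.pl"]
EOF

# Build up to the named stage and tag the result as <user>:<stage>.
build_stage() {
  docker build --target "$1" --tag "$USER:$1" \
    --file Dockerfile.multistage.example .
}
```

build_stage base builds only the first stage, while build_stage app builds both.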

Conclusion

You can now build your own images, starting from a base image, updating it, adding files, specifying environment variables, and specifying the default executable to run.

You know how to tag your images, so they have a meaningful name, and you know how to specify useful metadata that you can retrieve programmatically.

Best practices

  • avoid building big images, start from the lightest base you can manage and only add what you really need
  • move any stable, heavy parts of your build early in the Dockerfile, to maximize the benefit of the cache
  • consider using intermediate builds, to further isolate stable parts from volatile parts if you need to force builds
  • follow the official best-practices guide, at https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/