## Exercise 3: Creating your own docker image ##

### Objective ###

Learn how to create a docker image that you can use later.

### Access to the tutorial material ###

You can create the dockerfiles in this exercise by cutting and pasting from the screen, or, if you've cloned the repository, you will find them already in the **tsi-cc/ResOps/scripts/docker/** subdirectory:

```
# Clone this documentation if you haven't already done so
> git clone https://gitlab.ebi.ac.uk/TSI/tsi-ccdoc.git
> cd tsi-ccdoc/tsi-cc/ResOps/scripts/docker/
```

### Creating an image, step by step ###

You can modify an image interactively, as we saw in the first exercise, but that's no sane way to build an image for re-use later. Much better is to use a **Dockerfile** to build it for you. Take a look at **Dockerfile.01**, which should have these contents:

```
#
# Comments start with an octothorpe, as you might expect
#
# Specify the 'base image'
FROM ubuntu:latest
#
# Naming the maintainer is good practice
LABEL Author="Your Name" Email="your@email.address"
#
# The 'LABEL' directive takes arbitrary key=value pairs
LABEL Description="This is my personal flavor of Ubuntu" Vendor="Your Name" Version="1.0"
#
# Now tell ubuntu to update itself
RUN apt-get update -y
```

You can have multiple **RUN** commands, though you should check out the **Best practices** for a comment about that.

You tell docker to build an image with that dockerfile by using the **docker build** command. We'll give it a **tag** with the `--tag` option, and we tell it which dockerfile to use with the `--file` option. We also have to give it a _context_ to build from, so we give it the current directory, `.`. If we add files, they will be taken relative to that context. The context can also be a URL; see [https://docs.docker.com/engine/reference/commandline/build/](https://docs.docker.com/engine/reference/commandline/build/) for full details.

The sample output below appears to have been captured with an earlier version of the Dockerfile that used the now-deprecated **MAINTAINER** instruction, so step 2 of your own build will show the **LABEL** line instead.

```
# N.B.
# This assumes you have the USER environment variable set in your environment
> docker build --tag $USER:ubuntu --file Dockerfile.01 .
Sending build context to Docker daemon 72.19 kB
Step 1 : FROM ubuntu:latest
 ---> 4ca3a192ff2a
Step 2 : MAINTAINER Your Name "your@email.address"
 ---> Running in 051314cdc3ec
 ---> ea59cb99c816
Removing intermediate container 051314cdc3ec
Step 3 : LABEL Description "This is my personal flavor of Ubuntu" Vendor "Your Name" Version "1.0"
 ---> Running in 099b516c4bdf
 ---> 241e336f1ef1
Removing intermediate container 099b516c4bdf
Step 4 : RUN apt-get update -y
 ---> Running in 5ec72101d67b
Get:1 http://archive.ubuntu.com/ubuntu xenial InRelease [247 kB]
Get:2 http://archive.ubuntu.com/ubuntu xenial-updates InRelease [102 kB]
Get:3 http://archive.ubuntu.com/ubuntu xenial-security InRelease [102 kB]
Get:4 http://archive.ubuntu.com/ubuntu xenial/main Sources [1103 kB]
Get:5 http://archive.ubuntu.com/ubuntu xenial/restricted Sources [5179 B]
Get:6 http://archive.ubuntu.com/ubuntu xenial/universe Sources [9802 kB]
Get:7 http://archive.ubuntu.com/ubuntu xenial/main amd64 Packages [1558 kB]
Get:8 http://archive.ubuntu.com/ubuntu xenial/restricted amd64 Packages [14.1 kB]
Get:9 http://archive.ubuntu.com/ubuntu xenial/universe amd64 Packages [9827 kB]
Get:10 http://archive.ubuntu.com/ubuntu xenial-updates/main Sources [261 kB]
Get:11 http://archive.ubuntu.com/ubuntu xenial-updates/restricted Sources [1872 B]
Get:12 http://archive.ubuntu.com/ubuntu xenial-updates/universe Sources [137 kB]
Get:13 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages [548 kB]
Get:14 http://archive.ubuntu.com/ubuntu xenial-updates/restricted amd64 Packages [11.7 kB]
Get:15 http://archive.ubuntu.com/ubuntu xenial-updates/universe amd64 Packages [459 kB]
Get:16 http://archive.ubuntu.com/ubuntu xenial-security/main Sources [60.7 kB]
Get:17 http://archive.ubuntu.com/ubuntu xenial-security/restricted Sources [1872 B]
Get:18 http://archive.ubuntu.com/ubuntu xenial-security/universe Sources [15.8 kB]
Get:19 http://archive.ubuntu.com/ubuntu xenial-security/main amd64 Packages [225 kB]
Get:20 http://archive.ubuntu.com/ubuntu xenial-security/restricted amd64 Packages [11.7 kB]
Get:21 http://archive.ubuntu.com/ubuntu xenial-security/universe amd64 Packages [76.9 kB]
Fetched 24.6 MB in 14s (1721 kB/s)
Reading package lists...
 ---> 312bd6b10add
Removing intermediate container 5ec72101d67b
Successfully built 312bd6b10add
```

Now, you can see your image with the **docker images** command:

```
> docker images
REPOSITORY          TAG                 IMAGE ID            CREATED              SIZE
wildish             ubuntu              312bd6b10add        About a minute ago   167.6 MB
ubuntu              latest              4ca3a192ff2a        25 hours ago         128.2 MB
```

Our new image is there, and it's about 40 MB bigger than the image we started from, because of the package lists that `apt-get update` downloaded. We can now run that image and check that the lists really are up to date by running the update again; there should be nothing new to fetch:

```
> docker run -t -i $USER:ubuntu /bin/bash
root@4989d23e6e8b:/# apt-get update -y
Hit:1 http://archive.ubuntu.com/ubuntu xenial InRelease
Hit:2 http://archive.ubuntu.com/ubuntu xenial-updates InRelease
Hit:3 http://archive.ubuntu.com/ubuntu xenial-security InRelease
Reading package lists... Done
root@4989d23e6e8b:/# exit
```

As expected, there's nothing new to fetch.

### Inspecting an image to find out how it was built ###

A brief aside: if you want to find out how an image was built, you can use the **docker inspect** command.
It gives full details as a JSON document, more than you'd normally want to know, but we can at least use it to get back the **LABEL** values we added:

```
> docker inspect $USER:ubuntu | grep --after-context=6 Labels
            "Labels": {
                "Author": "Your Name",
                "Description": "This is my personal flavor of Ubuntu",
                "Email": "your@email.address",
                "Vendor": "Your Name",
                "Version": "1.0"
            }
--
            "Labels": {
                "Author": "Your Name",
                "Description": "This is my personal flavor of Ubuntu",
                "Email": "your@email.address",
                "Vendor": "Your Name",
                "Version": "1.0"
            }
```

Why do the labels we specified appear twice? Most likely because the JSON document contains two configuration sections, **ContainerConfig** and **Config**, each with its own **Labels** entry, and `grep` matches both.

### Adding our own programs to the image ###

There's a sample Perl script, **hello-user.pl**, in your working directory. Please take the time to make sure you understand how it works before proceeding.

Let's tell docker to add that script to the image, so we can run it as an application. We'll use **Dockerfile.02**, which has the following content:

```
FROM ubuntu:latest
LABEL Author="Your Name" Email="your@email.address"
RUN apt-get update -y
#
# Set an environment variable in the container
ENV MY_NAME Tony
#
# Add our perl script
ADD hello-user.pl /app/hello.pl
```

You can see that we've set an environment variable in our image (**MY_NAME**) and we've added our script as **/app/hello.pl**. You can have as many **ENV** and **ADD** sections as you like, though as with the **RUN** section, it's worth learning about the best practices before adding too many.

Now build the image:

```
> docker build --tag $USER:ubuntu --file Dockerfile.02 .
Sending build context to Docker daemon  95.74kB
Step 1/6 : FROM ubuntu:latest
 ---> 7698f282e524
Step 2/6 : LABEL Author="Your Name" Email="your@email.address"
 ---> Using cache
 ---> 4da140dc87fa
Step 3/6 : LABEL Description="This is my personal flavor of Ubuntu" Vendor="Your Name" Version="1.0"
 ---> Using cache
 ---> a6f0cc9d1234
Step 4/6 : RUN apt-get update -y
 ---> Using cache
 ---> 2f162cdbcc1e
Step 5/6 : ENV MY_NAME Tony
 ---> Running in b166b73c2eb0
Removing intermediate container b166b73c2eb0
 ---> 4d2ba043c256
Step 6/6 : ADD hello-user.pl /app/hello.pl
 ---> d83241a70a07
Successfully built d83241a70a07
Successfully tagged wildish:ubuntu
```

Note steps 2 through 4, marked 'Using cache', where the cache was used to save time building the image. That is, we didn't have to build the entire image from scratch and run the update again.

We've re-used the tag (`$USER:ubuntu`), so this version will replace the old one. That's not a good idea if the image is already in use in production, of course!

Now let's run the app in the image:

```
> docker run -t -i --rm $USER:ubuntu /app/hello.pl
Hello Tony
```

What happens if we update our script? Will docker be smart enough to pick up the changes? Yes, up to a point. Let's start by copying a new version of the script in place, and re-build the image:

```
> cp hello-user-with-args.pl hello-user.pl
> docker build --tag $USER:ubuntu --file Dockerfile.02 .
Sending build context to Docker daemon 81.92 kB
Step 1 : FROM ubuntu:latest
 ---> 4ca3a192ff2a
Step 2 : MAINTAINER Your Name "your@email.address"
 ---> Using cache
 ---> ea59cb99c816
Step 3 : LABEL Description "This is my personal flavor of Ubuntu" Vendor "Your Name" Version "1.0"
 ---> Using cache
 ---> 241e336f1ef1
Step 4 : RUN apt-get update -y
 ---> Using cache
 ---> 312bd6b10add
Step 5 : ENV MY_NAME Tony
 ---> Using cache
 ---> 0857feeb7bb0
Step 6 : ADD hello-user.pl /app/hello.pl
 ---> ae442bdee840
Removing intermediate container 5fe5c7d58e0d
Successfully built ae442bdee840
```

Step 6 didn't use the cache, because docker noticed the script had been updated. However, if the script itself hadn't changed, but modules or libraries that it uses had changed, docker wouldn't be able to pick that up on its own. Put differently, the build process can't 'see through' commands like **apt-get update -y** to know that there are changes since it was last run.

In case you want to, you can force a re-build from the start by telling docker not to use the cache:

```
> docker build --no-cache --tag $USER:ubuntu --file Dockerfile.02 .
[...]
```

### Passing arguments to an application in an image ###

Can we change who it says hello to? Yes, we can! We can set environment variables in the container before the application runs by using the `--env` flag with **docker run**:

```
> docker run -t -i --rm --env MY_NAME=Whoever $USER:ubuntu /app/hello.pl
Hello Whoever
```

The new version of the script uses the environment variable **MY_NAME** by default, as before, but also allows you to override it with command-line arguments. To do that, simply append the arguments to the end of the **docker run** command:

```
> docker run --rm -ti $USER:ubuntu /app/hello.pl someone
Hello someone
```

### Running an application by default ###

Finally, let's try getting our application to run by default, so we don't have to remember the path to it whenever we want to run it.
**Dockerfile.03** shows how to do that:

```
FROM ubuntu:latest
LABEL Author="Your Name" Email="your@email.address"
#
# The 'LABEL' directive takes arbitrary key=value pairs
LABEL Description="This is my personal flavor of Ubuntu" Vendor="Your Name" Version="1.0"
#
# Now tell ubuntu to update itself
RUN apt-get update -y
#
# Set an environment variable in the container
ENV MY_NAME Tony
ADD hello-user.pl /app/hello.pl
#
# Specify the command to run!
CMD /app/hello.pl
```

So, build it, then run it:

```
> docker build --tag $USER:ubuntu --file Dockerfile.03 .
[...]
> docker run --rm -ti $USER:ubuntu
Hello Tony
```

### Optimizing builds ###

We saw that `docker build --no-cache ...` solves the problem of docker not knowing if something was updated, but doing _everything_ from scratch can be a bit expensive. The obvious solution is to build intermediate images, and move the more stable stuff into the earlier images.

Take a look at **Dockerfile.04.base** and **Dockerfile.04.app**; they're just **Dockerfile.03** split into two parts:

```
> cat Dockerfile.04.base
FROM ubuntu:latest
LABEL Author="Your Name" Email="your@email.address"
RUN apt-get update -y

> cat Dockerfile.04.app
FROM wildish:ubuntu
ENV MY_NAME Tony
ADD hello-user.pl /app/hello.pl
CMD /app/hello.pl
```

**Dockerfile.04.base** builds an updated ubuntu image, while **Dockerfile.04.app** uses _that_ image as its base. As long as we tag the base image as **$USER:ubuntu**, and refer to it correctly in the **FROM** statement for the app, the app will find it correctly.

Docker does not expand shell environment variables in the **FROM** statement, so we have to hard-code the user name there (a build argument declared with **ARG** before **FROM** is an alternative, not used here). Change it to your own user name before building the image.

Note also that **Dockerfile.04.app** doesn't have a **LABEL** section, which means it will inherit the labels from the base image.

Now we can build our app in two stages:

```
> docker build --tag $USER:ubuntu --file Dockerfile.04.base .
Sending build context to Docker daemon 86.53 kB
Step 1 : FROM ubuntu:latest
 ---> 4ca3a192ff2a
Step 2 : MAINTAINER Your Name "your@email.address"
 ---> Using cache
 ---> 223050aea37e
Step 3 : LABEL Description "This is my personal flavor of Ubuntu" Vendor "Your Name" Version "1.0"
 ---> Using cache
 ---> c03ba3b7afd5
Step 4 : RUN apt-get update -y
 ---> Using cache
 ---> 00269c0edb02
Successfully built 00269c0edb02

> docker build --tag $USER:hello --file Dockerfile.04.app .
Sending build context to Docker daemon 86.53 kB
Step 1 : FROM wildish:ubuntu
 ---> 00269c0edb02
Step 2 : ENV MY_NAME Tony
 ---> Using cache
 ---> 0fa5ba428fe0
Step 3 : ADD hello-user.pl /app/hello.pl
 ---> Using cache
 ---> 704d3c5941c6
Step 4 : CMD /app/hello.pl
 ---> Using cache
 ---> ce37fdc3bd4e
Successfully built ce37fdc3bd4e

> docker run --rm -ti $USER:hello
Hello Tony
```

If we force a rebuild of the app, it's very quick now, because it doesn't have to update the base ubuntu operating system.

Docker now supports multi-stage builds, which formalise this multi-step build approach. See [https://docs.docker.com/develop/develop-images/multistage-build/](https://docs.docker.com/develop/develop-images/multistage-build/) for more details, as well as exercise 5.

### Conclusion ###

You can now build your own images, starting from a base image, updating it, adding files, specifying environment variables, and specifying the default executable to run.

You know how to tag your images, so they have a meaningful name, and you know how to specify useful metadata that you can retrieve programmatically.
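
As a rough sketch of what retrieving that metadata programmatically can look like: **docker inspect** prints a JSON array, so any JSON parser can pull the labels out of the `Config.Labels` entry. The snippet below is illustrative only. It works on a hand-written, heavily abbreviated sample string shaped like the inspect output, and the helper name `image_labels` is hypothetical, not part of any docker tooling.

```python
import json

# Hand-written, heavily abbreviated stand-in for the output of
# `docker inspect $USER:ubuntu`. Real output has many more fields;
# only the label section is reproduced here (assumption for illustration).
SAMPLE_INSPECT_OUTPUT = """
[
  {
    "Config": {
      "Labels": {
        "Author": "Your Name",
        "Description": "This is my personal flavor of Ubuntu",
        "Email": "your@email.address",
        "Vendor": "Your Name",
        "Version": "1.0"
      }
    }
  }
]
"""


def image_labels(inspect_json: str) -> dict:
    """Return the labels of the first image in a `docker inspect` JSON document."""
    images = json.loads(inspect_json)   # docker inspect prints a JSON array
    config = images[0].get("Config") or {}
    return config.get("Labels") or {}   # Labels can be null when none are set


labels = image_labels(SAMPLE_INSPECT_OUTPUT)
for key in sorted(labels):
    print(f"{key}: {labels[key]}")
```

To use it against a real image, you could capture the output of `docker inspect` (for example with Python's `subprocess.run`) and pass the resulting text to `image_labels` instead of the sample string.
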
### Best practices ###

- avoid building big images, start from the lightest base you can manage and only add what you really need
- move any stable, heavy parts of your build early in the Dockerfile, to maximize the benefit of the cache
- consider using intermediate builds, to further isolate stable parts from volatile parts if you need to force builds
- follow the official best-practices guide, at [https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/](https://docs.docker.com/engine/userguide/eng-image/dockerfile_best-practices/)
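
The advice about putting stable parts early can be illustrated with a toy model of the build cache. This is not docker's actual implementation, just a simplified sketch that assumes each step's cache key depends on the previous step's key, the instruction text and, for **ADD**, the content of the added file. The function names `cache_keys` and `reused_steps` and the file contents are made up for the example.

```python
import hashlib


def cache_keys(steps, files):
    """Compute a chained cache key for each step (toy model).

    steps: list of Dockerfile instruction strings.
    files: mapping of file name -> content, consulted for ADD instructions.
    """
    keys = []
    parent = ""
    for step in steps:
        material = parent + "\n" + step
        words = step.split()
        if words[0] == "ADD":
            # Toy assumption: the added file's content is part of the key.
            material += "\n" + files[words[1]]
        parent = hashlib.sha256(material.encode()).hexdigest()
        keys.append(parent)
    return keys


def reused_steps(old_keys, new_keys):
    """Count how many leading steps could be served from the cache."""
    count = 0
    for old, new in zip(old_keys, new_keys):
        if old != new:
            break  # once one step misses, every later step misses too
        count += 1
    return count


stable_first = [
    "FROM ubuntu:latest",
    "RUN apt-get update -y",
    "ENV MY_NAME Tony",
    "ADD hello-user.pl /app/hello.pl",
]
volatile_first = [
    "FROM ubuntu:latest",
    "ADD hello-user.pl /app/hello.pl",
    "RUN apt-get update -y",
    "ENV MY_NAME Tony",
]

# Made-up script contents standing in for an edit to hello-user.pl
before = {"hello-user.pl": "print 'Hello';"}
after = {"hello-user.pl": "print 'Hello with args';"}

for name, steps in [("stable first", stable_first), ("volatile first", volatile_first)]:
    reused = reused_steps(cache_keys(steps, before), cache_keys(steps, after))
    print(f"{name}: {reused} of {len(steps)} steps reused after editing the script")
```

In this model, editing the script leaves three of four steps reusable when the **ADD** comes last, but only one when it comes right after **FROM**, because every step after the first cache miss has to be rebuilt.
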