Practical activity

Duration: 1h15

Virtualization – The Docker example

Introduction

Let’s try to go further with our digital environment…

Okay, virtualization is another IT buzzword, but not only that. In computing, many things can be virtualized: computers, operating systems, applications, networks, etc. The word can cover a lot of different things, so when you hear about virtualization, the first thing to ask yourself is: are we talking about the same thing? For us, it is a practical solution for running applications or computer systems in a controlled environment.

Virtualization can be motivated by several factors: reproducibility, development, testing, packaging, security. To work, software must be installed on an appropriate system, with all the libraries it depends on, in the right versions, fully configured, etc. So, the idea is to freeze all this (the application, the system it runs on, the libraries that go with it, the configuration, etc.) into something ready to use, usually called an “image”. Thus packaged, it can be easily distributed, tested under the same conditions, used to explore different variants without side effects, etc. The other idea is that these images are relatively self-contained: their I/O is managed by the virtualization system, which can then control the isolation of the images from each other and from the rest of the world. This provides a means of security.

A few vocabulary items:

  • The guest is the system or software to be virtualized.
  • The host is the real system which actually runs the virtualized system.
  • The hypervisor is the software which implements the virtualization solution. The host runs the hypervisor which runs the guest.

Wikipedia provides an overview of the different ways to proceed, depending on what the solution actually virtualizes.

For instance, some solutions virtualize the hardware of a computer (CPU, RAM, disk, display, Ethernet card, etc.), e.g. VirtualBox, VMware, QEMU, etc. This is rather heavy (a host, like a personal computer, usually can’t run more than one or two guests), but offers high versatility (guests can be very different from the host; each guest is a typical complete PC system, which can also be heavy on disk space). Some of these solutions offer offloading techniques to use the host’s hardware directly and thus gain in performance.

But here, in this session, we look at a lighter solution: containerization. The idea is to virtualize the operating system (or parts of it). This comes from the Linux ecosystem. Running applications are normally associated with an operating system context which references and controls the resources used (files, users, drivers, etc.). This mechanism has evolved into a means (named cgroups) of providing applications with a context possibly different from that of the host operating system. Thus was born the principle of containers. This virtualization solution is well suited to running one application or a set of applications (and not an entire system, even if that is possible), and it is very lightweight. We can mention solutions such as LXC, Docker, Podman, Singularity, etc.

Even though it’s very close to Linux, Docker can run on Windows and Mac, thanks to a few tricks.

Install Docker Desktop

Just a few words before we begin. Docker, Inc. is an American company which offers services (Docker Hub) and software (Docker Engine, Docker Desktop). According to the terms of use, their software can be installed and used free of charge for personal use and by students. Access to the services is also possible free of charge, but requires the creation of an account and the sharing of certain personal information. We won’t need it here.

  • Docker Hub: among other things, the cloud repository offered by the company with numerous ready-to-use container images.
  • Docker Engine: the historical solution dedicated to Linux hosts. The hypervisor is a system application (running with root permissions) responsible for running containers at the request of users. Note that this architecture is one of the criticized security points. (Under certain special circumstances, a user can escalate to root permissions.)
  • Docker Desktop: the new solution put forward by the company, capable of working on Windows, Mac and now Linux, thanks to a few tricks. On Windows, it uses the existing WSL2 layer (Windows Subsystem for Linux), a Windows virtualization solution for running Linux systems. On Mac, it uses QEMU, the versatile virtualization solution mentioned above, with many offloading techniques. And now on Linux, it also uses QEMU, and thus benefits from an extra layer of isolation, which improves security. As we can see, the idea is to stack things: one hardware-virtualized guest on the host, which runs several containers. This mixture of the two virtualization techniques mentioned in the introduction offers a homogeneous solution on different systems, while remaining quite efficient overall.

Please note that teachers have no financial interest in this company nor in any of the alternatives. We declare no conflict of interest. ;-) Docker was one of the very first solutions to do containers. Technical criticisms exist (some alternatives may be more secure or more efficient, while being more complex). Many resources about Docker are available on the web. Also note that some of the alternatives offer compatibility with Docker. This is why your teachers selected Docker for this session.

Before jumping too quickly on downloading Docker Desktop, please see the tips below, depending on your system:

Read the doc. The large amount of technical detail is certainly interesting, but the goal here is to learn how to do the following tasks:

  • As mentioned earlier, Docker Desktop needs WSL2 to be activated. It is a prerequisite. (Well, it can also use the old Hyper-V, but let’s forget that here.) WSL2 is probably already activated on your PC, but if not, retrieve and follow instructions.
  • WSL2 itself needs hardware virtualization to be enabled in the BIOS. It is also probably already activated on your PC. Otherwise, you will have to restart your PC, enter the BIOS menu at startup, find the corresponding option and activate it. This step depends on the vendor and model of your PC.
  • Then, retrieve how to download the Docker Desktop Installer.exe and execute it.
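Before installing, you can check the WSL2 prerequisite from PowerShell. A quick sketch (the exact output wording varies with your Windows version):

```shell
# Run these in PowerShell (no admin rights needed).
wsl --status            # shows the default WSL version; it should be 2
wsl --list --verbose    # lists installed Linux distributions and their WSL versions
```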

Once done, you can start the Docker Desktop (search on the menu).

For historical reasons, the process installs both the traditional Docker Engine and the new Docker Desktop. Simply put, Docker Engine runs inside the virtual machine configured by Docker Desktop.

(As a side effect, when you stop the Docker Desktop service on your PC, you fall back on the Docker Engine service running natively on the host system of your PC, as in the old days. The low-level Docker commands are the same in both cases: if Docker Desktop is running, you are talking to the container images managed inside it; if it is stopped, you are talking to the plain old container images managed outside. The commands always work, but you don’t manage the same containers.)

This can lead to confusion. It’s better to be warned.

Now, read the doc. The large amount of technical detail is certainly interesting, but the goal here is to learn how to do the following three tasks (the variants depend on your version of Linux):

  • Set up Docker’s package repository corresponding to the version of your Linux system (requires root permissions).
  • Download the Docker Desktop package, which (surprisingly) is currently outside the package repository and must be downloaded manually (root permissions are not required). Note where your browser stored the downloaded file.
  • Launch the installation of the downloaded Docker Desktop package using the appropriate command for your Linux version (requires root permissions). This should also install the necessary dependencies, e.g. the Docker Engine packages, QEMU, KVM and others.
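As a sketch, on a Debian/Ubuntu-based system the three steps above might look like the following (the repository setup commands and the exact .deb filename depend on your distribution and on the current Docker Desktop release):

```shell
# 1. Set up Docker's package repository (see the official documentation
#    for the exact commands matching your release), then refresh the index:
sudo apt-get update
# 2. Download the Docker Desktop .deb manually with your browser,
#    e.g. into ~/Downloads (adjust the filename to the version you got).
# 3. Install it; apt pulls in the dependencies (Docker Engine, QEMU, KVM, etc.):
sudo apt-get install ~/Downloads/docker-desktop-amd64.deb
```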

Once done, you can start the Docker Desktop (search on the Applications menu).

  1. The installation complains about some missing package dependencies: You probably missed the first step above. Check the documentation again and get instructions on how to set up something called a “repository” for “.deb” or “.rpm” packages for your Linux version. Re-run your installation tool.

  2. In certain (rare) circumstances, users must be manually added to the “docker” group to be authorized to run images. (Note that we’re talking here about “users” and “groups” according to Unix concepts.)
    $ sudo adduser $USER docker
    This step is mandatory with the Docker Engine solution, but usually not with the Docker Desktop solution. Just in case…
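To check whether your user belongs to the “docker” group, and to smoke-test the installation (you may need to log out and back in after adduser for the group change to take effect):

```shell
groups $USER                  # "docker" should appear in the list
docker run --rm hello-world   # downloads a tiny test image and prints a greeting
```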

Read the doc. The large amount of technical detail is certainly interesting, but the goal is to learn how to:

  • Retrieve the Docker Desktop installer corresponding to your Mac variant
  • Download it
  • Execute it

Those who are curious can look at how to do it on other systems…

Experiment with a container

Now that Docker is installed, let’s use it. Remember, the aim is to run certain applications, even complex ones, without having to install them natively on the host system, but in a container. For the purposes of this exercise, your teachers have chosen nginx, a web server.

Main steps of the proposed experiment:

  1. obtain a relevant Docker image
  2. prepare the necessary things to conduct the experiment (in our case, a few static web pages)
  3. run the selected image, with configuration to use the prepared stuff
  4. enjoy

Steps not proposed by the experiment:

  • play with command line tools
  • build a new docker image
  • compose several docker images to set up a containerized IT infrastructure
  • publish a new docker image
  • run images located on a distant server
  • etc.

Retrieve an image

Start Docker Desktop. At the very top there is a search bar; type “nginx”. You will get hundreds of results. There is an official Docker build simply named “nginx”, plus many others with various names. Many brave community members have created images incorporating this app, with variants. Some are rather outdated, others are actively maintained.

So, at the top of the list you will see “nginx”. This is the official Docker build. Click on this name to get information about this image and some guidance on how to use it.

Okay, you are convinced, click on Pull (top right button). (You will Run it later.)

Prepare stuff

nginx is a very versatile web server. For a simple test, one static HTML page is enough.

Prepare a subdirectory somewhere in your workspace on your PC. Choose a consistent location to organize your work throughout the training sessions… Remember the full path of this subdirectory. Let’s call it my_website for the following. (Choose another name if you prefer, but keep it in mind and adjust the following accordingly.)

Open your favorite text editor and create the file my_website/index.html with this piece of HTML. (That is, the file named index.html, placed in your subdirectory named my_website.)

<html>
<body>
<p>Hello world!</p>
</body>
</html>

Please don’t waste time coding fancy HTML! (Or do it later…)

Configure and run the container

On the left side of the Docker Desktop window, select Images from the menu. (Feel free to explore other menu options. But let’s continue with the Images.)

For now, there is only one image: nginx. Click the associated triangle symbol on the right (the Run button).

A pop-up window asks for some Optional settings. Expand this menu. We are primarily interested in network ports and volumes. (Other options can be explored another time.)

  • The nginx image exposes TCP port number 80. This is the standardized port for the good old HTTP service. (Don’t confuse it with HTTPS, port 443, which requires configuration and certificates to provide a layer of security; that is outside the scope of this session, despite its interest.)

    You need to choose which of the TCP ports on your PC should be mapped to TCP port 80 of the container. You can keep 80, the official number. However, since you’re doing an experiment and not deploying a real web server, it’s probably safer to choose an unofficial one. (Honestly, there’s no real risk here.) Why not 8888? Please also note that you cannot use a port already in use by another application.

    Fill out the form with the chosen port. (Keep it in mind!)

  • The nginx application offers various configuration points, in the form of files or directories that the application accesses inside the container. To handle this, Docker can be asked to export files from the host into the container’s file space.
    These are what Docker calls volumes. For our experiment, we want to export the subdirectory with the HTML pages we have prepared, so that they can be served by the nginx server.

    So, configure a volume exporting my_website (use its full path on the host) towards /usr/share/nginx/html (the location where nginx is configured to find its HTML pages).

Then click on Run.

Now, you get a new Container. It’s probably the only one in the list (see Containers in the menu of the Docker Desktop window). A container is an instance of an image, which means you can have multiple containers based on the same image, possibly with different configurations. You can Start / Stop / Delete a container.

For now, you should have one running nginx container.

In case you missed a step, you can delete a container and start again.

Note that an image is read-only, whereas a container is not (unless Docker is asked otherwise). Files are organized in stacked “layers”: a container has read-only layers coming from the image, plus a read-write layer where the running applications can store their files. Each time you create a new container from an image, a new read-write layer is created.
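You can observe this read-write layer from the command line: docker diff lists the files a container has added or changed on top of its image. Replace my_nginx below with your container’s name, as shown by docker ps:

```shell
docker ps              # find your container's name or ID
docker diff my_nginx   # files added (A), changed (C) or deleted (D)
                       # in the container's read-write layer
```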

Enjoy your container

At this stage you should have a running container with the nginx web server, serving the HTML page you wrote.

So, test it with your favorite web browser, at the URL http://localhost:8888/ (adjust the port number if you chose a different one).
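The same check can be done from the command line, assuming you mapped port 8888:

```shell
curl http://localhost:8888/   # should print the HTML of your index page
```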

Hooray!

Command line tools

Docker Desktop offers a fine graphical interface. However, it is rather classic to use Docker from the command line. So, open your favorite shell tool (Powershell, Unix shell, etc.) and guess what the following commands do:

$ docker pull nginx
$ docker images
$ docker run -d --name my_nginx -p 8888:80 -v "$PWD/my_website":/usr/share/nginx/html nginx
$ docker container ls
$ docker stop my_nginx

Build a new docker image

Building your own custom Docker image usually consists of:

  • select an existing image close to what you want to make
  • install a few additional applications in it, if needed
  • run some commands in it, if needed
  • copy or edit certain files in it, if needed
  • instruct Docker about the exposed network ports, if needed
  • instruct Docker about the entry point or command to launch when running the container

This takes the form of a “Dockerfile”, i.e. a file actually named Dockerfile (without any extension), with the corresponding commands. Take a look at the documentation for the list of keywords. You can also take a look at the nginx Dockerfile used in this session.

  1. So, let’s create a “Dockerfile” with:

    FROM nginx
    COPY my_website/index.html /usr/share/nginx/html/index.html

    Guess what each line does. Guess why there is no EXPOSE nor ENTRYPOINT.

  2. Then, run the following command in your shell. (Sorry, the Docker Desktop GUI can’t help you.)

    $ docker build -t webserver .

Now, Docker Desktop proposes a new image named “webserver”, which embeds its own HTML pages, and which only needs a port mapping to run. (Also, try docker images.)
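A sketch of running this new image from the command line (my_webserver is just a name we choose here; note that no volume is needed anymore, since the page is embedded in the image):

```shell
docker run -d --name my_webserver -p 8888:80 webserver
curl http://localhost:8888/   # should serve the copied index.html
```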

If you have a Docker account, you can share your creation on the Docker Hub. (See docker login; docker tag; docker push.) Otherwise, you can distribute the image file on your own. (See docker save / docker load.)
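A sketch of the save/load route, which goes through a plain tar archive you can copy to another machine by any means:

```shell
docker save webserver -o webserver.tar   # export the image to a tar archive
docker load -i webserver.tar             # re-import it (typically on another host)
```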