Getting Started with Docker

Overview

The container should be short-lived

Containers started from an image built with a Dockerfile should be as short-lived (ephemeral) as possible. "Short-lived" means that the container can be stopped and destroyed, and that creating and deploying a new container requires minimal setup and configuration effort.

Use a .dockerignore file

When building an image with a Dockerfile, it is best to place the Dockerfile in a new, empty directory and then add only the files needed to build the image. To make builds more efficient, you can create a .dockerignore file in that directory to specify files and directories to exclude. The pattern syntax of .dockerignore is similar to that of Git's .gitignore file.

Use multi-stage build

In Docker 17.05 and later, you can use multi-stage builds to reduce the size of the final image.
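
As a minimal sketch of a multi-stage build, the Dockerfile below compiles a program in one stage and copies only the resulting binary into a small final image. The Go toolchain image, the main.go source file, and the /app path are assumptions chosen for illustration; they do not come from the original text.

```dockerfile
# Stage 1: build the binary in an image that contains the full toolchain.
# (golang:1.22 and main.go are illustrative assumptions.)
FROM golang:1.22 AS builder
WORKDIR /src
COPY main.go .
# Disable cgo so the binary is statically linked and can run on Alpine.
RUN CGO_ENABLED=0 go build -o /app main.go

# Stage 2: start from a small base image and copy in only the build output.
FROM alpine:3.19
COPY --from=builder /app /app
CMD ["/app"]
```

The compiler and source files stay in the first stage, so the final image contains little more than the base image and the binary.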

Avoid installing unnecessary packages

To reduce complexity, dependencies, file size, and build time, avoid installing any unnecessary packages. For example, do not include a text editor in a database image.

A container runs only one process

Each container should run only one process. Decoupling applications into separate containers makes it easier to scale horizontally and to reuse containers. For example, a web application stack should consist of three containers: the web application, the database, and the cache.

If the containers depend on each other, you can connect them with a user-defined Docker network.

As few image layers as possible

You need to strike a balance between the readability of the Dockerfile (including its long-term maintainability) and the number of layers.

Sort multi-line arguments

When an instruction takes multiple lines of arguments (for example, a list of packages to install), sort them alphabetically. This helps you avoid listing the same package twice and makes the list easier to update. It also makes pull requests easier to read and review. Adding a space before each backslash is recommended as well, to improve readability.

The following example is from the buildpack-deps image:

RUN apt-get update && apt-get install -y \
    bzr \
    cvs \
    git \
    mercurial \
    subversion

Build cache

While building an image, Docker steps through the instructions in the Dockerfile and executes them in order. Before executing each instruction, Docker looks for a reusable image in its cache; if one exists, Docker uses it instead of creating a new one. If you do not want to use the cache at all, pass the --no-cache=true option to the docker build command.

However, if you do let Docker use the cache, you need to understand when it will and will not find a matching image. The basic rules are as follows:

  • Starting from a base image (specified by the FROM instruction), the next instruction is compared against all child images derived from that base image to check whether one of them was built with exactly the same instruction. If not, the cache is invalidated.
  • In most cases, simply comparing the Dockerfile instruction with that of the child image is sufficient. However, some instructions require more examination and explanation.
  • For the ADD and COPY instructions, the contents of the corresponding files in the image are also examined, and a checksum is calculated for each file. The last-modified and last-accessed times of the files are not included in these checksums. During the cache lookup, the checksums are compared with the checksums of the files in existing images. If anything in the files has changed, such as the contents or metadata, the cache is invalidated.
  • Apart from ADD and COPY, cache matching does not look at the files in the temporary container to determine a match. For example, when a RUN apt-get -y update instruction is processed, files in the container are updated, but Docker does not examine them. In this case, only the instruction string itself is used to match the cache.

Once the cache is invalidated, all subsequent Dockerfile instructions generate new images and the cache is no longer used.

Dockerfile instructions

The recommendations below describe the best way to write the various Dockerfile instructions.

FROM

Whenever possible, use current official repositories as the basis for your images. The Alpine image is recommended because it is tightly controlled and kept to a minimal size (currently under 5 MB) while still being a complete distribution.
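
A minimal sketch of building on the official Alpine image follows. The 3.19 tag and the curl package are illustrative assumptions, not recommendations from the original text.

```dockerfile
# Start from the official Alpine image (the 3.19 tag is an illustrative choice).
FROM alpine:3.19
# apk is Alpine's package manager; --no-cache avoids keeping the package index in the image.
RUN apk add --no-cache curl
```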

LABEL

You can add labels to an image to help organize images, record license information, and assist automation. For each label, add a line beginning with LABEL followed by one or more key-value pairs. The following examples show the various acceptable formats. The line beginning with # is a comment.

Note: If your string contains spaces, you must enclose it in quotation marks or escape the spaces. If the string itself contains quotation marks, those must be escaped as well.

# Set one or more individual labels
LABEL com.example.version="0.0.1-beta"
LABEL vendor="ACME Incorporated"
LABEL com.example.release-date="2015-02-12"
LABEL com.example.version.is-production=""

An image can have multiple labels, but it is recommended to combine multiple labels into a single LABEL instruction.

# Set multiple labels at once, using line-continuation characters to break long lines
LABEL vendor=ACME\ Incorporated \
      com.example.is-beta= \
      com.example.is-production="" \
      com.example.version="0.0.1-beta" \
      com.example.release-date="2015-02-12"

For the key-value pairs that labels accept, see Understanding object labels. For querying label information, see Managing labels on objects.

RUN

To keep the Dockerfile readable, understandable, and maintainable, split long or complex RUN instructions across multiple lines separated with backslashes.

apt-get

The most common use of the RUN instruction is to install packages with apt-get. Because RUN apt-get installs packages, there are several issues to watch out for.

Avoid RUN apt-get upgrade and dist-upgrade, because many of the "essential" packages in the base image cannot be upgraded inside an unprivileged container. If a package in the base image is out of date, contact its maintainers. If you know that a particular package, for example foo, needs to be upgraded, use apt-get install -y foo, which upgrades the foo package automatically.

Always combine RUN apt-get update and apt-get install in the same RUN statement, for example:

RUN apt-get update && apt-get install -y \
    package-bar \
    package-baz \
    package-foo

Placing apt-get update alone in a separate RUN statement causes caching problems and can make subsequent apt-get install instructions fail. For example, suppose you have the following Dockerfile:

FROM ubuntu:18.04
RUN apt-get update
RUN apt-get install -y curl

After the image is built, all of its layers are in the Docker cache. Suppose you later modify the apt-get install instruction to add a package:

FROM ubuntu:18.04
RUN apt-get update
RUN apt-get install -y curl nginx

Docker sees that the RUN apt-get update instruction is exactly the same as before. Therefore apt-get update is not executed; the previously cached image is used instead. Because apt-get update does not run, the following apt-get install may install outdated versions of curl and nginx.

Using RUN apt-get update && apt-get install -y ensures that your Dockerfile installs the latest package versions with no additional coding or manual intervention. This technique is called cache busting. You can also achieve cache busting by specifying a package version explicitly. This is known as version pinning, for example:

RUN apt-get update && apt-get install -y \
    package-bar \
    package-baz \
    package-foo=1.3.*

Version pinning forces the build to retrieve a specific version regardless of what is in the cache. This technique can also reduce failures caused by unexpected changes in required packages.

The following RUN instruction is a template that demonstrates all of the apt-get recommendations.

RUN apt-get update && apt-get install -y \
    aufs-tools \
    automake \
    build-essential \
    curl \
    dpkg-sig \
    libcap-dev \
    libsqlite3-dev \
    mercurial \
    reprepro \
    ruby1.9.1 \
    ruby1.9.1-dev \
    s3cmd=1.1.* \
 && rm -rf /var/lib/apt/lists/*

Here the s3cmd argument specifies version 1.1.*. If the image previously used an older version, specifying the new version busts the cache for apt-get update and ensures that the new version is installed.

In addition, cleaning up the apt cache by removing /var/lib/apt/lists reduces the image size. Because the RUN instruction begins with apt-get update, the package cache is always refreshed before apt-get install.

Note: The official Debian and Ubuntu images will automatically run apt-get clean, so there is no need to call apt-get clean explicitly.

CMD

The CMD instruction is used to run the software contained in your image, along with any arguments. CMD should almost always be used in the form CMD ["executable", "param1", "param2"...]. So if the image is meant to deploy a service (for example, Apache), you would run something like CMD ["apache2", "-DFOREGROUND"]. This form is recommended for any service-based image.

In most other cases, CMD should be given an interactive shell (bash, python, perl, and so on), for example CMD ["perl", "-de0"] or CMD ["php", "-a"]. Using this form means that when you run something like docker run -it python, you are dropped into a ready-to-use shell. CMD should rarely be used in the form CMD ["param", "param"] in conjunction with ENTRYPOINT, unless you and your users are already very familiar with how ENTRYPOINT works.
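
The exec form of CMD for an interactive image can be sketched as follows. The python:3.12 base image is an illustrative assumption.

```dockerfile
# python:3.12 is an illustrative base image.
FROM python:3.12
# Exec (JSON array) form: the interpreter is started directly, without a wrapping shell,
# so running the image with "docker run -it" opens an interactive Python prompt.
CMD ["python3"]
```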

EXPOSE

The EXPOSE instruction specifies the ports on which the container listens. You should therefore use the common, traditional port for your application. For example, an image providing an Apache web server should use EXPOSE 80, while an image providing MongoDB should use EXPOSE 27017.

For external access, users can run docker run with a flag that maps the specified port to a port of their choice.
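
A short sketch of EXPOSE together with run-time port mapping is shown below. The httpd:2.4 base image and the host port 8080 are illustrative assumptions.

```dockerfile
# httpd:2.4 is an illustrative Apache image; the server listens on port 80.
FROM httpd:2.4
# Document the port the service listens on. EXPOSE by itself does not publish the port.
EXPOSE 80
# To reach the service from the host, map the port at run time, for example:
#   docker run -p 8080:80 <image>
```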

ENV

To make new software easier to run, you can use ENV to update the PATH environment variable for the software installed in the container. For example, ENV PATH /usr/local/nginx/bin:$PATH ensures that CMD ["nginx"] works as expected.

The ENV instruction is also useful for providing the environment variables required by the services you want to containerize, such as PGDATA for Postgres.

Finally, ENV can also be used to set commonly used version numbers, as in the following example:

ENV PG_MAJOR 9.3
ENV PG_VERSION 9.3.4
RUN curl -SL http://example.com/postgres-$PG_VERSION.tar.xz | tar -xJC /usr/src/postgress && …
ENV PATH /usr/local/postgres-$PG_MAJOR/bin:$PATH

Much like constants in a program, this approach lets you change a single ENV instruction to automatically change the version of the software in the container.

ADD and COPY

Although ADD and COPY are functionally similar, COPY is generally preferred because it is more transparent than ADD. COPY only supports the basic copying of local files into the container, while ADD has some less obvious features (such as local tar extraction and remote URL support). Consequently, the best use of ADD is to automatically extract a local tar file into the image, for example ADD rootfs.tar.xz /.

If your Dockerfile has multiple steps that use different files from the build context, COPY the files individually rather than all at once. This ensures that the build cache of each step is invalidated only when the specific files it needs change. For example:

COPY requirements.txt /tmp/
RUN pip install --requirement /tmp/requirements.txt
COPY . /tmp/

If COPY . /tmp/ were placed before the RUN instruction, a change to any file in the . directory would invalidate the cache for the subsequent instructions.

To keep the image as small as possible, it is best not to use ADD to fetch packages from remote URLs; use curl or wget instead. That way you can delete the files you no longer need after they have been extracted, without adding an extra layer to the image. For example, avoid the following:

ADD http://example.com/big.tar.xz /usr/src/things/
RUN tar -xJf /usr/src/things/big.tar.xz -C /usr/src/things
RUN make -C /usr/src/things all

Instead, the following method should be used:

RUN mkdir -p /usr/src/things \
    && curl -SL http://example.com/big.tar.xz \
    | tar -xJC /usr/src/things \
    && make -C /usr/src/things all

The example above uses a pipe, so there is no intermediate file to delete.

For other files or directories that do not need ADD's automatic tar extraction, you should always use COPY.

ENTRYPOINT

The best use of ENTRYPOINT is to set the image's main command, which allows the image to be run as though it were that command (with CMD providing the default options).

For example, the following image provides the command-line tool s3cmd:

ENTRYPOINT ["s3cmd"]
CMD ["--help"]

Running the image directly now shows the command's help:

$ docker run s3cmd

Or you can pass the appropriate parameters to execute a command:

$ docker run s3cmd ls s3://mybucket

In this way, the image name doubles as a reference to the command-line tool.

The ENTRYPOINT instruction can also be used together with a helper script, so that the image works like the command above even when starting the tool requires more than one step.

For example, the official Postgres image uses the following script as its ENTRYPOINT:

#!/bin/bash
set -e

if [ "$1" = 'postgres' ]; then
    chown -R postgres "$PGDATA"

    if [ -z "$(ls -A "$PGDATA")" ]; then
        gosu postgres initdb
    fi

    exec gosu postgres "$@"
fi

exec "$@"

Note: This script uses Bash’s built-in command exec, so the last process to run is the process with the container’s PID 1. In this way, the process can receive any Unix signal sent to the container.

The helper script is copied into the container and run through ENTRYPOINT when the container starts:

COPY ./docker-entrypoint.sh /
ENTRYPOINT ["/docker-entrypoint.sh"]

The script lets users interact with Postgres in several ways.

It can simply start Postgres:

$ docker run postgres

It can also run Postgres and pass arguments to the server:

$ docker run postgres postgres --help

Finally, it can start a completely different tool, such as Bash:

$ docker run --rm -it postgres bash

VOLUME

The VOLUME instruction is used to expose any database storage area, configuration storage, or files and directories created by the container. You are strongly encouraged to use VOLUME for any mutable or user-serviceable parts of your image.
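
As a minimal sketch, the Dockerfile below declares a volume for a directory that holds mutable data. The alpine:3.19 base image and the /data path are hypothetical, chosen only for illustration.

```dockerfile
FROM alpine:3.19
# /data is a hypothetical directory holding the container's mutable state.
RUN mkdir -p /data
# Declare the directory as a volume so that its contents live outside the image layers.
VOLUME /data
```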

USER

If a service can run without privileges, use the USER instruction to switch to a non-root user. Start by creating the user and group in the Dockerfile with something like RUN groupadd -r postgres && useradd -r -g postgres postgres.

Note: The UID/GID assigned to users and groups in an image is non-deterministic, so the next rebuild of the image may assign a different UID/GID. If you rely on a particular UID/GID, assign it explicitly.

You should avoid using sudo, because its unpredictable TTY and signal-forwarding behavior can cause more problems than it solves. If you really need functionality similar to sudo (for example, initializing a daemon as root but running it as a non-root user), you can use gosu.

Finally, to reduce complexity and the number of layers, avoid switching USER back and forth frequently.
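
The recommendations above can be sketched as follows, with an explicitly assigned UID/GID. The debian:bookworm-slim base image and the ID 999 are illustrative assumptions.

```dockerfile
# debian:bookworm-slim is an illustrative base image.
FROM debian:bookworm-slim
# Create a system group and user with explicit IDs (999 is an arbitrary illustrative value),
# so that the UID/GID stays the same across rebuilds.
RUN groupadd -r -g 999 postgres && useradd -r -u 999 -g postgres postgres
# Switch to the non-root user once; later instructions and the container process run as postgres.
USER postgres
# Print the effective user and group to confirm the switch.
CMD ["id"]
```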

WORKDIR

For clarity and reliability, you should always use an absolute path for WORKDIR. In addition, you should use WORKDIR instead of instructions like RUN cd ... && do-something, which are hard to read, troubleshoot, and maintain.
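
A brief sketch of WORKDIR with an absolute path follows. The alpine:3.19 base image and the /usr/src/app path are illustrative assumptions.

```dockerfile
FROM alpine:3.19
# Absolute path; WORKDIR creates the directory if it does not already exist.
WORKDIR /usr/src/app
# Later instructions run relative to /usr/src/app, with no need to chain "cd" in RUN.
COPY . .
RUN ls -la
```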