Learning Docker

Fri Apr 15 05:33:38 UTC 2022

Introduction

Docker is an open source project that allows one to pack, ship, and run any application as a lightweight container. Docker containers are analogous to shipping containers, which provide a standard and consistent way of shipping just about anything. The container includes everything that is needed for an application to run, including the code, system tools, and the necessary dependencies. If you want to test an application, all you need to do is download the Docker image and run it in a new container. No more compiling and installing missing dependencies!

The overview at https://docs.docker.com/ provides more information. For a more hands-on approach, check out Enough Docker to be Dangerous and this short workshop that I prepared for BioC Asia 2019.

This README was generated by GitHub Actions using the R Markdown file readme.Rmd, which was executed via the create_readme.sh script.

Installing the Docker Engine

To get started, you will need to install the Docker Engine; check out this guide.

Checking your installation

To see if everything is working, try to obtain the Docker version.

docker --version

## Docker version 20.10.14+azure-1, build a224086349269551becacce16e5842ceeb2a98d6

And run the hello-world image. (The --rm parameter is used to automatically remove the container when it exits.)

docker run --rm hello-world

## Unable to find image 'hello-world:latest' locally
## latest: Pulling from library/hello-world
## 2db29710123e: Pulling fs layer
## 2db29710123e: Download complete
## 2db29710123e: Pull complete
## Digest: sha256:10d7d58d5ebd2a652f4d93fdd86da8f265f5318c6a73cc5b6a9798ff6d2b2e67
## Status: Downloaded newer image for hello-world:latest
## 
## Hello from Docker!
## This message shows that your installation appears to be working correctly.
## 
## To generate this message, Docker took the following steps:
##  1. The Docker client contacted the Docker daemon.
##  2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
##     (amd64)
##  3. The Docker daemon created a new container from that image which runs the
##     executable that produces the output you are currently reading.
##  4. The Docker daemon streamed that output to the Docker client, which sent it
##     to your terminal.
## 
## To try something more ambitious, you can run an Ubuntu container with:
##  $ docker run -it ubuntu bash
## 
## Share images, automate workflows, and more with a free Docker ID:
##  https://hub.docker.com/
## 
## For more examples and ideas, visit:
##  https://docs.docker.com/get-started/

Basics

The two guides linked in the introduction section provide some information on the basic commands but I’ll include some here as well. One of the main reasons I use Docker is for building tools. For this purpose, I use Docker like a virtual machine, where I can install whatever I want. This is important because I can do my testing in an isolated environment and not worry about affecting the main server. I like to use Ubuntu because it’s a popular Linux distribution and therefore whenever I run into a problem, chances are higher that someone else has had the same problem, asked a question on a forum, and received a solution.

Before we can run Ubuntu using Docker, we need an image. We can obtain an Ubuntu image from the official Ubuntu image repository from Docker Hub by running docker pull.

docker pull ubuntu:18.04

## 18.04: Pulling from library/ubuntu
## Digest: sha256:982d72c16416b09ffd2f71aa381f761422085eda1379dc66b668653607969e38
## Status: Image is up to date for ubuntu:18.04
## docker.io/library/ubuntu:18.04

To run Ubuntu using Docker, we use docker run.

docker run --rm ubuntu:18.04 cat /etc/os-release

## NAME="Ubuntu"
## VERSION="18.04.6 LTS (Bionic Beaver)"
## ID=ubuntu
## ID_LIKE=debian
## PRETTY_NAME="Ubuntu 18.04.6 LTS"
## VERSION_ID="18.04"
## HOME_URL="https://www.ubuntu.com/"
## SUPPORT_URL="https://help.ubuntu.com/"
## BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
## PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
## VERSION_CODENAME=bionic
## UBUNTU_CODENAME=bionic

You can work interactively with the Ubuntu image by specifying the -it option.

docker run --rm -it ubuntu:18.04 /bin/bash

You may have noticed that I keep using the --rm option, which removes the container once you quit. If you don’t use this option, the container is saved up until the point that you exit; all changes you made, files you created, etc. are saved. Why am I deleting all my changes? Because there is a better (and more reproducible) way to make changes to the system and that is by using a Dockerfile.

Start containers automatically

When hosting a service using Docker (such as running RStudio Server), it would be nice if the container automatically started up again when the server (and Docker) restarts. If you use the --restart flag with docker run, Docker will restart your container when it has exited or when Docker restarts. The value of the --restart flag can be one of the following:

no - do not automatically restart (default)
on-failure[:max-retries] - restarts if it exits due to an error (non-zero exit code) and the number of attempts is limited using the max-retries option
always - always restarts the container; if it is manually stopped, it is restarted only when the Docker daemon restarts (or when the container is manually restarted)
unless-stopped - similar to always but when the container is stopped, it is not restarted even after the Docker daemon restarts.

docker run -d \
   --restart always \
   -p 8888:8787 \
   -e PASSWORD=password \
   -e USERID=$(id -u) \
   -e GROUPID=$(id -g) \
   rocker/rstudio:4.1.2
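
The restart policy of a container that already exists can also be changed without recreating it. The sketch below wraps docker update and docker inspect in a small helper; the function name set_restart_policy is made up for illustration, and rstudio_dtang is only the example container name used later in this README.

```shell
# Hypothetical helper (not part of Docker): change the restart policy of an
# existing container and print the policy that is now in effect.
set_restart_policy() {
    local container=$1
    local policy=$2
    docker update --restart "${policy}" "${container}" > /dev/null
    docker inspect -f '{{.HostConfig.RestartPolicy.Name}}' "${container}"
}

# example usage (assumes a container named rstudio_dtang exists):
# set_restart_policy rstudio_dtang unless-stopped
```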

Dockerfile

A Dockerfile is a text file that contains instructions for building Docker images. A Dockerfile adheres to a specific format and set of instructions, which are described in the Dockerfile reference. There is also a best practices guide for writing Dockerfiles.

I have an example Dockerfile that uses the Ubuntu 18.04 image to build BWA, a popular short read alignment tool used in bioinformatics.

cat Dockerfile

## FROM ubuntu:18.04
## 
## MAINTAINER Dave Tang <[email protected]>
## 
## LABEL source="https://github.com/davetang/learning_docker/blob/main/Dockerfile"
## 
## RUN apt-get clean all && \
##     apt-get update && \
##     apt-get upgrade -y && \
##     apt-get install -y \
##      build-essential \
##      wget \
##      zlib1g-dev && \
##     apt-get clean all && \
##     apt-get purge && \
##     rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
## 
## RUN mkdir /src && \
##     cd /src && \
##     wget https://github.com/lh3/bwa/releases/download/v0.7.17/bwa-0.7.17.tar.bz2 && \
##     tar xjf bwa-0.7.17.tar.bz2 && \
##     cd bwa-0.7.17 && \
##     make && \
##     mv bwa /usr/local/bin && \
##     cd && rm -rf /src
## 
## WORKDIR /work
## 
## CMD ["bwa"]

ARG

To define variables in your Dockerfile, use ARG name=value. For example, you can use ARG to create a new variable that stores the version number of a program. When a new version of the program is released, you can simply change the ARG and rebuild your image.

ARG star_ver=2.7.10a
RUN cd /usr/src && \
    wget https://github.com/alexdobin/STAR/archive/refs/tags/${star_ver}.tar.gz && \
    tar xzf ${star_ver}.tar.gz && \
    rm ${star_ver}.tar.gz && \
    cd STAR-${star_ver}/source && \
    make STAR && \
    cd /usr/local/bin && \
    ln -s /usr/src/STAR-${star_ver}/source/STAR .
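
An ARG default can also be overridden at build time with --build-arg, so the Dockerfile itself does not need to be edited. A minimal sketch, assuming the Dockerfile containing the ARG above is in the current directory; the helper name build_star and the image name star are placeholders made up for this example.

```shell
# Hypothetical helper: build the image for a given STAR version by overriding
# the star_ver ARG. The image name "star" is a placeholder.
build_star() {
    local ver=$1
    docker build --build-arg star_ver="${ver}" -t "star:${ver}" .
}

# example usage:
# build_star 2.7.10a
```

Keep in mind that an ARG is only visible in the build stage where it is declared, so it has to appear after FROM to be usable in a RUN instruction.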

CMD

The CMD instruction in a Dockerfile does not execute anything at build time but specifies the intended command for the image; there can only be one CMD instruction in a Dockerfile and if you list more than one CMD then only the last CMD will take effect. The main purpose of a CMD is to provide defaults for an executing container.

ENTRYPOINT

An ENTRYPOINT allows you to configure a container that will run as an executable. ENTRYPOINT has two forms:

ENTRYPOINT ["executable", "param1", "param2"] (exec form, preferred)
ENTRYPOINT command param1 param2 (shell form)

FROM ubuntu
ENTRYPOINT ["top", "-b"]
CMD ["-c"]
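
With the Dockerfile above, CMD supplies default arguments to the ENTRYPOINT, and any arguments given to docker run replace the CMD. The shell function below is only a toy model of that rule for the exec form; it does not call Docker, and effective_command is a made-up name.

```shell
# Toy model (does not call Docker) of how an exec-form ENTRYPOINT is combined
# with CMD and with arguments passed to docker run.
# usage: effective_command "<entrypoint>" "<cmd>" [run-time arguments...]
effective_command() {
    local entrypoint=$1
    local cmd=$2
    shift 2
    if [ "$#" -gt 0 ]; then
        # arguments given to docker run replace CMD
        printf '%s\n' "${entrypoint} $*"
    else
        # no arguments: CMD is appended as the default
        printf '%s\n' "${entrypoint} ${cmd}"
    fi
}

effective_command "top -b" "-c"       # prints: top -b -c
effective_command "top -b" "-c" -H    # prints: top -b -H
```

In other words, docker run <image> would run top -b -c, while docker run <image> -H would run top -b -H.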

Use --entrypoint to override the ENTRYPOINT instruction.

docker run --entrypoint <executable> <image> [arguments]

Building an image

Use the build subcommand to build Docker images. By default Docker looks for a file named Dockerfile; use the -f parameter if your Dockerfile has a different name. The period at the end sets the build context to the current directory, which is also where Docker looks for the Dockerfile by default.

cat build.sh

## #!/usr/bin/env bash
## 
## set -euo pipefail
## 
## ver=0.7.17
## 
## docker build -t davetang/bwa:${ver} .

You can push the built image to Docker Hub if you have an account. I have used my Docker Hub account name to name my Docker image.

# use -f to specify the Dockerfile to use
# the period sets the build context to the current directory
docker build -f Dockerfile.base -t davetang/base .

# log into Docker Hub
docker login

# push to Docker Hub
docker push davetang/base

Renaming an image

Use docker image tag.

docker image tag old_image_name:latest new_image_name:latest
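
Note that docker image tag adds a second name to the same image rather than renaming it, so the old name stays around until it is removed. A sketch of a rename helper; rename_image is a hypothetical name, not a Docker command.

```shell
# Hypothetical helper: give an image a new name, then drop the old name.
# docker rmi only untags here because the image still has the new tag.
rename_image() {
    local old=$1
    local new=$2
    docker image tag "${old}" "${new}" && docker rmi "${old}"
}

# example usage:
# rename_image old_image_name:latest new_image_name:latest
```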

Running an image

See the docker run documentation for all available options.

docker run --rm davetang/bwa:0.7.17

## Unable to find image 'davetang/bwa:0.7.17' locally
## 0.7.17: Pulling from davetang/bwa
## feac53061382: Pulling fs layer
## 549f86662946: Pulling fs layer
## 5f22362f8660: Pulling fs layer
## 3836f06c7ac7: Pulling fs layer
## 3836f06c7ac7: Waiting
## 5f22362f8660: Verifying Checksum
## 5f22362f8660: Download complete
## 3836f06c7ac7: Verifying Checksum
## 3836f06c7ac7: Download complete
## feac53061382: Verifying Checksum
## feac53061382: Download complete
## feac53061382: Pull complete
## 549f86662946: Verifying Checksum
## 549f86662946: Download complete
## 549f86662946: Pull complete
## 5f22362f8660: Pull complete
## 3836f06c7ac7: Pull complete
## Digest: sha256:f0da4e206f549ed8c08f5558b111cb45677c4de6a3dc0f2f0569c648e8b27fc5
## Status: Downloaded newer image for davetang/bwa:0.7.17
## 
## Program: bwa (alignment via Burrows-Wheeler transformation)
## Version: 0.7.17-r1188
## Contact: Heng Li <[email protected]>
## 
## Usage:   bwa <command> [options]
## 
## Command: index         index sequences in the FASTA format
##          mem           BWA-MEM algorithm
##          fastmap       identify super-maximal exact matches
##          pemerge       merge overlapping paired ends (EXPERIMENTAL)
##          aln           gapped/ungapped alignment
##          samse         generate alignment (single ended)
##          sampe         generate alignment (paired ended)
##          bwasw         BWA-SW for long queries
## 
##          shm           manage indices in shared memory
##          fa2pac        convert FASTA to PAC format
##          pac2bwt       generate BWT from PAC
##          pac2bwtgen    alternative algorithm for generating BWT
##          bwtupdate     update .bwt to the new format
##          bwt2sa        generate SA from BWT and Occ
## 
## Note: To use BWA, you need to first index the genome with `bwa index'.
##       There are three alignment algorithms in BWA: `mem', `bwasw', and
##       `aln/samse/sampe'. If you are not sure which to use, try `bwa mem'
##       first. Please `man ./bwa.1' for the manual.

Resource usage

To restrict CPU usage use --cpus=n and use --memory= to restrict the maximum amount of memory the container can use.

We can confirm the limited CPU usage by running an endless while loop and using docker stats to confirm the CPU usage. Remember to use docker stop to stop the container after confirming the usage!

Restrict to 1 CPU.

# run in detached mode
docker run --rm -d --cpus=1 davetang/bwa:0.7.17 perl -le 'while(1){ }'

# check stats and use control+c to exit
docker stats
CONTAINER ID   NAME             CPU %     MEM USAGE / LIMIT   MEM %     NET I/O     BLOCK I/O   PIDS
8cc20bcfa4f4   vigorous_khorana   100.59%   572KiB / 1.941GiB   0.03%     736B / 0B   0B / 0B     1

docker stop 8cc20bcfa4f4

Restrict to 1/2 CPU.

# run in detached mode
docker run --rm -d --cpus=0.5 davetang/bwa:0.7.17 perl -le 'while(1){ }'

# check stats and use control+c to exit
docker stats

CONTAINER ID   NAME             CPU %     MEM USAGE / LIMIT   MEM %     NET I/O     BLOCK I/O   PIDS
af6e812a94da   unruffled_liskov   50.49%    584KiB / 1.941GiB   0.03%     736B / 0B   0B / 0B     1

docker stop af6e812a94da
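
The CPU and memory limits can be combined. The sketch below is a hypothetical wrapper (run_limited is not a Docker command) that starts a detached container with both caps; the 512m value in the usage line is an arbitrary example.

```shell
# Hypothetical helper: start a detached container with a CPU and a memory cap.
# usage: run_limited <cpus> <memory> <image> [command...]
run_limited() {
    local cpus=$1
    local memory=$2
    shift 2
    docker run --rm -d --cpus="${cpus}" --memory="${memory}" "$@"
}

# example usage: half a CPU and at most 512 MB of memory
# run_limited 0.5 512m davetang/bwa:0.7.17 perl -le 'while(1){ }'
# docker stats --no-stream   # print a single snapshot instead of a live view
```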

Copying files between host and container

Use docker cp to copy files, although I recommend mounting a volume to a Docker container instead (see next section).

docker cp --help

Usage:  docker cp [OPTIONS] CONTAINER:SRC_PATH DEST_PATH|-
        docker cp [OPTIONS] SRC_PATH|- CONTAINER:DEST_PATH

Copy files/folders between a container and the local filesystem

Options:
  -L, --follow-link   Always follow symbol link in SRC_PATH
      --help          Print usage

# find container name
docker ps -a

# create file to transfer
echo hi > hi.txt

docker cp hi.txt fee424ef6bf0:/root/

# start container
docker start -ai fee424ef6bf0

# inside container
cat /root/hi.txt 
hi

# create file inside container
echo bye > /root/bye.txt
exit

# transfer file from container to host
docker cp fee424ef6bf0:/root/bye.txt .

cat bye.txt 
bye

Sharing between host and container

Use the -v flag to mount directories to a container so that you can share files between the host and container.

In the example below, I am mounting data from the current directory (using the Unix command pwd) to /work in the container. I am working from the root directory of this GitHub repository, which contains the data directory.

ls data

## README.md
## chrI.fa.gz

Any output written to /work inside the container will be accessible inside data on the host. The command below will create BWA index files for data/chrI.fa.gz.

docker run --rm -v $(pwd)/data:/work davetang/bwa:0.7.17 bwa index chrI.fa.gz

## [bwa_index] Pack FASTA... 0.20 sec
## [bwa_index] Construct BWT for the packed sequence...
## [bwa_index] 3.55 seconds elapse.
## [bwa_index] Update BWT... 0.08 sec
## [bwa_index] Pack forward-only FASTA... 0.14 sec
## [bwa_index] Construct SA from BWT and Occ... 1.69 sec
## [main] Version: 0.7.17-r1188
## [main] CMD: bwa index chrI.fa.gz
## [main] Real time: 5.735 sec; CPU: 5.697 sec

We can see the newly created index files.

ls -lrt data

## total 30436
## -rw-r--r-- 1 runner docker      194 Apr 15 05:26 README.md
## -rw-r--r-- 1 runner docker  4772981 Apr 15 05:26 chrI.fa.gz
## -rw-r--r-- 1 root   root   15072516 Apr 15 05:33 chrI.fa.gz.bwt
## -rw-r--r-- 1 root   root    3768110 Apr 15 05:33 chrI.fa.gz.pac
## -rw-r--r-- 1 root   root         41 Apr 15 05:33 chrI.fa.gz.ann
## -rw-r--r-- 1 root   root         13 Apr 15 05:33 chrI.fa.gz.amb
## -rw-r--r-- 1 root   root    7536272 Apr 15 05:33 chrI.fa.gz.sa

File permissions

On newer versions of Docker, you may no longer have to worry about this. However, if you find that the files created inside your container on a mounted volume are owned by root, read on.

The files created inside the Docker container will be owned by root; inside the Docker container, you are root and the files you produce will have root permissions.

ls -lrt
total 2816
-rw-r--r-- 1 1211 1211 1000015 Apr 27 02:00 ref.fa
-rw-r--r-- 1 1211 1211   21478 Apr 27 02:00 l100_n100_d400_31_2.fq
-rw-r--r-- 1 1211 1211   21478 Apr 27 02:00 l100_n100_d400_31_1.fq
-rw-r--r-- 1 1211 1211     119 Apr 27 02:01 run.sh
-rw-r--r-- 1 root root 1000072 Apr 27 02:03 ref.fa.bwt
-rw-r--r-- 1 root root  250002 Apr 27 02:03 ref.fa.pac
-rw-r--r-- 1 root root      40 Apr 27 02:03 ref.fa.ann
-rw-r--r-- 1 root root      12 Apr 27 02:03 ref.fa.amb
-rw-r--r-- 1 root root  500056 Apr 27 02:03 ref.fa.sa
-rw-r--r-- 1 root root   56824 Apr 27 02:04 aln.sam

This is problematic because when you’re back in the host environment, you can’t modify these files. To circumvent this, create a user that matches the host user by passing three environment variables from the host to the container.

docker run -it \
-v ~/my_data:/data \
-e MYUID=`id -u` \
-e MYGID=`id -g` \
-e ME=`whoami` \
bwa /bin/bash

Use the steps below to create an identical user inside the container.

adduser --quiet --home /home/san/$ME --no-create-home --gecos "" --shell /bin/bash --disabled-password $ME

# optional: give yourself admin privileges
echo "%$ME ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers

# update the IDs to those passed into Docker via environment variable
sed -i -e "s/1000:1000/$MYUID:$MYGID/g" /etc/passwd
sed -i -e "s/$ME:x:1000/$ME:x:$MYGID/" /etc/group

# su - as the user
exec su - $ME

# run BWA again, after you have deleted the old files as root
bwa index ref.fa
bwa mem ref.fa l100_n100_d400_31_1.fq l100_n100_d400_31_2.fq > aln.sam

# check output
ls -lrt
total 2816
-rw-r--r-- 1 dtang dtang 1000015 Apr 27 02:00 ref.fa
-rw-r--r-- 1 dtang dtang   21478 Apr 27 02:00 l100_n100_d400_31_2.fq
-rw-r--r-- 1 dtang dtang   21478 Apr 27 02:00 l100_n100_d400_31_1.fq
-rw-r--r-- 1 dtang dtang     119 Apr 27 02:01 run.sh
-rw-rw-r-- 1 dtang dtang 1000072 Apr 27 02:12 ref.fa.bwt
-rw-rw-r-- 1 dtang dtang  250002 Apr 27 02:12 ref.fa.pac
-rw-rw-r-- 1 dtang dtang      40 Apr 27 02:12 ref.fa.ann
-rw-rw-r-- 1 dtang dtang      12 Apr 27 02:12 ref.fa.amb
-rw-rw-r-- 1 dtang dtang  500056 Apr 27 02:12 ref.fa.sa
-rw-rw-r-- 1 dtang dtang   56824 Apr 27 02:12 aln.sam

# exit container
exit

The files will be saved in ~/my_data on the host.

ls -lrt ~/my_data
total 2816
-rw-r--r-- 1 dtang dtang 1000015 Apr 27 10:00 ref.fa
-rw-r--r-- 1 dtang dtang   21478 Apr 27 10:00 l100_n100_d400_31_2.fq
-rw-r--r-- 1 dtang dtang   21478 Apr 27 10:00 l100_n100_d400_31_1.fq
-rw-r--r-- 1 dtang dtang     119 Apr 27 10:01 run.sh
-rw-rw-r-- 1 dtang dtang 1000072 Apr 27 10:12 ref.fa.bwt
-rw-rw-r-- 1 dtang dtang  250002 Apr 27 10:12 ref.fa.pac
-rw-rw-r-- 1 dtang dtang      40 Apr 27 10:12 ref.fa.ann
-rw-rw-r-- 1 dtang dtang      12 Apr 27 10:12 ref.fa.amb
-rw-rw-r-- 1 dtang dtang  500056 Apr 27 10:12 ref.fa.sa
-rw-rw-r-- 1 dtang dtang   56824 Apr 27 10:12 aln.sam

File Permissions 2

An easier way to set file permissions is to use the -u parameter.

# assuming blah.fa exists in /local/data/
docker run -v /local/data:/data -u `stat -c "%u:%g" /local/data` bwa bwa index /data/blah.fa
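
The same idea can be written with id instead of stat, which avoids depending on GNU stat (the -c option is not available in the BSD stat shipped with macOS). A minimal sketch; host_user is a made-up helper name, and it assumes you want the files to be owned by the user running the command rather than by the owner of the mounted directory.

```shell
# Hypothetical helper: print the current host user as uid:gid, the format
# expected by the -u option of docker run.
host_user() {
    printf '%s:%s\n' "$(id -u)" "$(id -g)"
}

host_user

# example usage:
# docker run --rm -u "$(host_user)" -v /local/data:/data bwa bwa index /data/blah.fa
```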

Read only

To mount a volume but with read-only permissions, append :ro at the end.

docker run --rm -v $(pwd):/work:ro davetang/bwa:0.7.17 touch test.txt

## touch: cannot touch 'test.txt': Read-only file system

Removing the image

Use docker rmi to remove an image. You will first need to remove any containers, including stopped ones, that were created from the image. Use docker ps -a to find stopped containers and docker rm to remove them.

Let’s pull the busybox image.

docker pull busybox

## Using default tag: latest
## latest: Pulling from library/busybox
## 50e8d59317eb: Pulling fs layer
## 50e8d59317eb: Verifying Checksum
## 50e8d59317eb: Download complete
## 50e8d59317eb: Pull complete
## Digest: sha256:d2b53584f580310186df7a2055ce3ff83cc0df6caacf1e3489bff8cf5d0af5d8
## Status: Downloaded newer image for busybox:latest
## docker.io/library/busybox:latest

Check out busybox.

docker images busybox

## REPOSITORY   TAG       IMAGE ID       CREATED        SIZE
## busybox      latest    1a80408de790   27 hours ago   1.24MB

Remove busybox.

docker rmi busybox

## Untagged: busybox:latest
## Untagged: busybox@sha256:d2b53584f580310186df7a2055ce3ff83cc0df6caacf1e3489bff8cf5d0af5d8
## Deleted: sha256:1a80408de790c0b1075d0a7e23ff7da78b311f85f36ea10098e4a6184c200964
## Deleted: sha256:eb6b01329ebe73e209e44a616a0e16c2b8e91de6f719df9c35e6cdadadbe5965
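
If docker rmi complains that an image is still in use, the containers created from it can be located with the ancestor filter of docker ps. A sketch under that assumption; remove_image is a hypothetical helper. It will still fail if one of the containers is running, because docker rm without -f only removes stopped containers, and the ancestor filter also matches containers started from images built on top of the given image.

```shell
# Hypothetical helper: remove the containers created from an image, then the
# image itself.
remove_image() {
    local image=$1
    local containers
    containers=$(docker ps -a -q --filter "ancestor=${image}")
    if [ -n "${containers}" ]; then
        # unquoted on purpose: one argument per container ID
        docker rm ${containers}
    fi
    docker rmi "${image}"
}

# example usage:
# remove_image busybox
```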

Committing changes

Generally, it is better to use a Dockerfile to manage your images in a documented and maintainable way but if you still want to commit changes to your container (like you would for Git), read on.

When you log out of a container, the changes made are still stored; type docker ps -a to see all containers and the latest changes. Use docker commit to commit your changes.

docker ps -a

# git style commit
# -a, --author=       Author (e.g., "John Hannibal Smith <[email protected]>")
# -m, --message=      Commit message
docker commit -m 'Made change to blah' -a 'Dave Tang' <CONTAINER ID> <image>

# use docker history <image> to check history
docker history <image>

Access running container

To access a container that is already running, perhaps in the background (using detached mode: docker run with -d) use docker ps to find the name of the container and then use docker exec.

In the example below, my container name is rstudio_dtang.

docker exec -it rstudio_dtang /bin/bash

Cleaning up exited containers

I typically use the --rm flag with docker run so that containers are automatically removed after I exit them. However, if you don’t use --rm, by default a container’s file system persists even after the container exits. For example:

docker run hello-world

## 
## Hello from Docker!
## This message shows that your installation appears to be working correctly.
## 
## To generate this message, Docker took the following steps:
##  1. The Docker client contacted the Docker daemon.
##  2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
##     (amd64)
##  3. The Docker daemon created a new container from that image which runs the
##     executable that produces the output you are currently reading.
##  4. The Docker daemon streamed that output to the Docker client, which sent it
##     to your terminal.
## 
## To try something more ambitious, you can run an Ubuntu container with:
##  $ docker run -it ubuntu bash
## 
## Share images, automate workflows, and more with a free Docker ID:
##  https://hub.docker.com/
## 
## For more examples and ideas, visit:
##  https://docs.docker.com/get-started/

Show all containers.

docker ps -a

## CONTAINER ID   IMAGE         COMMAND    CREATED        STATUS                              PORTS     NAMES
## c42e0288ce7a   hello-world   "/hello"   1 second ago   Exited (0) Less than a second ago             angry_joliot

We can use a sub-shell to get all (-a) container IDs (-q) that have exited (-f status=exited) and then remove them (docker rm -v).

docker rm -v $(docker ps -a -q -f status=exited)

## c42e0288ce7a

Check to see if the container still exists.

docker ps -a

## CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES

We can set this up as a Bash script so that we can easily remove exited containers. In the Bash script, -z returns true if $exited is empty (i.e. there are no exited containers), so the command is only run when $exited is not empty.

cat clean_up_docker.sh

## #!/usr/bin/env bash
## 
## set -euo pipefail
## 
## exited=`docker ps -a -q -f status=exited`
## 
## if [[ ! -z ${exited} ]]; then
##    docker rm -v $(docker ps -a -q -f status=exited)
## fi
## 
## exit 0
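
Docker also ships a built-in command for this kind of clean-up. As a sketch, the script above could be reduced to the function below; clean_exited is a made-up name, and there is one difference in behaviour to be aware of: docker container prune removes all stopped containers, not only those with the exited status.

```shell
# Hypothetical helper: remove all stopped containers in one go.
clean_exited() {
    # -f skips the confirmation prompt
    docker container prune -f
}

# example usage:
# clean_exited
```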

As I have mentioned, you can use the --rm parameter to automatically clean up the container and remove the file system when the container exits.

docker run --rm hello-world

## 
## Hello from Docker!
## This message shows that your installation appears to be working correctly.
## 
## To generate this message, Docker took the following steps:
##  1. The Docker client contacted the Docker daemon.
##  2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
##     (amd64)
##  3. The Docker daemon created a new container from that image which runs the
##     executable that produces the output you are currently reading.
##  4. The Docker daemon streamed that output to the Docker client, which sent it
##     to your terminal.
## 
## To try something more ambitious, you can run an Ubuntu container with:
##  $ docker run -it ubuntu bash
## 
## Share images, automate workflows, and more with a free Docker ID:
##  https://hub.docker.com/
## 
## For more examples and ideas, visit:
##  https://docs.docker.com/get-started/

No containers.

docker ps -a

## CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES

Installing Perl modules

Use cpanminus.

apt-get install -y cpanminus

# install some Perl modules
cpanm Archive::Extract Archive::Zip DBD::mysql

Creating a data container

This guide on working with Docker data volumes provides a really nice introduction. Use docker create to create a data container; the -v indicates the directory for the data container; the --name data_container indicates the name of the data container; and ubuntu is the image to be used for the container.

docker create -v /tmp --name data_container ubuntu

If we run a new Ubuntu container with the --volumes-from flag, output written to the /tmp directory will be saved to the /tmp directory of the data_container container.

docker run -it --volumes-from data_container ubuntu /bin/bash

R

Use images from The Rocker Project, for example rocker/r-ver:4.1.0.

docker run --rm rocker/r-ver:4.1.0

## Unable to find image 'rocker/r-ver:4.1.0' locally
## 4.1.0: Pulling from rocker/r-ver
## 7c3b88808835: Pulling fs layer
## 1cb7623729da: Pulling fs layer
## 480e79aa95a7: Pulling fs layer
## 3c84bfc13d4a: Pulling fs layer
## f0fbb0d3d0c7: Pulling fs layer
## 3c84bfc13d4a: Waiting
## f0fbb0d3d0c7: Waiting
## 1cb7623729da: Verifying Checksum
## 1cb7623729da: Download complete
## 3c84bfc13d4a: Verifying Checksum
## 3c84bfc13d4a: Download complete
## 7c3b88808835: Verifying Checksum
## 7c3b88808835: Download complete
## f0fbb0d3d0c7: Verifying Checksum
## f0fbb0d3d0c7: Download complete
## 7c3b88808835: Pull complete
## 480e79aa95a7: Verifying Checksum
## 480e79aa95a7: Download complete
## 1cb7623729da: Pull complete
## 480e79aa95a7: Pull complete
## 3c84bfc13d4a: Pull complete
## f0fbb0d3d0c7: Pull complete
## Digest: sha256:542eafcf270a1ec563b82c089650193335d59f0d8eb448cabc52b56fe47dc22d
## Status: Downloaded newer image for rocker/r-ver:4.1.0
## 
## R version 4.1.0 (2021-05-18) -- "Camp Pontanezen"
## Copyright (C) 2021 The R Foundation for Statistical Computing
## Platform: x86_64-pc-linux-gnu (64-bit)
## 
## R is free software and comes with ABSOLUTELY NO WARRANTY.
## You are welcome to redistribute it under certain conditions.
## Type 'license()' or 'licence()' for distribution details.
## 
##   Natural language support but running in an English locale
## 
## R is a collaborative project with many contributors.
## Type 'contributors()' for more information and
## 'citation()' on how to cite R or R packages in publications.
## 
## Type 'demo()' for some demos, 'help()' for on-line help, or
## 'help.start()' for an HTML browser interface to help.
## Type 'q()' to quit R.
## 
## >

Saving and transferring a Docker image

You should just share the Dockerfile used to create your image, but if you need another way to save and share an image, see this post on Stack Overflow.

docker save -o <save image to path> <image name>
docker load -i <path to image tar file>

Here’s an example.

# save on Unix server
docker save -o davebox.tar davebox

# copy file to MacBook Pro
scp [email protected]:/home/davetang/davebox.tar .

docker load -i davebox.tar 
93c22f563196: Loading layer [==================================================>] 134.6 MB/134.6 MB
...

docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
davebox             latest              d38f27446445        10 days ago         3.46 GB

docker run davebox samtools

Program: samtools (Tools for alignments in the SAM format)
Version: 1.3 (using htslib 1.3)

Usage:   samtools <command> [options]
...
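
When both machines are reachable over SSH, the intermediate tar file can be skipped by streaming the image. This is a sketch rather than the method from the Stack Overflow post; transfer_image is a hypothetical helper, user@remote is a placeholder, and it assumes that Docker and gunzip are installed on the remote host.

```shell
# Hypothetical helper: stream an image from this machine to a remote host.
transfer_image() {
    local image=$1
    local remote=$2
    # docker save writes a tar archive to stdout when -o is not given,
    # and docker load reads one from stdin when -i is not given
    docker save "${image}" | gzip | ssh "${remote}" 'gunzip | docker load'
}

# example usage:
# transfer_image davebox user@remote
```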

Pushing to Docker Hub

Create an account on Docker Hub; my account is davetang. Use docker login to log in and docker push to push to Docker Hub (run docker tag first if you didn’t name your image in the format of yourhubusername/newrepo).

docker login

# create repo on Docker Hub then tag your image
docker tag bb38976d03cf yourhubusername/newrepo

# push
docker push yourhubusername/newrepo

Tips

Tip from https://support.pawsey.org.au/documentation/display/US/Containers: each RUN, COPY, and ADD command in a Dockerfile generates another layer in the image, thus increasing its size; use multi-line commands and clean up package manager caches to minimise image size:

RUN apt-get update \
      && apt-get install -y \
         autoconf \
         automake \
         gcc \
         g++ \
         python \
         python-dev \
      && apt-get clean all \
      && rm -rf /var/lib/apt/lists/*
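
To see how much each instruction contributes to the final size, the layers of a built image can be listed with docker history. A small sketch; layer_sizes is a made-up wrapper, and the image name in the usage line is only an example.

```shell
# Hypothetical helper: list the layers of an image, one per line, as
# "<size>: <instruction that created the layer>".
layer_sizes() {
    local image=$1
    docker history --no-trunc --format '{{.Size}}: {{.CreatedBy}}' "${image}"
}

# example usage:
# layer_sizes davetang/bwa:0.7.17
```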

I have found it handy to mount my current directory to the same path inside a Docker container and to set it as the working directory; the directory will be automatically created inside the container if it does not already exist. When the container starts up, I will conveniently be in my current directory. In the command below I have also added the -u option, which sets the user to <name|uid>[:<group|gid>].

docker run --rm -it -u $(stat -c "%u:%g" ${HOME}) -v $(pwd):$(pwd) -w $(pwd) davetang/build:1.1 /bin/bash

Useful links

A quick introduction to Docker
The BioDocker project; check out their Wiki, which has a lot of useful information
The impact of Docker containers on the performance of genomic pipelines
Learn enough Docker to be useful
10 things to avoid in Docker containers
The Play with Docker classroom brings you labs and tutorials that help you get hands-on experience using Docker
Shifter enables container images for HPC
http://biocworkshops2019.bioconductor.org.s3-website-us-east-1.amazonaws.com/page/BioconductorOnContainers__Bioconductor_Containers_Workshop/
