
container-best-practices's Introduction

Container Best Practices Guide


A collaborative project to document container-based application architecture, creation and management. The built version of this source can be viewed at http://docs.projectatomic.io/container-best-practices/

Contributing

Please refer to the asciidoc user's guide: http://asciidoctor.org/docs/asciidoc-writers-guide/

Before submitting a pull request:

  1. Compile the document: make html
  2. Run make check to spell-check the documents. Add any non-English words to the personal dictionary file containers.dict.

Working with files

Compile docs from a container

Create local index.html file

sudo docker run --rm -it -v `pwd`:/documents/:Z asciidoctor/docker-asciidoctor make html

Clean up files

make clean

This removes all generated files.

Publish

make publish

GitHub serves HTML documents from the gh-pages branch. This command pushes to that branch.

Spell check

aspell is used to spell-check the document. This runs in a Travis job for all pull requests. Add any non-English words to containers.dict in your commit, and ensure make check passes.
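For reference, a minimal sketch of what the make check target might run; the actual Makefile rule may differ, and the file glob is illustrative:

# Hypothetical spell-check pass; the real "make check" rule may differ.
# "aspell list" reads text on stdin and prints any words it cannot find
# in the dictionary, including the personal one passed via --personal.
cat *.adoc | aspell --personal=./containers.dict --lang=en list | sort -u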

container-best-practices's People

Contributors

adelton, adimania, aweiteka, baude, cgwalters, dav1x, eliskasl, goern, hhorak, imcleod, jmtd, lslah, matejak, mfojtik, praiskup, rpitonak, scollier, tchughesiv, tomastomecek


container-best-practices's Issues

Enable different distributions of CBP

The CBP content should be packaged for different audiences. For example, the Fedora upstream community would like a set of content customized for their needs. Add asciibinder support to enable tagging of different content.

This involves reorganizing some content into logical files so topics are easily managed. It also involves enabling Asciibinder for the project and updating documentation for how to contribute to the guide.

Creating: Updating Software supplied by Base Image

I disagree with this quite strongly. I don't think this is a general best practice. I also think this section needs to be broken into two sections:

  1. Creating Base Images
  2. Creating Layered Images

There are very different strategies for each:

  1. Base Images: operations teams probably create base images, and they absolutely want to update those base images before they publish them. There is no other way to do it. Furthermore, the ops team may start with rhel7 and create rhel7-ourcorebuild. In this scenario they will be creating a layered image, but may want to squash it. Either way, they want to do a yum update, and from a security perspective should be advised to do so.
  2. Layered Images: I also disagree that you shouldn't do a yum update as part of a Dockerfile which builds a layered image. It's EVERYWHERE for a reason. Upstream vendors, partners, friends, family, and anyone else that makes images suck!!! It's like letting family borrow money: it always puts you in a bad place. End users absolutely need to be able to run yum update (and be told that this is OK) when their upstreams burn them (which they will). There will always be some excuse why the upstream hasn't updated an image (build system broken, can't automate for some reason, licensing, who knows), but the downstream user should never be blocked by this. If it breaks, that is a bug. If the upstream does something that is fragile, that is an anti-pattern. If you build an image with the assumption that others will consume it, you had better be able to do a yum update (a minimal sketch follows below).
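A minimal Dockerfile sketch of the layered-image case; the base image and package names are illustrative only:

FROM rhel7
# Pick up any security fixes the upstream image is missing, then clean
# the yum cache so the update does not bloat the layer.
RUN yum -y update && yum clean all
RUN yum -y install httpd && yum clean all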

Emphasize that content is very important in containers and that containers are not VMs

When talking about containers, content is very important. This matters especially when we compare Linux container technology with a classic virtual machine. Both are forms of virtualization used to isolate separate applications, often called microservices, but containers cannot be treated the same as virtual machines.

The big difference between Linux containers and virtual machines is the guest operating system, which is entirely missing in the container case, because all containers share the kernel with the host.

That makes containers much more efficient, but the shared kernel also means that a security flaw in the host kernel is a potential door out of the container, one that may affect other containers or the host itself.

Container patterns

Add info about container patterns (sidecar, ambassador, adapter...) or an overview with links to appropriate resources.
Could fit into Application Planning.
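As a starting point, here is a hedged sketch of the sidecar pattern expressed with plain docker commands; the image names are placeholders, and in Kubernetes the same idea is two containers in one pod:

# Run the application container with a volume holding its logs.
docker run -d --name app -v /var/log/app myapp
# Sidecar: a second container sharing the app's volumes and network
# namespace, shipping the logs elsewhere.
docker run -d --name app-logger --volumes-from app --net container:app mylogshipper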

help.md file for each container

I think each container should provide the user with a help page.
Fedora has a requirement for providing such a file: https://fedoraproject.org/wiki/Container:Guidelines#Help_File

The help.md file should contain some required fields like:

  • image name
  • maintainer
  • name
  • description
  • usage
  • an ENVIRONMENT VARIABLES section, if the container uses environment variables
  • a SECURITY IMPLICATIONS section, if the container exposes ports

e.g. help.md for memcached:

% MEMCACHED(1) Container Image Pages
% Petr Hracek
% February 6, 2017

# NAME
memcached - Memcached is a high-performance, distributed memory object caching system

# DESCRIPTION
Memcached is a high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load.

The container itself consists of:
    - fedora/f26 base image
    - memcached RPM package

Files added to the container during docker build include: /files/memcached.sh

# USAGE
To get the memcached container image on your local system, run the following:

    docker pull docker.io/modularitycontainers/memcached

  
# ENVIRONMENT VARIABLES

The image recognizes the following environment variables that you can set
during initialization by passing `-e VAR=VALUE` to the docker run command.

|     Variable name        |       Description                                           |
| :----------------------- | ----------------------------------------------------------- |
| `MEMCACHED_DEBUG_MODE`   | Increases verbosity for server and client. Parameter is -vv |
| `MEMCACHED_CACHE_SIZE`   | Sets the size of RAM to use for item storage (in megabytes) |
| `MEMCACHED_CONNECTIONS`  | The max simultaneous connections; default is 1024           |
| `MEMCACHED_THREADS`      | Sets number of threads to use to process incoming requests  |

        
# SECURITY IMPLICATIONS
Lists security-related attributes that are exposed to the host.

-p 11211:11211
    Opens container port 11211 and maps it to the same port on the host.

# SEE ALSO
Memcached page
<https://memcached.org/>

atomic CLI

Expand section Explicit Initialization - Atomic CLI

Standardized documentation for images

RHSCL images already use this practice: when an image is run with the command container-usage, it prints a usage message.

$> sudo docker run <image> container-usage
This image provides....

Basic usage:
   docker run <image> -e MYVAR=variable

More information at http://...

We should document this and enforce it in all images.
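A hedged sketch of how an image might wire this up; the script path and name are illustrative, not the exact RHSCL implementation:

# Dockerfile fragment: install a small script on the PATH so that
# "docker run <image> container-usage" finds and executes it. The
# script itself just prints the usage text shown above.
COPY container-usage /usr/local/bin/container-usage
RUN chmod +x /usr/local/bin/container-usage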

Create images as small as possible

Several tricks help keep container images small. One is to skip installing documentation:

RUN yum -y --setopt=tsflags=nodocs install postgresql-server

Another is to clean the cache that yum creates during the install, because it can be quite big:

RUN yum -y --setopt=tsflags=nodocs install postgresql-server && yum clean all

We should also always rebuild images from scratch rather than updating existing images, because adding more layers on top only makes the image bigger.

some feedback on the help.md/.1 guidelines

Hi, I've been looking at this for the middleware images looked after by Cloud Enablement, and I have some early feedback. I will revise this issue/comments as my feedback develops.

It seems an interesting design choice to make the canonical in-image help file the roff/nroff/troff-formatted "help.1" file, especially if the intention is for the help information to be consumed by various other systems and republished. This appears to have been strongly influenced by the first (and so far, only?) tool to consume it, "atomic help", which (I presume) acts as a thin pipe through to man(1). You don't stipulate precisely which roff language variant should be used here; I guess you are assuming the host system uses GNU man and whatever that implies for the format, but it would be good to state it.

I would have thought the Markdown-formatted version of the file was more useful for other tools to parse, but on my reading of the guidelines it is strictly optional. If someone provides a .md file, it will be converted into the help.1 file by atomic reactor. But if you instead explicitly provided a help.1 file and added it to / yourself, it should still work (right?). Hence, the Markdown file is not mandated by the guidelines.

Why would anyone do that, you might ask? Well, in our case we are likely to be auto-generating the help page via a template. We use cekit to generate our Dockerfiles and associated artefacts from an input YAML file. We already template the Dockerfile, so we would likely template the help file too. And if we're doing that, we could skip Markdown and go straight to the help.1. The main stated reason for providing the Markdown in image sources is to have it rendered nicely when the image sources are browsed on GitHub, but since we would be generating these files, they would not be committed to our image sources anyway.

The other problem with the use of Markdown in these guidelines is that you don't strictly specify which version of Markdown you actually mean. And we can't assume some safe subset of all popular implementations, because your examples include a construction like this:

MYSQL_PASSWORD=mypass
                The password set for the current MySQL user.

which is not a semantically meaningful construction in many Markdown implementations (including, I think, GitHub-Flavored Markdown), and a sequence of these will just be a run-on sentence.

While investigating exactly which Markdown you are implying, I chased down the toolchain: atomic uses go-md2man (which states "** Work in Progress ** This still needs a lot of help to be complete, or even usable!" in its README), and go-md2man seems to punt the issue to its backend, blackfriday, which says "markdown with common extensions". Again, which Markdown? The original? CommonMark?

So from a generator POV we sidestep all the issues of Markdown version clarity by going straight to roff, and we are left with just the issues of roff/troff/nroff ambiguity. But I can't help thinking that the original intention of the guidelines may have been for the Markdown itself to be canonical, since it might be easier for other tools to consume than roff, and it has just ended up a bit backwards because the first concrete tool turned out to be "atomic help".
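For what it's worth, the conversion step itself is a one-liner, assuming go-md2man's documented flags:

# Convert the Markdown help file to the roff help.1 that "atomic help"
# ultimately feeds to man(1).
go-md2man -in help.md -out help.1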

Consider using the label/annotation specs from OCI

Hi,

I don't know what your relationship to the OCI is, but when I first saw #labels, I immediately thought it would make sense to align your best practices with https://github.com/opencontainers/image-spec/blob/master/annotations.md

Another labelling approach (https://github.com/label-schema) has already deprecated its standard in favour of OCI. At https://github.com/opencontainers/image-spec/blob/master/annotations.md#back-compatibility-with-label-schema you can even see how pragmatically they documented the migration path from label-schema to OCI.
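For example, a Dockerfile could carry the OCI annotation keys as labels; the values below are placeholders:

# Illustrative LABELs using annotation keys from the OCI image-spec.
LABEL org.opencontainers.image.title="My Image" \
      org.opencontainers.image.vendor="Example, Inc." \
      org.opencontainers.image.version="1.0" \
      org.opencontainers.image.source="https://example.com/my-image.git"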

Cheers

Tobias

production instance is down: 503

:<

$ curl -v http://docs.projectatomic.io/container-best-practices/
*   Trying 54.87.143.15...
* TCP_NODELAY set
* Connected to docs.projectatomic.io (54.87.143.15) port 80 (#0)
> GET /container-best-practices/ HTTP/1.1
> Host: docs.projectatomic.io
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 503 Service Temporarily Unavailable
< Date: Mon, 12 Feb 2018 14:44:55 GMT
< Content-Length: 411
< Connection: close
< Content-Type: text/html; charset=iso-8859-1
<
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>503 Service Temporarily Unavailable</title>
</head><body>
<h1>Service Temporarily Unavailable</h1>
<p>The server is temporarily unable to service your
request due to maintenance downtime or capacity
problems. Please try again later.</p>
<hr>
<address>Apache/2.2.15 (Red Hat) Server at docs.projectatomic.io Port 80</address>
</body></html>
* Closing connection 0

Also, no TLS?

$ curl -v https://docs.projectatomic.io/container-best-practices/
*   Trying 54.87.143.15...
* TCP_NODELAY set
* Connected to docs.projectatomic.io (54.87.143.15) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: C=US; ST=North Carolina; L=Raleigh; O=Red Hat Inc.; CN=*.rhcloud.com
*  start date: Apr  7 00:00:00 2015 GMT
*  expire date: Apr 11 12:00:00 2018 GMT
*  subjectAltName does not match docs.projectatomic.io
* SSL: no alternative certificate subject name matches target host name 'docs.projectatomic.io'
* stopped the pause stream!
* Closing connection 0
* TLSv1.2 (OUT), TLS alert, Client hello (1):
curl: (51) SSL: no alternative certificate subject name matches target host name 'docs.projectatomic.io'

Is mounting /etc/localtime to a container OK without also mounting /etc/timezone?

I found your tip about mounting the host's /etc/localtime into the container, but it looks like Debian distros don't depend on /etc/localtime being a symlink to the tzdata file; they use the /etc/timezone file to identify the timezone. So is it OK to mount /etc/localtime from the host into the container without also mounting /etc/timezone? Even if the OS did still depend on /etc/localtime being a symlink, it doesn't seem that this symlink association would be visible to the container.
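If both files matter on the distro in question, a hedged sketch of mounting the pair read-only (the image name is a placeholder):

# Debian-family userlands may read /etc/timezone while glibc reads
# /etc/localtime, so mount both, read-only.
docker run -v /etc/localtime:/etc/localtime:ro \
           -v /etc/timezone:/etc/timezone:ro \
           myimage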

Hosted/built version cert: NET::ERR_CERT_COMMON_NAME_INVALID

Just a moment ago, the hosted/built version of these docs was working for me, but now I get the NET::ERR_CERT_COMMON_NAME_INVALID HTTPS/TLS certificate error:

SSL Server Certificate
Common Name (CN)	*.b9ad.pro-us-east-1.openshiftapps.com
Organization (O)	<Not Part Of Certificate>
Organizational Unit (OU)	<Not Part Of Certificate>
Common Name (CN)	R3
Organization (O)	Let's Encrypt
Organizational Unit (OU)	<Not Part Of Certificate>
Issued On	Monday, March 22, 2021 at 9:32:05 PM
Expires On	Sunday, June 20, 2021 at 9:32:05 PM
SHA-256 Fingerprint	A6 A2 60 C4 9D 4A 5A EA 5A 50 58 04 92 C7 2D 6A 49 6F F2 A2 D1 74 83 54 87 0A 1B ED 4E 86 37 6B
SHA-1 Fingerprint	D2 60 D7 BA B4 F7 06 93 1A FC C7 9C 86 C7 31 1A 4E 0A 95 AF

consider LABEL Version, Vendor

container-best-practices/content/dockerfile_instructions/dockerfile_instructions.adoc does not talk about Version or Vendor; maybe it's worth adding.
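Something like the following, with placeholder values:

# Illustrative only; the value strings are placeholders.
LABEL Version="1.0" \
      Vendor="Example, Inc."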

Preparing apps for containerization

In the Best Practices list, perhaps I've missed it, but I don't see much in the way of architectural guidelines:

  • Decomposing your (host-based) application/service
  • Exposing and defining container interfaces
    • network
    • shared files
  • Container startup
    • avoid sequencing: images must poll and self-assemble (see the sketch after this list)
    • balance in-container polling and Kube restart-on-exit looping
    • detecting uninitialized resources, initializing them, handling race conditions
  • Passing in parameters
    • environment variables
    • container startup CLI
    • host files (via volume)
    • config/setup tarballs (from remote)

Others?
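On the "poll and self-assemble" point, a minimal entrypoint sketch; the host, port, and service path are placeholders:

#!/bin/sh
# Poll for a dependency instead of assuming start order, then exec
# the real service so it becomes PID 1.
until nc -z db.example.local 5432; do
    echo "waiting for database..."
    sleep 2
done
exec /usr/bin/myservice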

Use reproducible builds

The following is an example of how we can create a very simple container:

First, we must install the docker daemon and start it:

#> yum install -y docker
#> systemctl start docker

Then we can download an image to use as the base of our own. Let's use something we trust, for example Red Hat Enterprise Linux 7.2:

#> docker pull rhel7:7.2

Then, one way to create an image is simply to run a container:

#> docker run -ti --name mycont rhel7:7.2 bash

Then do some reasonable work, like creating some content -- in this case a file:

[root@a1eefecdacfa /]# echo Hello Dojo > /root/greeting
[root@a1eefecdacfa /]# exit

And finally, using docker commit we can create an image:

#> docker commit mycont
0bdcfc5ba0602197e2ac4609b8101dc8eaa0d8ab114f542ab6b2f15220d0ab22

The long hash identifies the image, and we can use it to run containers. However, once we decided to do something more complicated, we would have trouble making it again.

So, we should always work in a way that we can reproduce, which means using Dockerfiles instead.

This example produces the same result as the one before, except that we can repeat it as many times as we want and always get the same output. It also helps in understanding docker more as a packaging format than anything else:

#> cat Dockerfile
FROM rhel7:7.2
RUN echo Hello Dojo > /root/greeting

#> docker build .

Running help/usage as the default command

In the Starting your application section we already mention running a script as one of the ways of starting an application. But it is also possible to just output usage information when the app is executed in a particular way, as the language images used with the s2i tool do, for example.
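A hedged Dockerfile sketch of that approach; the script name and path are illustrative:

# Make usage output the default when the image is run without
# arguments; running the actual app then requires an explicit command.
COPY usage /usr/local/bin/usage
CMD ["/usr/local/bin/usage"]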

systemd documentation

Need to cover managing systemd services inside a container, and
using systemd on the host to manage a (probably SPC) container as a
service.

Both are valuable, and we should clearly distinguish between the two.

Also, both could do with a bit of testing to make sure that we really
know what works and what does not in each case. E.g., ISTR we had a
number of issues come up when trying to run the IPA container with an
internal systemd --- there are interactions with --pid=host,
expectations around orphan handling, etc.

But I also think we need a much higher-level section on what app
containerisation broadly looks like --- i.e., why you don't need a
systemd at all by default, and how you generally want to set up the CMD
to point directly at the daemon/service in the container; why it's
important to keep persistent data separate in the container; what the
container update lifecycle looks like; etc.
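For the host-side case, a hedged sketch of a systemd unit managing a container as a service; the image and unit names are placeholders:

# /etc/systemd/system/myapp-container.service -- illustrative only.
[Unit]
Description=myapp container
After=docker.service
Requires=docker.service

[Service]
ExecStartPre=-/usr/bin/docker rm -f myapp
ExecStart=/usr/bin/docker run --rm --name myapp myimage
ExecStop=/usr/bin/docker stop myapp

[Install]
WantedBy=multi-user.target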

Potential content for Fedora: https://vpavlin.eu/2015/02/fedora-docker-and-systemd/
