
Comments (16)

baude commented on June 29, 2024

can you provide a link to the content you are objecting to?

fatherlinux commented on June 29, 2024

Yeah, sorry about that, I realized after I sent that I should have included a link:

https://github.com/projectatomic/container-best-practices/blob/master/creating/creating_index.adoc

eliskasl commented on June 29, 2024

Hey,

I'm just writing about the base images so I'll reorganize the whole chapter a bit.

About using yum update: this part was written about a year ago, and there have been many discussions since then, so it definitely needs to be rewritten. I'll try to get back to it soon in case nobody else beats me to it.

baude commented on June 29, 2024

@fatherlinux We have socialized and considered your input. Our advice to those who develop images (like ISVs or developers) is still that they should not update a provided base image to obtain new packages. It remains on the provider to keep those images fresh and updated, particularly when it comes to security updates. And again, remember this is a best practice and not a hard-line rule.

That said, we do feel that if users want to update images provided to them, it should be done of their own accord, and it should be done not by altering the image itself (nor a running container) but rather as part of a Dockerfile where they create their own image.
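
For illustration, a minimal sketch of what we mean -- the registry, image, and tag names here are just placeholders:

    # Build your own image on top of the provided base instead of
    # altering the provided image or a running container.
    FROM registry.example.com/rhel7-base:latest

    # If you choose to update, do it here, as part of your own build.
    RUN yum -y update && yum clean all

    # ...then layer your application on top of the image you now own.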

Given that the audience for this document is developers, I think our message still holds. But if you like, we could clarify in the appendix with a revision or summary of the blog post I shared.

scollier commented on June 29, 2024

+1 @baude

fatherlinux commented on June 29, 2024

Perhaps I misunderstand the use case? How does being a developer make a difference? Could you describe the persona and use case this guide is for? Are you saying that a "developer" would pull a base image from a core build (rhel7-ourcorebuild) that the ops team has already done the yum update on? I "might" agree in that scenario, but whenever cargo crosses a frontier (countries in real life, vendor to customer in software), the downstream is responsible. So to state that another way, I might agree within a single organization, but I can NOT agree when organizational boundaries are crossed...

I have literally had this conversation with hundreds of customers. I would be more than happy to arrange a call with a couple of big ops teams if you need evidence of what customers want, not what we think is philosophically right.

Only very, very rarely could a yum update break a downstream build. If it does, the developer has time to fix it (or bypass it, worst case) while doing the docker build. This is the whole magic of Docker over Puppet: you do the loading of the container at the factory instead of at the dock.

This is something we all need to come to agreement on, because I am publishing tons of articles around supply chain and they flatly disagree with this.

IMHO, never block someone downstream. Almost every CentOS Dockerfile out there starts with a "yum update"; this would buck that trend and I disagree pretty strongly...

fatherlinux commented on June 29, 2024

I need a better understanding of what you guys mean by:

  1. Developer
  2. What she is doing?
  3. What you consider a base image....

baude commented on June 29, 2024

Sure, in this document, a developer is the person who is writing a dockerfile (for their application) who inherits a base image. A base image, in general terms, is the minimal container operating system of a distribution like docker.io/centos.

On your last paragraph there, this is a best practice -- a recommendation of sorts. There is no switch to stop it. I think the reasons documented are justified but can be overridden as users wish.

fatherlinux commented on June 29, 2024

So, there is talk of doing funky stuff with layers to remove yum, which would make this mandatory, which is why it is so critical to get everybody on the same page.

mfojtik commented on June 29, 2024

@fatherlinux doing yum update in a Dockerfile comes with consequences:

  1. Each image you have in the registry will have a different base layer based on the date/time the yum update was called. That results in more storage usage in the docker registry.
  2. Updates might create inconsistency between images as you are losing track of what version is installed where (assuming you're not rebuilding all images at once).
  3. Having "yum update" basically says that the supplier of your base image sucks and fails to update it regularly for you. Doing so yourself in an upper layer is just a workaround for this issue, and the real fix should be "trust your image supplier".
  4. Having said that, the image supplier might spend some effort on testing the base image to provide the best experience. That is not always correlated with having the "latest" versions installed, as they might not be well tested.

fatherlinux commented on June 29, 2024

@mfojtik I will address each inline. Also tagging in @rhatdan :

  1. Each image you have in the registry will have a different base layer based on the date/time the yum update was called. That results in more storage usage in the docker registry.

I think I know what you are trying to say, but these are not called "base layers". I would refer to them as intermediate layers, but I think I understand what you are saying. I believe you are trying to say there is a Turing complete problem at the intermediary layer, and your logic would be correct. There are still a couple of problems.

This Turing complete problem exists for any RUN command, not just yum updates -- most critically it exists for "yum install" lines, which could add a lot more data to an intermediary layer. Worse yet is the fact that a yum install could cause any number of dependencies to be updated, which creates another Turing complete problem of an unknown/untested permutation of packages in the intermediary layer.

The safest method is to do a yum update to the latest and greatest provided by your upstream RPM repository (which should be a tested set of packages in a Satellite channel snapshot or Content View). While doing a yum update will create a new intermediary layer, its contents CAN be known if you have good package hygiene and make sure you always use a Satellite Content View or Channel Snapshot.

Then the operations team will have good control at that first intermediary layer. Honestly, if they are doing their job correctly, a developer's "yum update" should have no effect, so the recommendation becomes useless. It's a win/win and again, I think a useless recommendation.
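
As a rough sketch of what I mean -- the repo file name and Content View label below are made up for illustration:

    # Base the build on the provided image, but resolve packages against a
    # frozen Satellite Content View / channel snapshot, not a moving target.
    FROM registry.example.com/rhel7-base

    # Hypothetical repo file pointing at a specific, tested Content View.
    COPY content-view-2016-02.repo /etc/yum.repos.d/

    # The resulting intermediary layer is a known, tested package set.
    RUN yum -y update && yum clean all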

  2. Updates might create inconsistency between images as you are losing track of what version is installed where (assuming you're not rebuilding all images at once).

Incorrect, this problem should and would always be mitigated by repositories/Satellite channel snapshots/content views. In fact, a "yum -y update" is the ONLY way to know what is in the intermediary layer. Again, doing arbitrary updates of only certain packages (which is what this guide recommends) produces a Turing complete package set of arbitrary but unknown permutations at the intermediary layer.

Stated for clarity, this guide recommends that a developer only update certain packages, which would cause this problem, not mitigate it. It is a real problem, not solved by this recommendation, only solved with Satellite Content Views/Channel Snapshots.

  3. Having "yum update" basically says that the supplier of your base image sucks and fails to update it regularly for you. Doing so yourself in an upper layer is just a workaround for this issue, and the real fix should be "trust your image supplier".

I will not argue this point; it's philosophy, not science. I am going to write an article called "Why Michael Crosby is wrong".

Trusting a supplier is a myth. There was never trust with ISOs, and we are years away from some sort of manifest standard that would let me trust a docker base layer. Almost every operations team I have talked to is doing and will continue to do "yum updates". Sorry, this is science: "what are people doing?" vs. philosophy: "what people ought to do."

Again, ask people what they are doing now, don't tell them. You will gain a lot more wisdom.

  4. Having said that, the image supplier might spend some effort on testing the base image to provide the best experience. That is not always correlated with having the "latest" versions installed, as they might not be well tested.

Yeah, we (Red Hat) have this problem. It's why we delayed Docker 1.9. This also proves my point. What if the upstream supplier makes a decision that I don't like? What if they delay the release of a patch because it breaks some of "their" software, but not mine? Then I am screwed and can't get an update. The only way around this is to have freedom at each layer of the supply chain. Organizations MUST be allowed to do their own yum updates.

I feel like nobody arguing this really has a security background. You cannot limit organizations from updating RPM content; the content is getting produced for a reason. Docker is a shipping container; we still use barrels, boxes, bags, and crates inside the shipping container because that is typically what we are good at loading at the factory. The same is true with RPMs (for now)...

fatherlinux commented on June 29, 2024

Also, I want to make the simple observation that having more permutations on disk and hence using more disk space is not as big of a business problem as having an exploit in one of my applications. So, a yum update will always trump the permutations argument....

dav1x commented on June 29, 2024

I have a fair amount of experience with operations at scale, security scans (Qualys and the like), and multiple business sectors requiring vastly different images/builds.

Generally speaking, the security guys were the ones most concerned with packages being upgraded every time a release was pushed. The problem with this was that applications relied on specific versions and builds, and we couldn't update them every time a new package was released. There were always situations where a CVE would trump the norms.

That being said, IMO it makes more sense to specify the exact build you need in the FROM line of the Dockerfile so that the consumer has precise control over the package versions and application requirements. There will always be use cases where a "yum update -y" is required, but I believe more customers and consumers would be interested in having that precise control.
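
Something along these lines -- the tag and digest below are invented for the example:

    # Pin the exact base build instead of floating on :latest
    FROM registry.access.redhat.com/rhel7:7.2-35

    # Or pin by digest for even more precise control
    # FROM registry.access.redhat.com/rhel7@sha256:<digest>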

fatherlinux commented on June 29, 2024

Interesting perspective. I don't completely disagree. I like the clean choice in the FROM line.

That said, again, the ops team in a business should be controlling the package sets, not the devs. Ops should have a strategy for not breaking builds (which maybe is the control point for devs, e.g. tagging builds), like:

  1. Capturing when things break and building tests as part of the resolution.
  2. Using something like RHEL that provides good stability (API/ABI)
  3. Good support lifecycle, etc.

That said, things will break in the developer world if they run a yum update, but they will also break when the vendor/ops team does a base image update; who runs the yum update is irrelevant. As a developer I would rather:

  1. Make that choice myself
  2. Have my ops team make that choice and test (Satellite, Dockerfiles with yum updates, and tests)
  3. Not have the upstream image vendor force me into the latest update without understanding what it might break.

If there is an ops team managing the builds internally, I think I can live with "no developer yum updates" being a recommendation. I think this hits the 80/20 rule.

Restated simply:

  • ops = yum update
  • devs = FROM

This does have some downsides. Ops/security is not going to be happy if devs refuse to pick up a new base image/tag that has critical security updates in it....
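
To sketch that split -- the image names, tags, and package below are hypothetical:

    # Ops-owned Dockerfile: update against the controlled package set,
    # test it, then tag and push the result as the internal base image.
    FROM registry.example.com/rhel7-base
    RUN yum -y update && yum clean all
    # pushed as registry.example.com/rhel7-ourcorebuild:2016-02

    # Dev-owned Dockerfile: consume the ops-tested tag via FROM, no yum update.
    FROM registry.example.com/rhel7-ourcorebuild:2016-02
    RUN yum -y install my-app && yum clean all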

langdon commented on June 29, 2024

To your closing point, this is the crux of the interest by devs in containers: "SAs/Ops can't break my application by applying an unapproved update." Containers let me, as a developer, actually test the changes that an updated library/RPM makes to my application. And, particularly with advanced container tooling, they allow the developer to re-build, test, and ship the new, tested app to the ops folks for deployment (or directly, depending on the infra) in a reasonable (per ops) timeframe.

Fundamentally, this conversation is about why developers want to vendor-ize all the things they depend on. Developers don't care (usually) about the OS firewall getting a patch, or ssh getting a patch, but stay away from patching anything in the developer's direct stack without testing the patch.

fatherlinux commented on June 29, 2024

http://rhelblog.redhat.com/2016/02/24/container-tidbits-can-good-supply-chain-hygiene-mitigate-base-image-sizes/
