Section on Layers #6

Open
sstevens2 opened this issue May 31, 2019 · 7 comments
Assignees
Labels
peer review:editorial comment Editorial comments to be addressed from the peer review type:discussion Discussion or feedback about the lesson type:enhancement Propose enhancement to the lesson

Comments

@sstevens2
Contributor

One of my learners from a couple of weeks ago reported back that he made an image for his work recently! 😄

He mentioned that he didn't really know much about layers until he was constructing this image. While I mentioned layers offhandedly during the workshop, he suggested we add a 15-minute section about them. I think that would be a good idea too. Could we add a conceptual explanation and also build up a Dockerfile one layer at a time, showing that each previous layer already exists and doesn't get rebuilt (unless you change something above it)?

@agitter

agitter commented May 31, 2019

👋 that was me. The workshop was great. I've shared the materials with others in my research group who couldn't attend.

I recall that the another-greeting example here intentionally fails at first. When it builds successfully, we see the cache being used:

Sending build context to Docker daemon  3.072kB
Step 1/4 : FROM python:3-slim
 ---> ca7f9e245002
Step 2/4 : WORKDIR /usr/src/app
 ---> Using cache
 ---> c0d009871ab7
Step 3/4 : COPY test.py .
 ---> 23b27e9f57a9
Step 4/4 : CMD [ "python", "./test.py" ]
 ---> Running in 0d48c16b40c1
Removing intermediate container 0d48c16b40c1
 ---> bede6575d987
Successfully built bede6575d987
Successfully tagged another-greeting:latest
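
The Step lines in this output imply a Dockerfile along the following lines. This is reconstructed from the log above rather than copied from the lesson, so treat it as a sketch; the comments note which steps reused the cache in this particular run.

```dockerfile
# Reconstructed from the "Step N/4" lines in the build output above.

# Step 1/4: base image, already available locally (only its ID is printed).
FROM python:3-slim

# Step 2/4: reported "Using cache", so this layer was reused from an earlier build.
WORKDIR /usr/src/app

# Step 3/4: no "Using cache" line, so this layer was built in this run.
# test.py must be present in the build context.
COPY test.py .

# Step 4/4: also no "Using cache" line; it follows a rebuilt step, so it was rebuilt too.
CMD [ "python", "./test.py" ]
```

Newer Docker versions that build with BuildKit print a different log format, where reused steps should appear marked as CACHED rather than "Using cache".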

At the time, I wasn't paying attention to caching. This container was so simple that I also didn't notice the cache helping much. The best practices on build stages were very informative.

A simple lesson to highlight this could examine what happens in two different images when the layers are ordered in the recommended way versus the inverse order.

Recommended:

RUN install stable software
RUN other stuff
COPY my-file.txt .

Inverse:

COPY my-file.txt .
RUN other stuff
RUN install stable software

If we build, modify the file, and build again, would this exhibit different caching behaviors?
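
A minimal sketch of the two variants, to make the comparison concrete. The base image is taken from the build output earlier in this comment; `pip install numpy` and `mkdir -p /data` are placeholder stand-ins (assumptions, not part of the lesson) for "install stable software" and "other stuff", the file names `Dockerfile.recommended` and `Dockerfile.inverse` are hypothetical, and `my-file.txt` is assumed to sit in the same directory.

```dockerfile
# Dockerfile.recommended (hypothetical file name)
FROM python:3-slim

# Placeholder for "install stable software": slow and rarely changes.
RUN pip install --no-cache-dir numpy

# Placeholder for "other stuff".
RUN mkdir -p /data

# Changes often, so it goes last.
COPY my-file.txt .
```

```dockerfile
# Dockerfile.inverse (hypothetical file name)
FROM python:3-slim

# The frequently edited file is copied first...
COPY my-file.txt .

# ...so every instruction below it depends on that file's layer.
RUN mkdir -p /data
RUN pip install --no-cache-dir numpy
```

Building each one (for example with `docker build -f Dockerfile.recommended -t order-demo .`), editing `my-file.txt`, and building again should show the difference. In the recommended order, both RUN steps are reused from the cache and only the COPY step is redone. In the inverse order, the changed file invalidates the COPY step and every step after it, so both RUN steps run again.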

@dme26
Collaborator

dme26 commented Jun 4, 2019

Thanks for your layers feedback overall @agitter ! More specifically, thanks for your suggested simple lesson on recommended and inverse layer order—that looks ideal.

Further is the possibility to demonstrate the value of merging consecutive RUN lines in terms of reducing the number of layers.

I'll definitely try to add this with credit before I use it in (non-Carpentry) teaching within a few weeks, if this doesn't emerge otherwise before that time.

@agitter

agitter commented Jun 4, 2019

Further is the possibility to demonstrate the value of merging consecutive RUN lines in terms of reducing the number of layers.

That's a good idea. That structure confused me when I inspected Dockerfiles before learning how they worked.

@dme26 dme26 self-assigned this Jun 24, 2019
@dme26 dme26 transferred this issue from dme26/docker-introduction Sep 15, 2019
@ErinBecker ErinBecker transferred this issue from another repository Sep 17, 2019
@ChristinaLK ChristinaLK added type:enhancement Propose enhancement to the lesson type:discussion Discussion or feedback about the lesson labels Jun 2, 2021
@chendaniely

chendaniely commented Jun 2, 2021

This can probably be done in the lesson as a callout that essentially says each instruction in the Dockerfile is cached, so the order of instructions affects build speed and how much has to be rebuilt. In general, system package installations are done towards the top and take the most setup time, and the user-specific code should be put towards the bottom so that changes to the user-supplied codebase do not trigger an entire new image build.

The "recommended" and "inverse" example (#6 (comment)) can be added to the callout to show this.
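
A sketch of what such a callout might show, under some assumptions: `git` stands in for any system package, `requirements.txt` is a hypothetical dependency list, and `test.py` is borrowed from the build output earlier in this thread.

```dockerfile
FROM python:3-slim

# System packages: slow to install and rarely changed, so they go near the top.
# "git" is only a placeholder package.
RUN apt-get update && \
    apt-get install -y --no-install-recommends git && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /usr/src/app

# The dependency list (hypothetical requirements.txt) changes less often than
# the code, so it is copied and installed before the code itself.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# User code changes most often, so it goes near the bottom. Editing it only
# invalidates the cache from this instruction onwards.
COPY . .

CMD [ "python", "./test.py" ]
```

With this ordering, a change to the code alone should leave the package installation steps cached, while a change to `requirements.txt` reruns the `pip install` step and everything after it.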

Having said all this, you'll see a lot of Dockerfiles do almost everything in a single layer.

@vbagadia

  1. In terms of optimization, multi-stage builds should be considered for more complex applications, e.g. builds that reach out to some binary, or any situation where order must be a priority (as mentioned and elaborated upon above). I wanted to point out this resource, which walks through multi-stage builds; its example 1 highlights performance and provides a clear example image to explain layers.

  2. Further is the possibility to demonstrate the value of merging consecutive RUN lines in terms of reducing the number of layers.

There are a few ways to do this, and doing so in the 'Creating More Complex Images' section would be appropriate.
Common syntax includes either:

RUN wget xyz && tar xyz

OR

RUN wget xyz && \
    tar xyz
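
As a slightly fuller sketch of the same idea: the URL and archive name below are hypothetical, and `debian:bookworm-slim` is only an assumed base image. Putting the download, unpack, and clean-up in one RUN instruction means the archive never ends up stored in a layer of its own.

```dockerfile
FROM debian:bookworm-slim

# Install wget (assumed not to be in the base image) and remove the apt
# package lists in the same layer.
RUN apt-get update && \
    apt-get install -y --no-install-recommends wget ca-certificates && \
    rm -rf /var/lib/apt/lists/*

# One RUN instruction, one layer: download, unpack, delete.
# https://example.org/tool.tar.gz is a placeholder URL.
RUN wget -q -O /tmp/tool.tar.gz https://example.org/tool.tar.gz && \
    tar -xzf /tmp/tool.tar.gz -C /opt && \
    rm /tmp/tool.tar.gz
```

If the three commands were written as separate RUN lines, the final `rm` would add a layer on top, but the archive would still take up space in the earlier layer. Running `docker history` on the built image lists the layers and their sizes, which is one way to compare the two versions.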

@jcohen02
Contributor

Thanks for highlighting the post with details about multi-stage builds @vbagadia. I think this provides a really nice overview of the different options for producing more compact images while also being clear about both the positive and negative aspects of the different approaches covered.

At the same time, I think the multi-stage build content is beyond the scope of this lesson (indeed, I don't think that you were suggesting that we include this anyway?). However, showing how to combine commands so that only a single layer is generated, along the lines of the examples you provide above, is a good thing to point out.

Even if we don't go into great detail, given the introductory nature of the lesson, I definitely think adding some further content to explain more about the cache and best practices for structuring Dockerfiles would be useful. As @chendaniely suggests, a callout could be a good option for this.

@aturner-epcc aturner-epcc added the peer review:editorial comment Editorial comments to be addressed from the peer review label Jul 29, 2024
@aturner-epcc
Contributor

Also mentioned in lesson peer review

Do we need to include this before we go back to reviewers?

@jcohen02 jcohen02 self-assigned this Jul 31, 2024