Smaller Java image using Docker multi-stage build

Two of the announcements at DockerCon 2017 directly relevant to Java developers are:

Docker Multi-stage build
Oracle JRE in Docker Store

This blog explains the purpose of Docker multi-stage builds and provides examples of how they help generate smaller and more efficient Java Docker images.

Just show me the code: github.com/arun-gupta/docker-java-multistage.

What is the issue?

Building a Docker image for a Java application typically involves building the application and packaging the generated artifact into an image. A Java developer would likely use Maven or Gradle to build a JAR or WAR file. If you use the Maven base image to build the application, it downloads the required dependencies from the configured repositories and keeps them in the image. The number of JARs in the local repository can be significant, depending on the number of dependencies in the pom.xml. This can leave a lot of cruft in the image.

Let’s take a look at a sample Dockerfile:

# Build and runtime share one image, so Maven and the JDK stay in the result
FROM maven:3.5-jdk-8

# Copy the sources and build the WAR
COPY src /usr/src/myapp/src
COPY pom.xml /usr/src/myapp
RUN mvn -f /usr/src/myapp/pom.xml clean package

ENV WILDFLY_VERSION=10.1.0.Final
ENV WILDFLY_HOME=/usr

# Download and unpack WildFly by hand
RUN cd $WILDFLY_HOME && curl http://download.jboss.org/wildfly/$WILDFLY_VERSION/wildfly-$WILDFLY_VERSION.tar.gz | tar zx && mv $WILDFLY_HOME/wildfly-$WILDFLY_VERSION $WILDFLY_HOME/wildfly

# Deploy the WAR built above
RUN cp /usr/src/myapp/target/people-1.0-SNAPSHOT.war $WILDFLY_HOME/wildfly/standalone/deployments/people.war

EXPOSE 8080

# Start WildFly, bound to all network interfaces
CMD ["/usr/wildfly/bin/standalone.sh", "-b", "0.0.0.0"]

In this Dockerfile:

maven:3.5-jdk-8 is used as the base image
Application source code is copied to the image
Maven is used to build the application artifact
WildFly is downloaded and installed
Generated artifact is copied to the deployments directory of WildFly
Finally, WildFly is started

There are several issues with this kind of flow:

Using maven as the base image restricts what functionality is available in the image. This requires WildFly to be downloaded and configured explicitly.
Building the artifact downloads all Maven dependencies. These stay in the image but are not needed at runtime, which bloats the runtime image unnecessarily.
A change in the WildFly version requires updating the Dockerfile. This would’ve been much easier if we could use the jboss/wildfly base image by itself.
In addition, unit tests may run before packaging the artifact and integration tests after the image is created. The test dependencies and results, again, do not need to live in the production image.

There are other ways to build the Docker image. For example, the Dockerfile can be split into two files. The first file builds the artifact and copies it to a common location using volume mapping. The second file picks up the generated artifact and uses the lean base image. This approach also has issues: multiple Dockerfiles need to be maintained separately, and there is an out-of-band hand-off between the two Dockerfiles.

Let’s see how these issues are resolved with multi-stage build.

What are Docker multi-stage builds?

Multi-stage build allows multiple FROM statements in a Dockerfile. The instructions following each FROM statement, up to the next one, create an intermediate image. The final FROM statement defines the final base image. Artifacts from intermediate stages can be copied using COPY --from=<image-number>, where numbering starts from 0 for the first base image. Artifacts that are not copied over are discarded from the final image. This keeps the final image lean, including only the relevant artifacts.
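
As a minimal sketch of the numeric form, the first stage can be left unnamed and referenced by its index. The images, paths and WAR name are the ones used in the samples in this post; nothing else is assumed.

```dockerfile
# Stage 0: unnamed build stage, referenced below by its index
FROM maven:3.5-jdk-8
COPY src /usr/src/myapp/src
COPY pom.xml /usr/src/myapp
RUN mvn -f /usr/src/myapp/pom.xml clean package

# Final stage: only the WAR is copied out of stage 0
FROM jboss/wildfly:10.1.0.Final
COPY --from=0 /usr/src/myapp/target/people-1.0-SNAPSHOT.war /opt/jboss/wildfly/standalone/deployments/people.war
```

The index depends on the order of the FROM statements, so inserting a new stage ahead of this one would change what --from=0 refers to.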

The FROM syntax is updated to specify a stage name using as <stage-name>. For example:

FROM maven:3.5-jdk-8 as BUILD

This allows the stage name to be used instead of the number with the --from option.

Let’s take a look at a sample Dockerfile:

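# Stage 1: build the WAR with Maven
FROM maven:3.5-jdk-8 as BUILD

COPY src /usr/src/myapp/src
COPY pom.xml /usr/src/myapp
RUN mvn -f /usr/src/myapp/pom.xml clean package

# Stage 2: runtime image based on the standard WildFly image
FROM jboss/wildfly:10.1.0.Final

# Copy only the WAR from the BUILD stage into the deployments directory
COPY --from=BUILD /usr/src/myapp/target/people-1.0-SNAPSHOT.war /opt/jboss/wildfly/standalone/deployments/people.war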

In this Dockerfile:

There are two FROM instructions, which makes this a two-stage build.
maven:3.5-jdk-8 is the base image for the first stage. It is used to build the WAR file for the application. The first stage is named BUILD.
jboss/wildfly:10.1.0.Final is the second and final base image for the build. The WAR file generated in the first stage is copied over to this stage using the COPY --from syntax. The file is copied directly into the WildFly deployments directory.

Let’s take a look at some of the advantages of this approach.

Advantages of Docker multi-stage build

One Dockerfile defines the entire build process. There is no need to have separate Dockerfiles and then coordinate the transfer of the artifact between a “build” Dockerfile and a “run” Dockerfile using volume mapping.
The base image for the final image can be chosen to meet the runtime needs. This helps reduce the overall size of the runtime image. Additionally, the cruft from build time is discarded along with the intermediate stage.
The standard WildFly base image is used instead of downloading and configuring the distribution manually. This makes it a lot easier to update the image if a newer tag is released.
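
One way to make such a tag update a build-time option rather than a file edit is to parameterize the tag with a build argument. This is only a sketch and not part of the original sample: the argument name WILDFLY_TAG is hypothetical, and it assumes a Docker version that accepts ARG before FROM (17.05 or later, the release that also introduced multi-stage builds).

```dockerfile
# Hypothetical build argument; declared before the first FROM,
# it is visible only to the FROM lines, not inside the stages
ARG WILDFLY_TAG=10.1.0.Final

FROM maven:3.5-jdk-8 as BUILD
COPY src /usr/src/myapp/src
COPY pom.xml /usr/src/myapp
RUN mvn -f /usr/src/myapp/pom.xml clean package

# The runtime base image tag comes from the build argument
FROM jboss/wildfly:${WILDFLY_TAG}
COPY --from=BUILD /usr/src/myapp/target/people-1.0-SNAPSHOT.war /opt/jboss/wildfly/standalone/deployments/people.war
```

A different tag could then be selected with docker build --build-arg WILDFLY_TAG=<tag> . while the default keeps the behavior of the sample above.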

The size of the image built using a single Dockerfile is 816MB. In contrast, the size of the image built using multi-stage build is 584MB.

So, using a multi-stage build helps create a much smaller image, about 28% smaller in this case.

Is this a typical way of building a Docker image? Are there other ways by which the image size can be reduced?

Sure, you can use docker-maven-plugin, as shown at github.com/arun-gupta/docker-java-sample, to build and test the image locally and then push it to a repository. But a multi-stage build allows you to generate and package the artifact without any other dependency, including Java.

Sure, the maven:jdk-8-alpine image can be used to create a smaller image. But then you’ll also have to create or find a WildFly image built using jdk-8-alpine, or something similar. And the cruft, such as the Maven repository, two Dockerfiles, and sharing of the artifact using volume mapping or some similar technique, would still be there.

There are other ways to craft your build cycle. But if you are using a Dockerfile to build your artifact, then you should seriously consider multi-stage builds.

19 thoughts on “Creating Smaller Java Image using Docker Multi-stage Build”

Slim says:

June 23, 2017 at 6:57 am

When doing the single-stage build, the dependencies downloaded by Maven will not be kept in the image, because the .m2 folder is declared as a VOLUME in the maven image Dockerfile. Your point is still valid, the image is larger, but not because of the Maven downloads.

In both the single-stage and multi-stage examples here, mvn clean package is called at image build time, thus offering no chance of mounting an external .m2 folder so that downloading dependencies can be skipped. What I’m saying is that these approaches have the drawback of a longer build time. Using the maven image for the build, but mounting the code and the .m2 folder and running “mvn clean package” in a container started from this image, allows for caching all the Maven downloads.
PRASENJIT says:

August 23, 2017 at 8:30 pm

How can we download dependencies from a private nexus repository?
KYLE QUEST says:

February 22, 2018 at 10:16 pm

DockerSlim ( http://dockersl.im ) is another option if you want to create the smallest image possible. Haven’t tried it with jboss/wildfly though. Give it a try, and if it doesn’t work do you mind opening an issue?

Miles to go 4.0 …

Arun Gupta is a technology enthusiast, avid runner, author of a best-selling book, globe trotter, a community guy, Java Champion, JavaOne Rockstar, JUG Leader, Minecraft Modder, NetBeans Dream Teamer, Devoxx4Kids-er, Docker Captain and works at AWS.
