Reducing Docker image size using multi-stage builds

Prologue #

In some cases, building a Docker image produces a very large image (sometimes several gigabytes), which can lead to:

  • bad container performance
  • more security vulnerabilities
  • difficult distribution and deployment of containers
  • slower image builds

In this guide, we will build an Nginx image from the Nginx source code. We will compare the size of the image produced by a single, standard build process with the size of the image produced by a multi-stage build.

All the steps for compiling Nginx from source can be found in the guide on the official Nginx website.

We will use Oracle Linux 7 as the base image, a well-established, RHEL-compatible enterprise distribution. You can use any base image you want, for example CentOS.

Standard Nginx image build #

FROM oraclelinux:7-slim 

LABEL description="Nginx Image build compiled from source code, using standard build process" 

ARG NGINX_VERSION=1.19.9 

RUN yum groupinstall -y --setopt=tsflags=nodocs "Development Tools" \
    && yum install -y tar gzip git gcc-c++ wget zlib-devel openssl-devel pcre-devel \
    && yum clean all \
    && rm -fr /var/cache/yum \
    && mkdir -p /etc/nginx/binaries \
    && mkdir /etc/nginx/modules \
    && mkdir /etc/nginx/conf.d \
    # Download and unpack the Nginx source
    && cd /etc/nginx/ \
    && wget http://nginx.org/download/nginx-${NGINX_VERSION}.tar.gz \
    && tar -zxvf nginx-${NGINX_VERSION}.tar.gz \
    && rm nginx-${NGINX_VERSION}.tar.gz \
    && mv nginx-${NGINX_VERSION}/* binaries/ \
    && rm -rf nginx-${NGINX_VERSION}/ \
    # Configure, compile and install Nginx
    && cd binaries/ \
    && ./configure --prefix=/etc/nginx \
        --sbin-path=/usr/local/nginx/nginx \
        --modules-path=/usr/local/nginx/modules \
        --conf-path=/etc/nginx/nginx.conf \
        --error-log-path=/var/log/nginx/error.log \
        --pid-path=/var/run/nginx.pid \
        --lock-path=/var/run/nginx.lock \
        --with-http_ssl_module \
        --without-http_fastcgi_module \
        --without-http_uwsgi_module \
        --without-http_grpc_module \
        --without-http_scgi_module \
        --without-mail_imap_module \
        --without-mail_pop3_module \
    && make && make install \
    && rm -rf /etc/nginx/binaries/ \
    # Remove packages that are no longer needed
    && yum remove -y gcc-c++ zlib-devel openssl-devel pcre-devel git \
    && yum clean all

ENV PATH=/usr/local/nginx:${PATH}

CMD nginx -g 'daemon off;'

Building the image from a custom-named file (nginx_build_from_source instead of the default Dockerfile):

docker build -f nginx_build_from_source -t nginx:1.19.9 .

Checking the size of the image:

docker images
REPOSITORY         TAG       IMAGE ID       CREATED          SIZE
nginx             1.19.9     fda4b3bbf36c   22 seconds ago   1.22GB

As we can see, the resulting image is very large: more than 1.2 gigabytes.

Why is the image so large? #

Building Nginx requires a number of packages that the build depends on (a C++ compiler, the Linux development tools, PCRE, OpenSSL, etc.), which we install while building the image. During compilation, numerous archives are generated along with the nginx executable binary. Don’t forget the Nginx source code that we downloaded in order to compile it and build the binary.

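Anything from this list that is not explicitly removed stays packed in our Nginx image, along with the Nginx binary. In the Dockerfile above, the final yum remove uninstalls only a handful of packages, so the rest of the "Development Tools" group remains in the image.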
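
One way to see where the space goes is to list the layers of the image together with their sizes. The sketch below assumes the image was tagged nginx:1.19.9 as in the build command above; layer_sizes is a hypothetical helper name, not something Docker provides. Since almost all of the work happens in a single RUN instruction, we would expect that one layer to account for most of the size.

```shell
# Hypothetical helper: list every layer of an image with its size and the
# Dockerfile instruction that created it.
layer_sizes() {
    docker history --no-trunc --format '{{.Size}}: {{.CreatedBy}}' "$1"
}

# Example call (requires the image built earlier):
#   layer_sizes nginx:1.19.9
```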

We could of course remove all these packages, but it is difficult to track exactly what was installed and generated at compile time. Wouldn’t it be better to just take the Nginx binary and move it to a fresh base image?

This is where multi-stage builds come into play.


Multi-stage Nginx image build #

All we really need to run Nginx in a container is the Nginx binary. Since the libraries and archives are only used in different tasks of the Nginx compilation, we can “break” the Nginx build into multiple stages.

So we can have one stage where we download the Nginx source and all the system dependencies and compile Nginx, and a second stage where we just copy the Nginx binary to a fresh base image.

FROM oraclelinux:7-slim  AS build_nginx_stage

LABEL description="Nginx Image build compiled from source code, using multi-stage build process" 
 
ARG NGINX_VERSION=1.19.9 
 
RUN yum groupinstall -y --setopt=tsflags=nodocs "Development Tools" \
    && yum install -y tar gzip git gcc-c++ wget zlib-devel openssl-devel pcre-devel \
    && yum clean all \
    && rm -fr /var/cache/yum \
    && mkdir -p /etc/nginx/binaries \
    && mkdir /etc/nginx/modules \
    && mkdir /etc/nginx/conf.d \
    # Download and unpack the Nginx source
    && cd /etc/nginx/ \
    && wget http://nginx.org/download/nginx-${NGINX_VERSION}.tar.gz \
    && tar -zxvf nginx-${NGINX_VERSION}.tar.gz \
    && rm nginx-${NGINX_VERSION}.tar.gz \
    && mv nginx-${NGINX_VERSION}/* binaries/ \
    && rm -rf nginx-${NGINX_VERSION}/ \
    # Configure, compile and install Nginx
    && cd binaries/ \
    && ./configure --prefix=/etc/nginx \
        --sbin-path=/usr/local/nginx/nginx \
        --modules-path=/usr/local/nginx/modules \
        --conf-path=/etc/nginx/nginx.conf \
        --error-log-path=/var/log/nginx/error.log \
        --pid-path=/var/run/nginx.pid \
        --lock-path=/var/run/nginx.lock \
        --with-http_ssl_module \
        --without-http_fastcgi_module \
        --without-http_uwsgi_module \
        --without-http_grpc_module \
        --without-http_scgi_module \
        --without-mail_imap_module \
        --without-mail_pop3_module \
    && make && make install

FROM oraclelinux:7-slim

COPY --from=build_nginx_stage  /etc/nginx/ /etc/nginx
COPY --from=build_nginx_stage /usr/local/nginx/nginx /usr/local/nginx/nginx
RUN mkdir -p /var/log/nginx/

ENV PATH=/usr/local/nginx:${PATH}

CMD nginx -g 'daemon off;'

Explanation #

We name a stage with the keyword “AS” on the first line of the Dockerfile, where we specify the base image. This first stage is the build stage, where Nginx is downloaded and compiled. We named it build_nginx_stage.

Then, once the build stage is defined, we specify another base image with a second FROM. This starts a new stage from a fresh Oracle Linux 7 slim image, with nothing installed beyond the basic user-space libraries the system needs to function properly.

In this second stage (which we did not name, since no later stage needs to refer to it), we copy ONLY the few things Nginx needs to run in that fresh image. We use COPY with the --from argument to specify the stage we copy files from. In our case, we only copy the Nginx configuration directory and the Nginx binary, and create the log directory.
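
To look at the two stages separately, a build can be stopped at a named stage with the --target flag of docker build. The sketch below is only an illustration: build_image is a hypothetical helper, and the file name nginx_multistage and the nginx-build tag are assumptions, not names used elsewhere in this guide.

```shell
# Hypothetical helper: build the Dockerfile in file $1 and tag the result $2.
# With a third argument, the build stops at that named stage.
build_image() {
    if [ -n "${3:-}" ]; then
        docker build -f "$1" --target "$3" -t "$2" .
    else
        docker build -f "$1" -t "$2" .
    fi
}

# Example calls (file name and nginx-build tag are assumptions):
#   build_image nginx_multistage nginx:1.19.9
#   build_image nginx_multistage nginx-build:1.19.9 build_nginx_stage
```

Building the build stage on its own and listing both images with docker images makes it easy to compare the build stage with the final image.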

Results #

If we now check the size of the image we built using the multi-stage build feature, we can see a significant drop, from 1.22GB to 170MB.

$ docker images
REPOSITORY    TAG         IMAGE ID       CREATED          SIZE
nginx         1.19.9      af20a24b717a   46 minutes ago   170MB
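
To put the drop in perspective, the reduction can be worked out as a percentage. This is a small sketch with a hypothetical helper, reduction_pct; it takes 1.22GB as 1220MB, which assumes the sizes are reported in decimal units.

```shell
# Hypothetical helper: percentage by which size $2 is smaller than size $1.
# Both sizes must be given in the same unit.
reduction_pct() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

# 1.22GB taken as 1220MB, compared with 170MB.
reduction_pct 1220 170   # prints 86.1
```

In other words, the multi-stage image is roughly 86% smaller than the single-stage one.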