Monday, 8 July 2019

Reducing your Node Application Docker Image Size

Recently I have been running into memory/space issues quite often on a server that hosts Nexus (a repository manager with near-universal support for all formats). On digging into the issue, the prima facie evidence was that the Docker image size of our node applications was alarmingly high (~2.5 GB).

REPOSITORY      TAG   IMAGE ID       CREATED      SIZE
app-static/ts   1     248f0e845f53   3 week ago   2.47GB
node            11    4051f768340f   3 week ago   904MB


We were certainly aware that "while architecting Docker applications, keeping your images as lightweight as possible has a lot of practical benefits. It makes things faster, more portable, and less prone to breaks. They’re less likely to present complex problems that are hard to troubleshoot, and it takes less time to share them between builds." But we had missed this point when it came to our micro-services running on node.

We wanted to dig further to understand which areas contributed the most to the bloated size of our Docker image. Our first step was to check the size of the folders inside the image. We found the following:

local        -  631 MB
application  -  704 MB  (contains node modules)
lib          -  481 MB
share        -  241 MB
...
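
The post does not say how these numbers were collected. One way to get a similar breakdown, as a minimal sketch: run `du` inside a throwaway container. The helper name `image_dir_sizes` is hypothetical, and the default directory `/usr` is an assumption based on the folder names above and the application living under /usr/application.

```shell
# Hypothetical helper: print the ten largest entries (in MB) under a
# directory inside an image. Assumes the image ships sh, du, sort and head.
image_dir_sizes() {
  image="$1"        # image reference to inspect
  dir="${2:-/usr}"  # assumed default; pass another directory if needed
  docker run --rm --entrypoint sh "$image" -c \
    "du -sm $dir/* 2>/dev/null | sort -rn | head -n 10"
}
```

Calling `image_dir_sizes` with your own image reference lists the heaviest folders first, which makes it easy to see where to look next.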
Initially we never suspected the size of the node modules, as one of our primary developers (like the rest of the node fraternity out there) felt that this size is quite normal for node modules.


Now we had fewer options left, and we wanted to focus on why we ended up with bloated images. This time I concentrated fully on Docker and built the image locally. First and foremost I wanted to analyze the image layers, as the base layer alone was just around 900MB. With the "docker history" command I could get a preview of the layers that are created while building the application.
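
For reference, a layer listing of this kind can be requested roughly as follows. This is only a sketch: the helper name `show_layers` is made up for illustration, and the image reference is whatever you tagged your local build with.

```shell
# Hypothetical helper: list the layers of an image with the size of each
# layer and the Dockerfile instruction that created it.
show_layers() {
  docker history --no-trunc --format 'table {{.Size}}\t{{.CreatedBy}}' "$1"
}
```

Each row corresponds to one layer, so large COPY and RUN rows stand out immediately.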


On seeing the history of layers for the first time (8 layers had been added on top of the base node image), I opened the application's Dockerfile and found that we had duplicated a few lines (no one to blame, as they had been sitting there for a while). I fixed the RUN section of my Dockerfile as follows to eliminate the additional layers created while building the image:

RUN npm install yarn -g && yarn install

Though the number of layers went down, the result was not as positive as expected. Once again I stuck with docker history for the analysis, and once again it gave me some clues.

  • COPY still has a significant impact (~500MB)
  • RUN is the major contributor (~800MB)

COPY - ~500MB? Our application code in source control was nowhere near a few hundred MB, so my first thought was: what's wrong here? The answer was that I had a copy of the node modules on the host, and it was being copied to the container along with the source code.
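
A quick way to check how much of a `COPY .` comes from the host's node modules is to compare the two sizes before building. The helper below is a sketch with a made-up name (`copy_footprint`); it only measures the directory on disk and does not account for anything excluded from the build context.

```shell
# Hypothetical helper: print the size (in KB) of a project directory and
# of its node_modules folder, i.e. roughly what `COPY .` would pick up.
copy_footprint() {
  dir="${1:-.}"
  total=$(du -sk "$dir" | cut -f1)
  modules=0
  if [ -d "$dir/node_modules" ]; then
    modules=$(du -sk "$dir/node_modules" | cut -f1)
  fi
  echo "total=${total}KB node_modules=${modules}KB"
}
```

Running `copy_footprint .` from the project root shows at a glance whether node_modules dominates what gets copied.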

RUN - ~800MB? Now that I had found that the node modules on the host take just ~400MB, why would the Docker layer need 800MB? From the senior developer I understood that we're handling node modules as source rather than as distributions.

Ignoring either one should help reduce the size, and circumventing the second would give the best deal among the options we had, but it would have its own side effects. To fix all the issues we did the following:
  • run `yarn install` on the host
  • copy the source & node modules to the image
  • rebuild node modules to avoid a target environment mismatch (in my case the host was OSX and the node base image is a Linux flavor; thanks to the senior developer who foresaw this issue and cautioned me)

Finally, my Dockerfile looked as follows:

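FROM node:11
WORKDIR /usr/application
# Copies the source together with the node modules installed on the host
COPY . /usr/application
# Rebuild node-sass's native binding for the Linux image
RUN npm rebuild node-sass
# Report the container as unhealthy when the app stops answering on port 3000
HEALTHCHECK --timeout=1s --interval=1s --retries=3 \
  CMD curl -s --fail http://localhost:3000/ || exit 1
CMD ["yarn", "deploy"]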
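
With a Dockerfile like the one above, the host-side part of the steps could be scripted roughly as below. This is a sketch under assumptions: `build_image` is a hypothetical name, the script is expected to run from the project root, and the tag is whatever you choose.

```shell
# Hypothetical helper: install dependencies on the host, then build the image.
# Copying the modules and rebuilding node-sass happen inside the Dockerfile.
build_image() {
  tag="$1"
  yarn install &&             # step 1: resolve node modules on the host
    docker build -t "$tag" .  # steps 2 and 3: COPY and npm rebuild in the image
}
```

The `&&` makes sure the image is only built when the host install succeeded.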
Now my Docker image is noticeably smaller (~1.4 GB) than what I had when I started on this problem.