Image Optimization for the Web

There are plenty of articles and projects on the Internet about resizing images. Why would anyone need one more? In this article I will explain why the existing solutions did not satisfy us and why we had to build our own.

Problem


First, let's look at why we resize pictures at all. As a web service, we want pages to load as fast as possible for the user. Users like that, and it increases conversion. If the user is on a slow or mobile connection, it is extremely important that pages are lightweight and do not waste the user's traffic or CPU time. Resizing images is one of the things that helps with this.

We solve two problems. The first is that images are often not scaled down to the resolution at which they are displayed, so the client not only downloads data it does not need but also spends CPU time resizing pictures in the browser. Solution: serve pictures at the resolution at which the browser will display them.

The second problem is that images are usually not compressed well enough: they can be encoded more efficiently, which speeds up page loading without a perceptible loss of image quality. Solution: optimize images before returning them to the client.

For an example of how not to do it, you need look no further than the main page of a site as well known as github.com. Of its 2 MB page weight, 1.2 MB is images that could be optimized or not downloaded at all.


The second example is our own Habr. I will not include a screenshot, so as not to lengthen the article; the results are available at the link. Habr resizes pictures to the required resolution but does not optimize them. Optimizing them would reduce their size by 650 KB (50%).

In many places on the site we need smaller versions of pictures, for example to show a reduced version of a news picture in the news feed. We implement this as follows: only the maximum-quality picture is stored on our server, and when a resized version is needed, we append the required resolution to the end of the URL after an "@". The request then goes not to the file but to our resizing backend, which returns a resized and optimized version of the picture.

Common solutions


Everything below applies to JPEG and PNG images, since these are the most popular formats on the Internet.

If you type something like “image resize backend” into Google, about half of the results suggest using Nginx, and the rest are various home-grown services, most often written in Node.js.

With nginx, or more precisely with libgd, which the nginx module uses, we managed to squeeze out 63 RPS on a test picture. That is not bad, but we wanted more speed and more flexibility. GraphicsMagick is not suitable either, because it is too slow. In addition, both of these solutions produce non-optimized images. Most other solutions, for example those on Node, suggest using Sharp for resizing, MozJPEG for optimizing JPEG images, and pngquant for optimizing PNG.

For a long time we used our own home-grown combination of Node, Libvips and MozJPEG with pngquant, but one day we asked ourselves: “Can resizing be made faster and less demanding on resources?”

Spoiler: it is possible. ;)

Now we needed to find out how to speed up our application. After examining the code, we found that imagemin, which we used for optimization, and in particular its MozJPEG and pngquant plugins, run the console utilities of the same name as external processes. We decided to cut this out and use only bindings to C libraries. For resizing we used the Sharp module, which is a binding to the Libvips C library.

Our implementation


Some googling showed that Libvips is still the leader in speed and that only OpenCV can compete with it. So we use Libvips in our implementation: it is a proven solution and has ready-made bindings for Go. It was time to write a prototype and see what happens.

A few words about why Go was chosen for this task. First, it is fast enough, and remember that we want a fast resizer. Second, Go code is easy to read and maintain. The last requirement was the ability to work with C libraries, which is exactly what we need here.

We quickly wrote a prototype, tested it, and realized that although it exposes more tuning knobs than Sharp, Libvips still produces non-optimized images. Something had to be done about this. We turned to the almighty Google again and found that the best option is still MozJPEG. Here doubts began to creep in that we were about to write the same thing we had on Node, only in Go. But after carefully reading the MozJPEG description, we learned that it is a fork of libjpeg-turbo and is compatible with it.

This looked very promising. All that remained was to build our own version of Libvips in which libjpeg-turbo is replaced by the version from Mozilla. For the build we chose Alpine Linux, because we planned to ship the application with Docker anyway, and Alpine has a very nice package build format, very similar to the one used in Arch Linux.
Optimizing the image reduced its size by about 4 times without visible loss of quality.

                Resolution   Size
Original JPEG   351x527      79 KB
Optimized       351x527      17 KB

[Images: greece_origin, greece_optimized]

We built it and tested it. Now Libvips produces an optimized image right at resize time. In the Node version we first resized the picture and then ran it through a decoder and encoder once more; now we only do the resize.

With JPEG sorted out, what about PNG? For this we found the libimagequant library. It is not very popular in its own right, even though the pngquant console utility built on it is used in many solutions. We also found a Go binding for it, somewhat abandoned and with a memory leak; I had to fix the leak, add documentation, and everything else a decent project should have. We also packaged libimagequant for Alpine for easy installation.

Since the image no longer has to be saved to a file to be processed by pngquant, we can optimize the process a little. For example, we do not compress the image when resizing in Libvips, only after processing it with pngquant. This saves a little precious CPU time. Needless to say, we also save a lot because calling a C library is much faster than launching a console utility.

The size differs by about 3 times, but artifacts may appear (depending on the picture).

                Resolution   Size
Original PNG    450x300      200 KB
Optimized       450x300      61 KB

[Images: bird_origin, bird_optimized]

An example of a less suitable picture, in which artifacts appear during compression.

                Resolution   Size
Original PNG    351x527      270 KB
Optimized       351x527      40 KB

[Images: greece_origin, greece_optimized]

After the prototype was written and tested on my PC, where it delivered a decent 25 RPS on a dual-core mobile processor while using the entire CPU, I wanted to see how much could be squeezed out of it on proper hardware. We ran the code on a six-core machine, pointed JMeter at it, and... WTF? We got 30 RPS. We set out to figure out what was going on.

Libvips implements multithreading itself, so we only need to initialize the library once and can then safely call it from any thread. But for some reason Libvips ran in a single thread for us, which limited us to one core. pngquant takes one more core. In the end, our super-fast resizer works perfectly only on the developer's laptop and cannot use all the resources of other machines. ;)

We looked at the Libvips source code and saw that CONCURRENCY is set to 1 by default because of data races in Libvips. Judging by the bug tracker, though, these problems were fixed long ago. We changed CONCURRENCY back and tested again. Nothing changed: Libvips still refused to resize images in multiple threads. All attempts to overcome the problem failed, and to tell the truth, I got tired of fighting it and decided to work around it at a different level.

All reasonably modern Linux kernels (3.9+, and 2.6.32-417+ in CentOS 6) support the SO_REUSEPORT socket option, which allows multiple instances of an application to listen on the same port. This approach is more convenient than balancing with third-party software such as HAProxy, because it requires no configuration and lets you add and remove instances quickly.
Therefore we used SO_REUSEPORT together with the "--scale" option of Docker Compose, which lets you specify the number of instances to run.
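
For readers who have not used this socket option from Go, here is a minimal sketch of opening a listener with it set, using only the standard library. It is Linux-only and is not the code of our service: the helper name listenReusePort is made up for the example, and the numeric value of the option is hard-coded (15 is the value of SO_REUSEPORT on Linux for x86 and ARM) so that the sketch does not depend on any third-party package.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"syscall"
)

// soReusePort is the numeric value of SO_REUSEPORT on Linux (x86 and ARM).
// It is hard-coded here as an assumption to keep the sketch stdlib-only.
const soReusePort = 0xf

// listenReusePort opens a TCP listener with the port-reuse option set before
// bind, so several processes (or listeners) can share the same address.
func listenReusePort(addr string) (net.Listener, error) {
	lc := net.ListenConfig{
		Control: func(network, address string, c syscall.RawConn) error {
			var sockErr error
			err := c.Control(func(fd uintptr) {
				sockErr = syscall.SetsockoptInt(int(fd), syscall.SOL_SOCKET, soReusePort, 1)
			})
			if err != nil {
				return err
			}
			return sockErr
		},
	}
	return lc.Listen(context.Background(), "tcp", addr)
}

func main() {
	// The first listener picks a free port; the second binds to the same one.
	first, err := listenReusePort("127.0.0.1:0")
	if err != nil {
		fmt.Println("first listen failed:", err)
		return
	}
	defer first.Close()

	second, err := listenReusePort(first.Addr().String())
	if err != nil {
		fmt.Println("second listen failed:", err)
		return
	}
	defer second.Close()

	fmt.Println("two listeners share", second.Addr())
}
```

With every instance opening its listener this way, the kernel distributes incoming connections among them, which is what makes a separate balancer unnecessary.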

Time to measure


The time has come to evaluate the result of our labors.

Configuration:

  • CPU: Intel Xeon E5-1650 v3 @ 3.50GHz 6 cores (12 vCPU)
  • RAM: 64 GB (about 1-2 GB used)
  • Number of Workers: 12

Results:
File                     Output resolution   Node RPS   Go RPS
bird_1920x1279.jpg       800x533             34         73
clock_1280x853.jpg       400x267             69         206
clock_6000x4000.jpg      4000x2667           1.9        5.6
fireworks_640x426.jpg    100x67              114        532
cc_705x453.png           405x260             21         33
penguin_380x793.png      280x584             28         69
wine_800x800.png         600x600             27         49
wine_800x800.png         200x200             55         114

More benchmarks (without a comparison against the Node version) are on the wiki page.
As you can see, we did not rewrite the resizer in vain: in the table above the Go version is about 1.6 to 4.7 times faster than the Node version, depending on the file. If you need even faster resizing, you can turn the “speed” and “quality” knobs in libimagequant. They let you further reduce the output size or increase encoding speed at the cost of image quality.

The project is available on GitHub.
The Go binding to libimagequant is also on GitHub.