Optimizing long lists of boolean values in JavaScript

Very often in web development (and in programming in general) you need to store a long list of boolean values (yes/no, true/false, checked/unchecked and the like) as a string. For example, you may want to save such data in localStorage or in a cookie, or send it in the body of an HTTP request. I have needed this hundreds of times.

Using an array


There are two reasonable ways to organize boolean data in an array.
The first is to store true/false values:
[false, true, true, false, false, true, true]

The second is to store an array of zeros and ones:
[0, 1, 1, 0, 0, 1, 1]

Whichever option we choose, we have to convert the array to a string when writing the data and convert that string back to an array when reading it. For this we can use the good old Array#join() (or Array#toString()) and String#split(), or the more elegant JSON.stringify() and JSON.parse().

If we go the JSON route, the code will be a little shorter, although using JSON here is like cutting bread with a chainsaw. In some browsers the performance impact will be noticeable, and support for older browsers suffers slightly as well.
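As a rough illustration of the two round trips just described (the variable names are only for this sketch):

```javascript
var flags = [0, 1, 1, 0, 0, 1, 1];

// join()/split(): the values come back as strings and must be converted.
var joined = flags.join(',');                   // '0,1,1,0,0,1,1'
var fromJoin = joined.split(',').map(Number);   // [0, 1, 1, 0, 0, 1, 1]

// JSON: the original value types survive the round trip.
var json = JSON.stringify(flags);               // '[0,1,1,0,0,1,1]'
var fromJson = JSON.parse(json);                // [0, 1, 1, 0, 0, 1, 1]
```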

The main disadvantage of array-based strings is their size in bytes. If you store an array of zeros and ones, you need about 2 characters per value (to be precise, 2n-1: one separator after each value except the last):
[0, 1, 1, 0, 0, 1, 1].toString().length // 13 characters for 7 values

Thus, 512 values need 1023 characters, or about 2 KB, because JavaScript strings use UTF-16 (2 bytes per character).

If we keep an array of true / false values, things get even worse:
[false, true, true, false, false, true, true].toString().length // 37 characters for 7 values

That is 5 or 6 characters per value (including the separator), or 2560 to 3072 characters for 512 values (5 to 6 KB).
JSON.stringify() costs 2 more characters in each case for the opening and closing brackets, but its advantage is that JSON.parse() gives back the original value types instead of strings.
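These figures are easy to check. The sketch below builds 512 sample values (an alternating pattern, chosen only for illustration) and measures each serialized form:

```javascript
// Build 512 sample values; the alternating pattern is arbitrary.
var bits = [], bools = [];
for (var i = 0; i < 512; i++) {
    bits.push(i % 2);
    bools.push(i % 2 === 1);
}

var asDigits = bits.toString().length;      // 1023: 512 digits + 511 commas
var asWords = bools.toString().length;      // 2815: 256 x 'false', 256 x 'true', 511 commas
var asJson = JSON.stringify(bits).length;   // 1025: two more for the brackets
```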

Using a string


A string saves characters because no delimiters are needed. For example, if we take the numeric approach and store the string '01001101010111', we use one character per value, which is much better than the array approach. You can turn the values back into an array with String#split:
'01001101010111'.split(''); // ['0','1','0','0','1','1','0','1','0','1','0','1','1','1']

You can also simply loop over the characters of the string with string.charAt(i), or even with string indexes (string[i]) if you don't need to support older browsers.
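A minimal sketch of such a loop, turning the stored string back into real booleans (the variable names are assumptions made for this example):

```javascript
var stored = '01001101010111';
var checked = [];
for (var i = 0; i < stored.length; i++) {
    // charAt(i) also works in very old browsers; stored[i] is the shorter modern form.
    checked.push(stored.charAt(i) === '1');
}
// checked: [false, true, false, false, true, true, false,
//           true, false, true, false, true, true, true]
```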

Using bit fields


Did the previous example make you think of binary? The concept of bit fields is quite popular in other programming languages, but not in JavaScript. In a nutshell, a bit field packs multiple boolean values into the bits of a number's binary representation. For example, take 8 values (true, false, false, true, false, true, true, false). In binary this is 10010110, which is 150 in decimal and 96 in hex. That is 2 characters instead of 8, a saving of 75%. A single hexadecimal digit corresponds to exactly 4 bits. (Because 16 = 2^4. In a number system with base 2^n, you can pack n bits into each digit.)
So instead of storing the whole string and using one character per value, we can be smarter and convert the string to a hexadecimal number. How? Like this:
parseInt('10010110', 2).toString(16); // returns '96'

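How do we get the data back in a readable form? Nothing could be simpler:
parseInt('96', 16).toString(2); // returns '10010110'

Now, as before, we can loop through the values and do something useful with them. Keep in mind that toString(2) drops leading zeros, so pad the result back to the original length if the list can start with false values.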
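A minimal sketch of the hexadecimal round trip that also restores leading zeros. The helper names bitsToHex and hexToBits are hypothetical, and the sketch assumes the number of stored values is known when reading:

```javascript
// Hypothetical helpers; 'length' is the number of values originally stored.
function bitsToHex(bits) {
    return parseInt(bits, 2).toString(16);
}

function hexToBits(hex, length) {
    var bits = parseInt(hex, 16).toString(2);
    // toString(2) drops leading zeros, so pad back to the known length.
    while (bits.length < length) {
        bits = '0' + bits;
    }
    return bits;
}

var hex = bitsToHex('00010110');                    // '16'
var padded = hexToBits(hex, 8);                     // '00010110'
var unpadded = parseInt(hex, 16).toString(2);       // '10110'
```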

Is it possible to do even better?


Actually, it is! Why convert to hexadecimal, which uses only 6 of the 26 letters of the Latin alphabet? Number#toString() accepts any base up to 36 (37 or more throws a RangeError), and base 36 uses every letter of the Latin alphabet. This lets us compress up to 31 values into 6 characters (32 values can take 7), a saving of about 80% compared to the plain string method. The code is still simple:


parseInt('1001011000', 2).toString(36); // returns 'go' (instead of '258' in hexadecimal)
parseInt('go', 36).toString(2); // returns '1001011000'
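A quick way to see how many base-36 characters a given number of values needs in the worst case (all bits set). The helper ones is made up for this sketch:

```javascript
// Hypothetical helper: a string of n '1' characters.
function ones(n) {
    return new Array(n + 1).join('1');
}

var encoded = parseInt('1001011000', 2).toString(36);           // 'go'
var decoded = parseInt(encoded, 36).toString(2);                // '1001011000'

var worst31 = parseInt(ones(31), 2).toString(36).length;        // 6
var worst32 = parseInt(ones(32), 2).toString(36).length;        // 7
```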

Many will stop there. But the more curious will say: “We still have capital letters and other symbols, so we are still not using the full potential!” And they will be right. It is no accident that when you open a binary file in a text editor, you see strange characters on the screen mixed with digits and upper- and lower-case letters. Each UTF-16 character takes 2 bytes (16 bits), which means that with the right packing algorithm we can store 16 values per character, a saving of 93.75%.
The problem is that JavaScript has no built-in function for this, so the code becomes a bit more complicated.

Packing 16 values in one character


We will need the String.fromCharCode method. It takes a numeric value up to 65535 and returns the corresponding character (larger values are truncated to their lowest 16 bits).
First, split the string into chunks of 16 characters each.
You can do this with .match(/.{1,16}/g). The whole code looks like this:

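function pack(/* string */ values) {
    // Split into groups of up to 16 bits; match() returns null for an empty string.
    var chunks = values.match(/.{1,16}/g) || [], packed = '';
    for (var i = 0; i < chunks.length; i++) {
        // Right-pad a short final chunk with zeros so every character holds exactly 16 bits.
        var chunk = (chunks[i] + '0000000000000000').slice(0, 16);
        packed += String.fromCharCode(parseInt(chunk, 2));
    }
    return packed;
}
function unpack(/* string */ packed, /* number, optional */ length) {
    var values = '';
    for (var i = 0; i < packed.length; i++) {
        // toString(2) drops leading zeros, so left-pad each group back to 16 bits.
        values += ('0000000000000000' + packed.charCodeAt(i).toString(2)).slice(-16);
    }
    // If the original number of values is known, trim the padding of the last chunk.
    return length === undefined ? values : values.slice(0, length);
}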

Not so complicated, right?
These few lines of code pack the 512 values from above into (drum roll) 32 characters (64 bytes)! Much better than the original 2 KB (when stored as a plain array), isn't it?
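As a variation on the same idea, here is a rough sketch that works directly on an array of booleans and uses bitwise operators instead of binary strings. The names packBooleans and unpackBooleans are made up for this example, and storing the count in the first character is just one possible convention (it caps the list at 65535 values).

```javascript
// Hypothetical helpers: pack an array of booleans, 16 per UTF-16 code unit.
// The first character stores the number of values (at most 65535 in this sketch).
function packBooleans(flags) {
    var packed = String.fromCharCode(flags.length);
    for (var i = 0; i < flags.length; i += 16) {
        var code = 0;
        for (var j = 0; j < 16; j++) {
            // Shift left and append the next bit; missing values in the last group count as 0.
            code = (code << 1) | (flags[i + j] ? 1 : 0);
        }
        packed += String.fromCharCode(code);
    }
    return packed;
}

function unpackBooleans(packed) {
    var length = packed.charCodeAt(0), flags = [];
    for (var i = 0; i < length; i++) {
        // Each group of 16 values lives in one character after the count.
        var code = packed.charCodeAt(1 + (i >> 4));
        // Bit 15 holds the first value of each group, bit 0 the last.
        flags.push(((code >> (15 - (i & 15))) & 1) === 1);
    }
    return flags;
}

var flags = [true, false, false, true, false, true, true, false];
var packed = packBooleans(flags);       // 2 characters: the count, then one group of 16 bits
var restored = unpackBooleans(packed);  // same values as flags
```

Note that the result can contain any UTF-16 code unit, including control characters and unpaired surrogates. That is fine for storage that accepts arbitrary JavaScript strings, but cookies and URLs would need an extra encoding step.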

Limitations


Numbers in JavaScript have their limits. For the methods described here that convert the whole string to a number, parseInt('1111…1111', 2) returns Infinity once there are more than 1023 bits. The practical limit is much lower, though: a JavaScript number holds only 53 bits of integer precision, so a string of more than 53 values no longer converts back exactly. This restriction does not apply to the last method, because it converts blocks of 16 bits instead of the entire string. And, of course, it does not apply to the first two methods (array and string), because they do not pack the values at all.
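These limits can be checked directly in a console. The sketch below (the helper ones is made up for this example) shows where exactness ends and where the value overflows:

```javascript
// Hypothetical helper: a string of n '1' characters.
function ones(n) {
    return new Array(n + 1).join('1');
}

// Up to 53 bits, the value converts there and back without loss.
var exact = parseInt(ones(53), 2).toString(2) === ones(53);     // true

// With 54 bits the number is rounded, so the original string is not recovered.
var lossy = parseInt(ones(54), 2).toString(2) === ones(54);     // false

// The number stays finite up to 1023 bits and overflows after that.
var finite = isFinite(parseInt(ones(1023), 2));                 // true
var overflow = parseInt(ones(1024), 2);                         // Infinity
```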

“I think we have gone too far”


Yes, in some cases this kind of optimization is unnecessary. But if you need to fit a bunch of boolean values into a limited space that supports only strings, these methods are very useful. And optimizing something that travels over the wire very frequently is never wasted effort. For example, cookies are sent with almost every request, so they should be as small as possible. Another example is multiplayer online games, where the server response must be lightning fast, otherwise the game is no fun at all.

And even if such a deep optimization is not for you, I hope this article has given you some food for thought and perhaps taught you something.

Translation. Original (en):
Optimizing Long Lists Of Yes / No Values With JavaScript
Original author: Lea Verou