Analysis of the 2018 presidential election results at the federal and regional levels
A distinctive feature of the 2018 Russian presidential election was that the main indicator was no longer the percentage for the leading candidate but the turnout. Another important feature was the record number of observers across the country. Observers were sent even to the republics of the North Caucasus, where results had traditionally simply been made up.
Presidential elections attract far more attention than parliamentary ones, even without the large-scale campaign to raise turnout through contests, local referendums, and administrative pressure. Nevertheless, an analysis of the results still reveals anomalies, although at the federal level they are less pronounced than before.
If we plot each candidate's share of the vote against turnout, we see the classic pattern: the higher the turnout, the higher the leading candidate's share, which tends toward 100% as turnout approaches 100%. The shares of the other candidates shrink correspondingly.
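The two quantities compared here can be computed per polling station roughly as follows. This is a minimal sketch, not the code behind the charts: the record fields (`registered`, `ballots_cast`, `votes`) are hypothetical names, and turnout is assumed to be ballots cast divided by registered voters.

```python
from collections import defaultdict

def turnout_and_share(station, candidate):
    """Return (turnout, share) for one polling station, both in percent."""
    turnout = 100.0 * station["ballots_cast"] / station["registered"]
    share = 100.0 * station["votes"].get(candidate, 0) / station["ballots_cast"]
    return turnout, share

def mean_share_by_turnout(stations, candidate, bin_width=10):
    """Average the candidate's share within turnout bins of bin_width percent."""
    sums = defaultdict(lambda: [0.0, 0])  # bin start -> [sum of shares, count]
    for st in stations:
        if st["registered"] == 0 or st["ballots_cast"] == 0:
            continue  # skip empty stations to avoid division by zero
        turnout, share = turnout_and_share(st, candidate)
        # Clamp 100% turnout into the last bin instead of opening a new one.
        bin_start = min(int(turnout // bin_width) * bin_width, 100 - bin_width)
        sums[bin_start][0] += share
        sums[bin_start][1] += 1
    return {b: total / n for b, (total, n) in sorted(sums.items())}

# Made-up example records, not real election data.
stations = [
    {"registered": 1000, "ballots_cast": 600, "votes": {"A": 360, "B": 240}},
    {"registered": 1000, "ballots_cast": 650, "votes": {"A": 455, "B": 195}},
    {"registered": 500, "ballots_cast": 500, "votes": {"A": 490, "B": 10}},
]
result = mean_share_by_turnout(stations, "A")
```

If the mean share climbs steadily from bin to bin as turnout grows, that is the pattern described above.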
Recall also the many mathematical studies showing a rather unusual distribution of polling stations by turnout. Instead of a normal or lognormal distribution, we see a curious curve with very strange peaks at round values (70%, 75%, 80%, etc.) that rises as turnout approaches 100% and shoots up at exactly 100%. Here are the results of the 2012 presidential election:
In the 2018 election the curve looks even stranger. The number of polling stations with turnout above 80% grew significantly, and the peaks at round values became even more pronounced.
It is noteworthy that high turnout is not confined to small polling stations: there are cases of 90-100% turnout at polling stations with 1,500-2,000 registered voters. Here are the results of the 2012 election:
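A turnout histogram of this kind, together with a crude check for peaks at round values, could be sketched as follows. The function names are illustrative, the 1% bin width is an assumption, and comparing a bin with its two neighbours is only a rough heuristic, not the method used in the studies mentioned.

```python
from collections import Counter

def turnout_histogram(turnouts):
    """Count polling stations per 1% turnout bin (turnouts are in percent)."""
    counts = Counter()
    for t in turnouts:
        # Bin 80 covers [80, 81); exactly 100% is clamped into bin 99.
        counts[min(int(t), 99)] += 1
    return counts

def round_value_excess(counts, step=5):
    """For each bin at a multiple of `step`, divide its count by the mean
    count of its two neighbouring bins. Ratios well above 1 hint at a peak."""
    excess = {}
    for b in range(step, 100, step):
        neighbours = (counts.get(b - 1, 0) + counts.get(b + 1, 0)) / 2
        if neighbours > 0:
            excess[b] = counts.get(b, 0) / neighbours
    return excess

# Made-up turnouts, not real election data.
turnouts = [78.5, 79.2, 80.0, 80.0, 80.0, 80.1, 81.7, 100.0]
counts = turnout_histogram(turnouts)
excess = round_value_excess(counts)
```

On real data the ratios for a natural distribution should hover around 1, so a bin at 80% or 85% holding several times more stations than its neighbours stands out immediately.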
And here are the results of the 2018 election:
As you can see, the number of polling stations with 100% turnout has decreased, but they are now spread across the 70-100% turnout interval. The clustering of PECs (precinct election commissions) at round values such as 80% and 85% is also more pronounced.
Incidentally, the curved lines of dots at low voter counts are normal. Turnout can take only a discrete set of values when the number of voters is small, and the fewer the voters, the fewer the possible values, which produces this picture. I even made a graph from randomly generated test data to show it:
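Similar test data can be generated in a few lines. The sketch below is not the original script: it assumes each voter turns out independently with a fixed probability, and all parameter values are illustrative.

```python
import random

def possible_turnouts(n_voters):
    """All turnout values (percent) that a station with n_voters can report."""
    return sorted({100.0 * k / n_voters for k in range(n_voters + 1)})

def simulate(n_stations=2000, max_voters=60, p_vote=0.65, seed=1):
    """Generate (registered voters, turnout %) points for small stations.

    Each voter turns out independently with probability p_vote; these
    parameters are illustrative assumptions, not fitted to real data.
    """
    rng = random.Random(seed)
    points = []
    for _ in range(n_stations):
        n = rng.randint(1, max_voters)
        voted = sum(rng.random() < p_vote for _ in range(n))
        points.append((n, 100.0 * voted / n))
    return points

points = simulate()
```

A station with 4 voters can report only 0%, 25%, 50%, 75%, or 100%, while one with 60 voters has 61 possible values. Plotting `points` as a scatter chart of voters against turnout shows the dots lining up along curves of the form k/n, with no falsification involved.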
Analysis of elections at the regional level
The election data by region are far more interesting, though. Take the Republic of Ingushetia: the vast majority of PECs sit very conveniently at about 80% turnout, ± 2%.
Another anomaly appears in the Samara region: at polling stations with turnout above 80%, the leading candidate's share does rise, but only slightly. Moreover, in roughly the same range the shares of the next two candidates also rise, naturally at the expense of all the others. You can also see that turnout across polling stations follows neither a normal nor a lognormal distribution but forms two “humps”.
One might think this reflects the difference in voting between city and village, but in Chuvashia the picture is even starker: the polling stations with a turnout of about 77% are nearly lost beside a second cluster of PECs with turnout close to 100%, which is twice as large as the cluster with natural turnout.
As for Chechnya, where 100% results have traditionally been fabricated, the 2018 presidential election produced many artifacts with an unusual turnout below 100%. Most likely this is the result of the work of independent observers from all over the country who traveled to Chechnya to watch the vote. At some polling stations turnout fell to as low as 40%, at others to 90%. All this once again shows the scale of fraud in the republic.
The election results in the Kemerovo region also cannot be ignored. In 2012 a significant number of polling stations there reported exactly 80% for the leading candidate; now the PECs fall into two types:
The first type (there are few such polling stations) has low turnout.
The second type (there are many of these) has a turnout of 75% to 100% and a uniformly reduced share of votes for every candidate except the leading one.
There are, however, regions where anomalies are only weakly present, which suggests that any falsification there was isolated. In the Tver region, for example, despite peaks at 70% and 100% and other minor anomalies, the distribution is quite close to lognormal compared with the regions described above and with the overall statistics for Russia.
Development of an election analysis service
After looking at the various charts for the federal level, I wanted to reproduce them for my own region. I found a repository with IPython code that built similar charts, but parsing the electoral commission’s site failed: either the code contained errors, or the site had been updated, or something else had gone wrong.
So I kept only the data visualization from that code and completely rewrote the page parsing. That gave me the picture I wanted, but then I wanted to be able to hover over a point and at least see the polling station's number so that I could look up detailed information. This was impossible, which is how the idea arose of building a service for this kind of analysis of any election in any region.
I did not want to develop a desktop application: it would raise the entry threshold for users and could cause unexpected problems on platforms where I had not tested the program. This is also not the kind of application a user needs every day, so that option was not suitable.
Because the set of candidates changes from election to election, I decided to simplify data handling by using a NoSQL database, specifically MongoDB as the best known one. I had not worked with it before and regretted that, because I really liked Mongo, although it took me a while to adapt after traditional SQL and normalizing data to third normal form.
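To illustrate why a document store is convenient here, each polling station result can be kept as one document whose `votes` mapping holds whichever candidates ran in that election. The field names and numbers below are hypothetical and are not taken from the service's actual schema; plain Python dicts stand in for MongoDB documents so that the sketch runs without a database.

```python
# Two documents from different elections: the candidate sets differ,
# yet both fit the same collection without any schema change.
doc_2012 = {
    "election": "president-2012",
    "region": "Example region",
    "station": 101,
    "registered": 1200,
    "ballots_cast": 780,
    "votes": {"Candidate A": 500, "Candidate B": 180, "Candidate C": 100},
}
doc_2018 = {
    "election": "president-2018",
    "region": "Example region",
    "station": 101,
    "registered": 1000,
    "ballots_cast": 800,
    "votes": {"Candidate A": 600, "Candidate D": 120, "Candidate E": 80},
}

def turnout(doc):
    """Turnout in percent, assumed here to be ballots cast / registered voters."""
    return 100.0 * doc["ballots_cast"] / doc["registered"]

def shares(doc):
    """Each candidate's share of ballots cast, in percent."""
    return {name: 100.0 * n / doc["ballots_cast"] for name, n in doc["votes"].items()}

shares_2018 = shares(doc_2018)
```

In a relational schema the same data would need a separate candidates table and a join table of votes per station; here the analysis code simply iterates over whatever keys `votes` contains. With PyMongo, dicts like these could be passed to `insert_one` or `insert_many` on a collection.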
In about a month and a half I built a beta version of the service and presented it to the public, and I have made various changes since. For example, downloading the data from the electoral commission’s website used to take about 12 hours; after I introduced asynchronous requests and deferred bulk inserts into the database, this fell to three hours. Most of the time, of course, goes to loading pages. For some reason GAS "Vybory" is slow to serve pages and often drops the connection. This gave me the idea of providing an API so that third-party applications could load the data conveniently, but I am not sure how interesting that would be to users.
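The general shape of such a loader, with concurrent downloads, retries on dropped connections, and results written in batches, might look like the sketch below. It is not the service's actual code: `fetch_page` is a hypothetical stand-in that simulates a flaky server instead of making real HTTP requests, and a plain list stands in for the database.

```python
import asyncio
import random

async def fetch_page(page_id, rng):
    """Hypothetical stand-in for an HTTP request to the commission's site.
    It fails at random to imitate dropped connections."""
    await asyncio.sleep(0.001)
    if rng.random() < 0.2:
        raise ConnectionError(f"lost connection on page {page_id}")
    return {"page": page_id}

async def fetch_with_retry(page_id, rng, semaphore, retries=20):
    """Fetch one page, limiting concurrency and retrying on connection loss."""
    async with semaphore:
        for attempt in range(retries):
            try:
                return await fetch_page(page_id, rng)
            except ConnectionError:
                await asyncio.sleep(0.001 * (attempt + 1))  # simple back-off
    raise RuntimeError(f"page {page_id} failed after {retries} attempts")

async def load_all(page_ids, batch_size=50, concurrency=20, seed=0):
    """Download pages concurrently and store them in batches."""
    rng = random.Random(seed)
    semaphore = asyncio.Semaphore(concurrency)
    stored, buffer = [], []
    tasks = [asyncio.create_task(fetch_with_retry(p, rng, semaphore))
             for p in page_ids]
    for finished in asyncio.as_completed(tasks):
        buffer.append(await finished)
        if len(buffer) >= batch_size:
            stored.extend(buffer)  # stand-in for one bulk insert into the database
            buffer.clear()
    stored.extend(buffer)  # flush whatever is left in the last partial batch
    return stored

pages = asyncio.run(load_all(range(200)))
```

The semaphore keeps the number of simultaneous requests bounded so that a slow server is not overwhelmed, and collecting results into a buffer replaces many small database writes with a few large ones.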
My service lets you view results by polling station for any territory in any of the elections loaded (four so far). It generates the four most important graphs of various parameters against turnout and even shows PEC addresses: when you select a group of points, you can see where those polling stations are located. I did not parse the PEC addresses myself; I used a ready-made database collected by the people at GIS-Lab.
UPD (April 16): By request, I have posted a dump of the service database for download. As a reminder, the same data can also be downloaded from the electoral commission website itself. I have also added brief instructions to the README in the project repositories.