Odnoklassniki social network business analysis
This post is about the statistics systems at Odnoklassniki. It explains why we need statistics and which systems we use to work with them. In the following posts we will describe in detail:
• system architecture;
• the main components of the systems and their algorithms;
• non-trivial problems and ways to solve them.
Why do we need statistics?
We need statistics in order to know everything about how our site works. This knowledge allows us to:
• develop services purposefully rather than at random, increasing the site’s performance and user activity;
• evaluate the success of any development, whether it is a new service or simple code refactoring;
• monitor the operation of the site as a whole;
• monitor the operation of each site component and the interactions between components;
• investigate anomalies in the operation of individual components of the site;
• investigate abnormal changes in user activity;
• analyze the target audience of our services.
What data do we collect
To really know everything about how our site works, we need to collect a lot of information. Every day we process more than two trillion (2,000,000,000,000) records. Interested parties can access the processed statistics with a lag of 2-3 minutes, that is, almost in real time.
We log:
• any user activity (any click);
• any call to any non-trivial component of the site (for example, a Java class and method);
• any interaction between site components (for example, a web server accessing the cache).
We also load data about user content (for example, uploaded photos and ratings) and activity (for example, logins or visits) into the statistics system to analyze user behavior in detail.
Chart and Dashboard System
We look at statistics mainly in the form of graphs. These graphs most often come in two forms:
1) a graph where each point is data aggregated over five minutes. These graphs are for operational monitoring.
2) a graph where each point is data aggregated over at least one day. You can specify other periods - week, month, quarter and year. These graphs are for tracking long-term trends.
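As a rough illustration of what "one point per five minutes" means, here is a minimal Python sketch. It is not our production code (the real processing is done in T-SQL); the record shape `(timestamp, duration_ms)` and the function names are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timezone

BUCKET_SECONDS = 5 * 60  # one chart point per five minutes


def bucket_start(ts: datetime) -> datetime:
    """Round a timezone-aware timestamp down to the start of its 5-minute bucket."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % BUCKET_SECONDS, tz=timezone.utc)


def aggregate(records):
    """Aggregate (timestamp, duration_ms) records (a hypothetical shape).

    Returns {bucket_start: (number_of_calls, average_duration_ms)}.
    """
    totals = defaultdict(lambda: [0, 0.0])
    for ts, duration_ms in records:
        slot = totals[bucket_start(ts)]
        slot[0] += 1
        slot[1] += duration_ms
    return {b: (n, total / n) for b, (n, total) in sorted(totals.items())}


if __name__ == "__main__":
    utc = timezone.utc
    sample = [
        (datetime(2012, 1, 1, 12, 1, 0, tzinfo=utc), 10.0),
        (datetime(2012, 1, 1, 12, 4, 59, tzinfo=utc), 30.0),
        (datetime(2012, 1, 1, 12, 5, 0, tzinfo=utc), 50.0),
    ]
    for start, (calls, avg_ms) in aggregate(sample).items():
        print(start.isoformat(), calls, avg_ms)
```

A daily chart follows the same idea with a larger bucket size; both the number of calls and the average execution time come out of the same pass over the records.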
With one click, any chart is put into interactive mode, in which you can change the parameters and filters of the chart and immediately get the result.
The main features of the charts:
1) choose any date and any period (the entire history is available);
2) select any parameter - for example, the number of calls or the average execution time;
3) group by any classifier, for example, by servers or by Java classes;
4) apply any filters - by value or by a list of values;
5) process the data with mathematical algorithms - for example, smooth the chart;
6) switch from a 5-minute chart to a daily one and vice versa;
7) save the customized chart to a dashboard as a new chart or overwrite an existing one.
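To give an idea of what "smoothing" a chart can mean, here is a minimal Python sketch of a trailing moving average. This is only one possible smoothing algorithm and is chosen here for illustration; the function name and the window size are assumptions.

```python
def moving_average(values, window=3):
    """Smooth a series with a trailing moving average.

    Each output point is the mean of the current value and up to
    `window - 1` preceding values, so the output has the same length
    as the input.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value that left the window
        smoothed.append(running / min(i + 1, window))
    return smoothed


if __name__ == "__main__":
    print(moving_average([1, 2, 3, 4, 5], window=3))
```

A larger window hides more noise but also reacts more slowly to real changes, which is why smoothing is better suited to trend charts than to operational monitoring.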
From these graphs, we build thematic dashboards. To do this, we wrote a web application in ASP.NET.
An important aspect of this system is that the managers, developers and administrators themselves work with the charts and build the dashboards, not the business analysis department. Business analysts only provide the tools and keep the system running.
Traditional reporting system
We also have traditional reporting: static reports that show the requested data in the form of tables or graphs. The business analysis department creates them on request. These reports are integrated into the system of charts and dashboards.
We will not cover the reports here, since they are not interesting in themselves. Instead, we will look at the methods for processing large amounts of data (for example, we had almost 30 billion logins in 2011).
Multidimensional Analysis (OLAP) System
It is often more convenient to analyze long-term trends in the pivot-table style, where any parameter can be broken down into its components in the form of a table.
Therefore, we create OLAP cubes on various topics, for example, a payments cube, a logins cube and others. To work with the cubes, we wrote a web application.
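The pivot-table idea can be sketched in a few lines of Python. This is an illustration only: real cubes are built in Analysis Services, and the fact rows, dimension names (`country`, `platform`) and measure name (`logins`) below are invented for the example.

```python
from collections import defaultdict


def pivot(facts, row_dim, col_dim, measure):
    """Sum `measure` over fact rows, broken down by two dimensions.

    Returns a nested dict: {row_value: {col_value: total}}.
    """
    table = defaultdict(lambda: defaultdict(float))
    for fact in facts:
        table[fact[row_dim]][fact[col_dim]] += fact[measure]
    return {row: dict(cols) for row, cols in table.items()}


if __name__ == "__main__":
    # Hypothetical fact rows for a "logins" cube.
    facts = [
        {"country": "RU", "platform": "web", "logins": 120.0},
        {"country": "RU", "platform": "mobile", "logins": 80.0},
        {"country": "UA", "platform": "web", "logins": 30.0},
        {"country": "RU", "platform": "web", "logins": 20.0},
    ]
    for country, by_platform in pivot(facts, "country", "platform", "logins").items():
        print(country, by_platform)
```

Swapping `row_dim` and `col_dim`, or choosing other dimensions, gives a different slice of the same facts, which is the kind of exploration a cube front-end makes interactive.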
Automatic Anomaly Detection System
Watching dashboards full of charts is not the most effective way to monitor. We have built a system that “looks through” the graphs and, if it detects abnormal deviations from the “norm”, sends a notification by email or SMS.
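The algorithms we actually use will be covered in later posts. As a simple illustration of the general idea, the Python sketch below treats the mean of the preceding points as the "norm" and flags a point that deviates from it by more than a few standard deviations. The window size, threshold and function name are assumptions chosen for the example.

```python
from statistics import mean, pstdev


def find_anomalies(points, window=12, threshold=3.0):
    """Return indexes of points that deviate abnormally from the recent norm.

    The norm for point i is the mean of the `window` points before it;
    a point is flagged when it is more than `threshold` standard
    deviations away from that mean.
    """
    anomalies = []
    for i in range(window, len(points)):
        history = points[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        if sigma == 0:
            # A perfectly flat history: any change at all is a deviation.
            if points[i] != mu:
                anomalies.append(i)
            continue
        if abs(points[i] - mu) > threshold * sigma:
            anomalies.append(i)
    return anomalies


if __name__ == "__main__":
    series = [100, 102, 98, 101, 99, 100, 103, 97, 100, 101, 99, 100, 100, 40, 101]
    print(find_anomalies(series))
```

A real system also has to deal with daily and weekly seasonality, which a plain sliding window like this one ignores.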
System for automatically detecting bottlenecks in infrastructure
We have a fairly large fleet of equipment - more than 4000 servers, which we monitor with various specialized systems such as Cacti, Zabbix and others. When the use of a resource reaches a critical level, administrators are notified automatically.
To make sure that the load on the servers reaches a critical level as rarely as possible, we need to regularly analyze the operating indicators of these servers and make appropriate decisions, for example, about adding capacity. Doing such an analysis manually is very difficult. Therefore, we wrote a module that loads data from Cacti into the data warehouse and checks the operating indicators for the previous couple of weeks to answer the question “how soon will we reach a critical level if everything continues in the same vein”.
In 2008, when we decided that we needed to build a business analysis system, Odnoklassniki widely used MS SQL databases. Therefore, the logical choice for us was to use this platform for business analysis.
Today our data warehouse runs exclusively on MS SQL 2008 R2 Enterprise Edition. We are planning an upgrade to MS SQL 2012. Of course, we will inform you about the results.
We tried using MS SQL Integration Services, but this technology turned out to be too labor-intensive. The same code can be written in T-SQL several times faster. Therefore, 99.9% of the data processing code is written in T-SQL, and the rest is in .NET.
For multidimensional analysis, we use MS SQL 2008 R2 Analysis Services. Here, too, we plan to upgrade to MS SQL 2012. We have written a web application for the front-end.
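The question “how soon will we reach a critical level” can be illustrated with a simple linear extrapolation. The Python sketch below fits a least-squares line to daily utilization samples and estimates when that line crosses the critical level. It is an assumption-laden illustration, not a description of our module: the function name, the daily sampling and the linear model are all chosen for the example.

```python
def days_until_critical(samples, critical):
    """Estimate days until a linear trend through `samples` reaches `critical`.

    `samples` holds one utilization value per day, oldest first.
    Returns None when the trend is flat or falling, and 0.0 when the
    fitted line is already at or above the critical level.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope <= 0:
        return None  # no growth, so the critical level is never reached
    crossing_day = (critical - intercept) / slope
    return max(0.0, crossing_day - (n - 1))  # measured from the last sample


if __name__ == "__main__":
    # Hypothetical disk usage in percent over five days, critical level 90%.
    print(days_until_critical([50, 52, 54, 56, 58], critical=90))
```

Running such an estimate for every server and every indicator, and sorting by the result, gives a short list of the resources that need attention first.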
Chart and Report Generator
We use MS SQL Reporting Services to generate graphs and most reports. Initially, this was a very good solution, because it let us quickly build the minimum functionality we needed. But the requirements for the system have since grown - we need a richer, faster and more dynamic user interface. Therefore, we plan to migrate to another solution (we have not yet chosen which one).
All web applications are written in ASP.NET. We use DevExpress UI components, which make it quick and easy to create good-looking forms in a consistent style.
In the next post, we will talk in detail about how we log information and deliver data to the data warehouse (DWH).