Retail analytics: today you didn’t buy condoms, and the store already knows when you will need a discount on baby food

That is roughly how cunningly it works.
The part about your unborn child is, of course, an exaggeration, but anything is possible. In practice, we help retailers fight for every ruble using mathematical tools. For example, you have a loyalty card in your wallet, or you pay with a credit card. This means the store knows, in general, which products you need and how many. From there it can build an optimal model of your trip to the store and understand in which situation you will buy more: what should be placed where, what kind of milk you prefer (what if you are ready to take the expensive, natural kind without hesitation?), and so on. It is easy to model you from data.

The same analytics can be applied to all aspects of retail.

One funny case: the system once calculated that it would be profitable to destroy about half a ton of paper. At first it looked like a bug, but digging showed that the supplier gives a discount above a certain purchase threshold, and the chain may not manage to sell that much paper in time. Given the cost of warehousing and delivery and the size of the discount above the threshold, it is easier to buy the extra goods and destroy them in order to get the whole order at the lower price. The discount covers the loss from the destroyed goods at least twice over.
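
The arithmetic behind such a decision can be sketched in a few lines. This is only an illustration: the prices, the threshold, the discount and the disposal cost are made-up numbers, the function names are hypothetical, and warehouse and delivery costs are left out for brevity.

```python
def order_cost(quantity_kg, price_per_kg, threshold_kg, discount_pct):
    """Purchase cost in rubles; the discount applies to the whole order
    once the quantity reaches the supplier's threshold."""
    full = quantity_kg * price_per_kg
    if quantity_kg >= threshold_kg:
        return full * (100 - discount_pct) // 100
    return full


def best_order(demand_kg, price_per_kg, threshold_kg, discount_pct, disposal_per_kg):
    """Return (quantity to order, total cost) for the cheaper of two options:
    order exactly what can be sold, or order up to the threshold and destroy the surplus."""
    exact = order_cost(demand_kg, price_per_kg, threshold_kg, discount_pct)
    if demand_kg >= threshold_kg:
        return demand_kg, exact
    surplus = threshold_kg - demand_kg
    padded = (order_cost(threshold_kg, price_per_kg, threshold_kg, discount_pct)
              + surplus * disposal_per_kg)
    return (threshold_kg, padded) if padded < exact else (demand_kg, exact)


# Hypothetical numbers: 1500 kg can be sold, the discount starts at 2000 kg.
print(best_order(1500, 100, 2000, 30, 5))  # -> (2000, 142500)
print(best_order(1500, 100, 2000, 5, 5))   # -> (1500, 150000)
```

With a 30% discount, ordering the extra half ton and destroying it is cheaper than ordering only what sells; with a 5% discount it is not.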

At the input we have data from various sources (for example, cash register receipts, loyalty card usage logs, visitor counters in stores, warehouse load data, and so on). At the output, the business owner gets answers. For example, to questions like these:
  • What is the optimal amount of stock balance for each product?
  • How exactly do marketing campaigns affect sales?
  • Why was not a single bottle of wine sold on Saturday the 9th from 18:00 to 19:15?
  • What sales results must a new product show to stay in the chain's assortment? How does this relate to its place on the display?

Plus, you can calculate KPIs for each employee, for example a salesperson. And, by the way, you can show him this data at the same time, for example, "you are 5% short of being the best employee of the month." Or show data from another shift to encourage competition. In general, you can do anything with the information.

BI for retail also means the ability to analyze staff actions. For example, if a cashier takes 2 minutes per receipt on average, you can find the “slowpokes” who take 10, and then work out why: is he slacking off somewhere, or is there some technical problem?
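
A minimal sketch of how such “slowpokes” could be found from receipt timings. The input format (cashier id plus seconds spent on a receipt), the function name and the factor of 3 are assumptions made for the illustration, not a description of the production system.

```python
from collections import defaultdict
from statistics import mean, median


def slow_cashiers(receipts, factor=3.0):
    """receipts: iterable of (cashier_id, seconds_per_receipt).
    Returns cashiers whose average time exceeds `factor` times the typical
    (median) average across all cashiers."""
    by_cashier = defaultdict(list)
    for cashier_id, seconds in receipts:
        by_cashier[cashier_id].append(seconds)
    averages = {cashier: mean(times) for cashier, times in by_cashier.items()}
    typical = median(averages.values())
    return sorted(c for c, avg in averages.items() if avg > factor * typical)


# Hypothetical timings: most cashiers take about 2 minutes, one takes about 10.
sample = [("anna", 110), ("anna", 130), ("boris", 120), ("boris", 120),
          ("viktor", 600), ("viktor", 620), ("dina", 125)]
print(slow_cashiers(sample))  # -> ['viktor']
```

The median is used as the baseline so that one very slow cashier does not drag the “normal” level up with him.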


We ran a test analysis for a store with two entrances: it turned out that closing just one of them would cut off some walk-in customers “from the street,” but would raise the average check thanks to a better route. The result is slightly fewer people and more profit. The route through the store becomes longer, so people get to see not only milk and bread but the whole assortment, and fill their baskets.

For a vodka company, we are now building optimal routes for supervisors and merchandisers. We optimize how people move across stores so that they can check as many as possible in a short time. One interesting detail: if a merchandiser takes the subway, for example, it often pays to build his route on the principle “got out, handled point A, crossed the road, handled point B”. But supervisors drive cars and cannot be given the same route: a car may, for example, need about 20 minutes to get around a railway line. We can build optimal routes for everyone.

You can also build solutions for distributors and pull information from stores about the sales of a particular product group, a particular manufacturer, and so on. Among the standard things are shelf and sales analysis: for example, how the placement of a product on the shelf affects its sales and where it is best placed. A simple example. At the checkout you need to keep not only the usual set of “chewing gum - contraception - napkins - cigarettes” but also, as it turned out, hangover remedies, especially in the mornings. No one will go to the pharmacy next door or hunt for them on the shelves, while a neat little box labeled “reduces the hangover” attracts attention just like the “bastard” did in the evenings 5 years ago. Well, and you already know about placing children's goods on the lower shelves and dried roach next to the beer.

We also prepare sales reports for franchisees and branches. The situation there is usually this: the farther from the center, the less control. With proper analytics, the indicators immediately show where the problem may be. And the central office needs to know about it before serious losses begin, such as a ton of yogurt going to waste at its expiration date. There must be time to react.

We also looked at a product priced at 12 rubles 50 kopecks. The idea was to find out what would happen if the price were raised to 13 rubles: it turned out that people would keep buying it just as before.

What is the basis?

In any case, underneath all this useful information lie primitives: receipts, a description of the store, and the approximate coordinates of where a particular brand sits on the shelf in that store. With this information you can build absolutely any report, from how much was sold per month per square meter, per cashier and so on, down to details such as how the law banning night-time alcohol sales affected sales. By the way, from such an analysis we can not only recommend opening a separate alcohol checkout at 22:30, before the night ban on alcohol sales begins, but also suggest the right rule for opening and closing checkouts across the whole chain, taking into account holidays and events specific to each store’s location.


In general, BI is the collection of data so that analytics can be applied to it. Besides reports, there is also a mathematical apparatus with corresponding models that will, for example, not only manage the routes of merchandisers and supervisors but also simulate the behavior of a buyer in the store. It can make forecasts starting from the average check and its highest-priority item, and show what should be moved one way or another to increase sales of that item.
In addition, sharing licenses is always cheaper. Even simple processing of a video stream of shoppers' eye movements is cheaper to buy as a shared service than to buy machines for every store. It is cheaper to send the traffic over the network, and video of eye movements across the shelves is a good fit for the “cloud”.

Incident response

The system catches incidents and sends information to those who need to respond. For example, if a receipt at 2:14 a.m. contains a bottle of vodka (and there is no system that blocks ringing up such goods), you need to respond urgently, before Roskomnadzor comes to visit. To do this, notifications go to HR (they will fine the cashier for the violation), security (they will try to find out how it happened at all), store managers, and IT specialists (so that no one tries to simply erase the transaction, if that is even possible, and fake a write-off). I think the general idea is clear.
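
Such a check can be sketched roughly as follows. The receipt structure, the names and the ban window of 23:00 to 08:00 are assumptions for the illustration (the actual hours depend on the region), and the recipient list simply mirrors the roles named above.

```python
from datetime import datetime, time

# Assumed ban window; the real hours depend on local rules.
BAN_START = time(23, 0)
BAN_END = time(8, 0)
# Roles to notify, mirroring the ones listed in the text.
RECIPIENTS = ("hr", "security", "store_manager", "it")


def in_ban_window(moment, start=BAN_START, end=BAN_END):
    """True if the time of `moment` falls inside a window that may cross midnight."""
    t = moment.time()
    if start <= end:
        return start <= t < end
    return t >= start or t < end


def alcohol_incidents(receipts):
    """Return one incident per receipt that contains alcohol sold during the ban."""
    incidents = []
    for receipt in receipts:
        has_alcohol = any(item["category"] == "alcohol" for item in receipt["items"])
        if has_alcohol and in_ban_window(receipt["sold_at"]):
            incidents.append({"receipt_id": receipt["id"], "notify": RECIPIENTS})
    return incidents


# Hypothetical receipts: only the first one should raise an incident.
receipts = [
    {"id": 1, "sold_at": datetime(2024, 3, 9, 2, 14),
     "items": [{"sku": "VODKA-05", "category": "alcohol"}]},
    {"id": 2, "sold_at": datetime(2024, 3, 9, 21, 0),
     "items": [{"sku": "VODKA-05", "category": "alcohol"}]},
    {"id": 3, "sold_at": datetime(2024, 3, 9, 2, 30),
     "items": [{"sku": "BREAD", "category": "bakery"}]},
]
print(alcohol_incidents(receipts))
```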

Another typical incident is the trigger for a clearance sale. For example, the system knows that if you sell fewer than 20 yogurts a day, some of them will expire. A decision can then be made automatically, such as:
  • After 3 days, put the goods higher on the shelf (give instructions to the store staff).
  • After 8 days, make a 20% discount (update the base of prices and stocks).
  • Notify loyal customers about the discount.
  • If sales do not grow, motivate sellers with bonuses for this product.
  • If, one week before the expiration date, more yogurts remain than forecast, activate the “buy 3 for the price of one” promotion.
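
The rules above could be written down roughly like this. It is a sketch under assumptions: the days are counted from the moment sales fall below the target, the field and action names are hypothetical, and the customer notification is tied to the discount being switched on.

```python
from dataclasses import dataclass


@dataclass
class BatchState:
    days_below_target: int  # days in a row with sales under the daily target
    sales_growing: bool     # did sales pick up after the previous actions?
    days_to_expiry: int
    stock: int              # units still on hand
    forecast_stock: int     # units the forecast expected to remain by now


def clearance_actions(state):
    """Return the list of actions triggered by the current state of a batch."""
    actions = []
    if state.days_below_target >= 3:
        actions.append("move_higher_on_shelf")
    if state.days_below_target >= 8:
        actions.append("discount_20_percent")
        actions.append("notify_loyal_customers")
        if not state.sales_growing:
            actions.append("seller_bonus")
    if state.days_to_expiry <= 7 and state.stock > state.forecast_stock:
        actions.append("three_for_one_promo")
    return actions


early = BatchState(days_below_target=4, sales_growing=False,
                   days_to_expiry=20, stock=100, forecast_stock=120)
late = BatchState(days_below_target=9, sales_growing=False,
                  days_to_expiry=6, stock=150, forecast_stock=100)
print(clearance_actions(early))  # -> ['move_higher_on_shelf']
print(clearance_actions(late))
```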

Of course, such things are calculated in real time, with the conditions checked before each action, but I think the general idea is clear.

Or there can be a very minor incident: a loyal customer has stopped buying milk. He has not bought it in the store for 2 months, and that is that (you can see it from the loyalty card and the receipts). In this case, you can give him a one-time 15% discount on dairy products and notify him about it.
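
A minimal sketch of how such customers could be picked out of loyalty card data. The tuple format, the function name and the 60-day gap are assumptions for the illustration; a customer counts only if he still shops in the store but has not bought the category within the gap.

```python
from datetime import date, timedelta


def lapsed_buyers(purchases, category, today, gap_days=60):
    """purchases: iterable of (card_id, category, purchase_date).
    Returns cards that used to buy `category`, have not bought it for
    `gap_days`, but still made some purchase within that period."""
    cutoff = today - timedelta(days=gap_days)
    last_in_category = {}
    still_active = set()
    for card_id, cat, day in purchases:
        if day >= cutoff:
            still_active.add(card_id)
        if cat == category and day > last_in_category.get(card_id, date.min):
            last_in_category[card_id] = day
    return sorted(card for card, day in last_in_category.items()
                  if day < cutoff and card in still_active)


# Hypothetical history: c1 still shops but dropped milk, c2 still buys milk,
# c3 has stopped coming altogether.
history = [
    ("c1", "milk", date(2024, 1, 5)),
    ("c1", "bread", date(2024, 3, 20)),
    ("c2", "milk", date(2024, 3, 18)),
    ("c3", "milk", date(2024, 1, 2)),
]
print(lapsed_buyers(history, "milk", today=date(2024, 3, 25)))  # -> ['c1']
```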

Knowledge base

Business processes can be scaled. Here is another example: beer sales rose quite significantly in one store. With BI we found the reason: the owner had installed a chicken grill. The grill took up some floor space, but it paid for itself and increased sales of the neighboring category. It makes sense to quantify this mechanic and put it into the knowledge base: when designing the next store, you will know exactly how to use the floor space optimally and what equipment to install. In a retail chain, the rule is entered as “look, we took this control action and our sales went up”, and the system lets you evaluate all the parameters and transfer the experience to other stores.

How is this implemented?

All the mathematical apparatus lives in our cloud. The complexity of implementing BI systems usually lies in data collection and in the incredibly complex integration of it all. We made this as simple as possible: there are connectors to all the commonly used systems, such as 1C, which simply extract the necessary source data and send it for processing. And there is the mathematical apparatus, which is configured for the tasks of the business. Working with it resembles writing well-formed database queries.

Here is the data path:

  • Data is delivered via e-mail, FTP or web services (a store uploads its data as XML to FTP or sends it by mail, or a web service is opened for data exchange).
  • An automated process checks incoming data and fills the storage.
  • The business layer of the system filters, aggregates and visualizes data and generates reports.
  • Reports hosted on a secure web server are available for viewing by a business user.
  • Data from distributors is brought to the standard of a single NSI (master reference data).
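
The first steps of this path (receiving XML, checking it, filling the storage and aggregating) can be sketched like this. The XML layout, the validation rule and all the names are invented for the illustration; the real exchange format is not described here.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# A made-up example of what a store might upload.
SAMPLE = """<receipts store="S1">
  <receipt id="1"><line sku="MILK" qty="2" price="60.00"/></receipt>
  <receipt id="2">
    <line sku="MILK" qty="1" price="60.00"/>
    <line sku="BREAD" qty="-1" price="35.00"/>
  </receipt>
</receipts>"""


def load_lines(xml_text):
    """Parse receipt lines and split them into valid and rejected rows."""
    root = ET.fromstring(xml_text)
    store = root.get("store")
    valid, rejected = [], []
    for receipt in root.iter("receipt"):
        for line in receipt.iter("line"):
            row = {
                "store": store,
                "receipt_id": receipt.get("id"),
                "sku": line.get("sku"),
                "qty": int(line.get("qty")),
                "price": float(line.get("price")),
            }
            # Simple sanity check: positive quantity, non-negative price.
            is_ok = row["qty"] > 0 and row["price"] >= 0
            (valid if is_ok else rejected).append(row)
    return valid, rejected


def revenue_by_sku(rows):
    """Aggregate revenue per SKU from validated rows."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["sku"]] += row["qty"] * row["price"]
    return dict(totals)


valid_rows, rejected_rows = load_lines(SAMPLE)
print(revenue_by_sku(valid_rows))  # -> {'MILK': 180.0}
print(len(rejected_rows))          # -> 1
```

Rejected rows are kept rather than dropped so that someone can look at why a store sent them.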

How does it work? Well, for example, we provide tools for cleaning data, but we do not clean the data ourselves. We provide a data collection tool, but we do not collect the data ourselves. We provide a tool for storing data and we do store the data; the customer connects his own system and can use this data. The structure is the same for everyone.

There is a set of ready-made modules. We have a tool for creating fixed reports, a tool for creating automated scheduled exports, a tool for sending by mail: a tool for any question. If there is no tool and the customer says, “make us a super-duper beautiful report”, then yes, we will make the report, but we will give the module to all customers. Need a model? No problem: we will build a template of the model. We will not tune its parameters, but we will teach it the basic actions. Regular recalculation of the model is done by the customer himself. For example, if a customer has new conditions for a marketing campaign and the model does not cover one of them, we will extend it. But if you need to swap one product for another, the customer can easily do it himself.

Solved business tasks:

  • GUI for users far from the console
  • Top-down and bottom-up analysis
  • Simultaneous analysis of large volumes of information: over large time periods and comparison of aggregated data.
  • Control in accordance with the main management tasks: comprehensive data analysis, provision of routine reporting, flexible display of key indicators (KPI).

Solved technical problems:

  • Data cleansing
  • Data aggregation
  • Parallel data processing
  • Create and manage metadata
  • Data flow monitoring
  • Flexible regulation of information interaction with sources
  • The ability to audit the procedures and user work
  • Combining data from different distributors
  • Maintaining uniform NSI
  • Managing attribute composition of reference information
  • The ability to connect an unlimited number of directories
  • Protection of data transformation rules and connection parameters from unauthorized persons
  • Integration with the existing security system
  • Access to transformations and tasks based on security policies
  • Protection of database user credentials
  • Distribution of roles
  • Monitoring the integrity of data during storage, updating and transmission over communication channels
  • Regular backups

In general, a lot is possible. So far the most popular services are data collection, storage, cleaning, consolidation of directories, reporting, and mobile CRM. All of this is done easily and quickly, well, as easily and quickly as anything can be in Big Data. Realistically, you can feel the impact of BI on a business within two months.


The base price depends on the amount of storage we use. Then comes the number of channels to the data collection points (that is, the number of branches or suppliers). Then setup work: someone has to visit the sites and connect them. Other metrics, such as the number of users and their activity, the number of merchandisers and supervisors, and the number of mathematical models, also affect the price, but they are not the basic parameters. If you need a figure for your exact situation, you can ask me in a PM or by mail.

Writing a report with all its logic takes us no more than 10 days for the most complex one and about 5 hours for the simplest. After that it just has to be computed. Data visualization is a separate matter: many customers need visual, almost infographic-style reports.

The main task so far is processing Big Data, for example, the results of a one-off six-month export on the topic “what to buy for the New Year”. We have a good case, just like in a washing powder advertisement. In our tests, an MPP DBMS on five quite ordinary office PCs showed performance comparable to an SMP DBMS on a high-end server (to be more precise, 10 days on SMP versus 2 hours on MPP, without optimizing the latter). There should be no problems with the New Year sales peak, nor with nervous users who request the same report once a minute and launch the model just in case.

In general, the companies that are our potential customers understand the need for such a tool themselves. It pays for itself in about 7-9 months. After all, even half a percent of additional margin or savings can mean millions of rubles per day (if we are talking about large retail, of course). In other words, the service helps to both earn and save.


%username%, a system like this is watching you in a number of stores and perhaps already knows more about you than Google does. Purely for your convenience, of course; otherwise you would forget to buy milk :)
But seriously, modern analytics can already do so much that it is frightening in places. If the topic is interesting, I advise you to look at some examples at these links: