The right backup in the data center

EMC Avamar at CROC
This hefty cabinet of servers is called EMC Avamar. It stands in our data center and makes backups, and it does so in a very interesting way.

What's inside the cabinet?

Technically, this is a block of x86 servers; there are currently 10 of them. The architecture is as follows: one spare node, one control node, and the remaining 8 nodes store data. Thanks to redundancy (a parity principle similar to a Hamming code, with data spread evenly across a RAIN, a Redundant Array of Independent Nodes), the data survives the failure of any single node, and the spare node takes the place of the failed one. In total, only about 50% of each node holds backup data directly; the other half goes to the spare node, parity, and other data-protection needs. As a result, 200 TB of physical capacity turns into 62.5 TB of usable capacity.
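
As a quick sanity check on those figures, here is a minimal sketch of the arithmetic. It assumes (this is not stated above) that the 200 TB covers all ten nodes and that the nodes are equally sized; the variable names are purely illustrative.

```python
# Figures quoted in the text above.
PHYSICAL_TB = 200.0   # raw capacity of the whole array
USABLE_TB = 62.5      # capacity left for backup data
TOTAL_NODES = 10      # 8 data nodes + 1 spare + 1 control node
DATA_NODES = 8


def usable_fraction(physical_tb: float, usable_tb: float) -> float:
    """Share of raw capacity that ends up holding backup data."""
    return usable_tb / physical_tb


# Assumption: all ten nodes are the same size.
raw_per_node = PHYSICAL_TB / TOTAL_NODES
usable_per_data_node = USABLE_TB / DATA_NODES

print(f"usable fraction: {usable_fraction(PHYSICAL_TB, USABLE_TB):.2%}")
print(f"raw capacity per node: {raw_per_node} TB")
print(f"usable capacity per data node: {usable_per_data_node} TB")
```

In other words, a bit less than a third of the raw disk space ends up holding backups; the rest is the price of mirroring, parity, and the spare node.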

Each node runs SUSE Linux and specialized proprietary software, which is the server part of the system. The nodes are interconnected by internal switches that isolate external backup traffic from internal service traffic.

Each node has 12 disks: 6 hold the primary data and the other 6 mirror them (RAID 1), plus an SSD for the OS.

Where do the backups come from?

The main purpose of EMC Avamar is "hot" backup of production systems from various sources:
  1. From the "cloud" (the "cloud" runs in the same data center; it has its own virtual networks, but you can still back up from them).
  2. From physical servers in other racks.
  3. From virtual machines and physical servers in the customer's own infrastructure, connected either by the customer's own cable into our data center or over the Internet.

What are the benefits of Avamar?

The cabinet's special features are as follows:
1. Deduplication. Data is stored in small blocks, and duplicate data is stored as links to an existing block. Suppose you load 50 text documents that are essentially different versions of the same document, or are built from a single template. During deduplication, the documents are split into a large number of variable-length blocks. Most of these blocks repeat, because each document shares a lot of content with the related ones. Every duplicate block is replaced by a link that is practically "weightless." According to the manufacturer, this can shrink backups by up to 500 times. In practice, our customers see a 15-20x reduction thanks to deduplication.

2. One of the coolest things about this particular hardware and software system is deduplication at the source. That is, when a backup is made from your server, the decision about which pieces actually need to be sent is made not by analyzing data that has already "arrived" at Avamar, but right on the spot, on the hosts themselves. This means that the first backup is 100% of the database volume (for example, 2 TB), while the second, third, and subsequent ones are in practice about 0.1% of that, roughly 2 GB each (in effect, an incremental copy). Backing up a remote office, a huge database, or something similar in a matter of minutes is like a fairy tale.

3. Compatibility with different software. Specifically, with the major operating systems and applications. Why is this needed? Imagine a production database that handles thousands of transactions per minute. If you simply copy it head-on, the database will change between the moment the copy starts and the moment it ends, and stale, inconsistent, and deleted data will end up in the backup. An hour of copying can span a million transactions, and you will get a fine mess of data that cannot be restored even by hand. So you need a software agent that takes a snapshot of the database ("freezes" it for the backup) and copies that snapshot. In addition, the agent compresses the data and encrypts it in transit. A magic cabinet like ours comes with a full range of agents out of the box.
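
To make the first two points more tangible, here is a minimal sketch of variable-length deduplication performed on the source side. It is not Avamar's actual algorithm, which is proprietary: the Gear-style rolling hash, the chunk-size limits, and the names `chunks`, `BackupStore`, `backup`, and `restore` are all assumptions chosen for illustration.

```python
import hashlib
import random

_rng = random.Random(0)                 # fixed seed: every host builds the same table
GEAR = [_rng.getrandbits(32) for _ in range(256)]
MASK = (1 << 6) - 1                     # toy value: roughly one candidate cut per 64 bytes
MIN_CHUNK, MAX_CHUNK = 16, 256          # toy chunk-size limits, in bytes


def chunks(data: bytes):
    """Split data into variable-length chunks at content-defined boundaries."""
    start, h = 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF   # Gear-style rolling hash
        size = i - start + 1
        if (size >= MIN_CHUNK and (h & MASK) == 0) or size >= MAX_CHUNK:
            yield data[start:i + 1]
            start, h = i + 1, 0
    if start < len(data):
        yield data[start:]


class BackupStore:
    """Server side: keeps every unique chunk once, keyed by its SHA-256 digest."""

    def __init__(self):
        self.blocks = {}

    def missing(self, digests):
        """Return the digests the store does not hold yet."""
        return {d for d in digests if d not in self.blocks}

    def put(self, digest, chunk):
        self.blocks[digest] = chunk


def backup(data: bytes, store: BackupStore):
    """Client side: hash locally and send only the chunks the store lacks.

    Returns (recipe, bytes_sent); the recipe is the ordered list of digests.
    """
    parts = [(hashlib.sha256(c).hexdigest(), c) for c in chunks(data)]
    need = store.missing(d for d, _ in parts)
    sent = 0
    for digest, chunk in parts:
        if digest in need:
            store.put(digest, chunk)
            need.discard(digest)        # a chunk repeated in this backup is sent once
            sent += len(chunk)
    return [d for d, _ in parts], sent


def restore(recipe, store: BackupStore) -> bytes:
    """Rebuild the original data from its list of chunk digests."""
    return b"".join(store.blocks[d] for d in recipe)


store = BackupStore()
rnd = random.Random(1)
v1 = bytes(rnd.getrandbits(8) for _ in range(20_000))   # "first version" of a file
v2 = v1[:10_000] + b"a small edit" + v1[10_000:]        # same file with one insertion

recipe1, sent1 = backup(v1, store)
recipe2, sent2 = backup(v2, store)
print(f"first backup sent {sent1} bytes, second backup sent {sent2} bytes")
```

Because the cut points depend on the content rather than on fixed offsets, an insertion in the middle of a file only disturbs the chunks around it, so the second backup transfers a small fraction of the data. The only thing that always crosses the wire is the list of digests.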

Figure: general solution scheme.

What exactly is it compatible with?

System software:
  • Microsoft Windows
  • HP-UX
  • Oracle Solaris
  • Novell NetWare
  • SUSE Linux
  • Red Hat Enterprise Linux
  • Apple Mac OS X
  • FreeBSD
  • VMware ESX / ESXi
Application software:
  • Oracle and Oracle RAC
  • Microsoft SQL Server
  • Microsoft SharePoint
  • Microsoft Exchange
  • IBM DB2
  • IBM Lotus Domino, etc.

How can this be used?

  • Additional backup. Delivered as a service, it acts as a backup of your backup. Given its convenience, the conversion of all capital costs into operating costs, geographic remoteness, and full automation, it is in high demand for storing almost any data.
  • Main backup (with restrictions). This use needs either wide channels or fairly small data volumes; otherwise you have to sacrifice recovery speed (after all, when a backup is restored, 100% of the database travels back over the link, which can take a very long time for remote production systems).
  • Main backup without restrictions. This is a rather unusual solution from CROC. It works like this: your infrastructure runs in your data center, and EMC Avamar is installed in ours. Backups go to it over your regular Internet link. We also place one more server in your data center, a "mini-Avamar" delivered as a virtual appliance. The "little one" synchronizes with its "dad" and keeps the latest copies (the ones most relevant for a rollback). Older copies (rarely needed for a quick restore) are stored at the main site. You do not have to buy the appliance: it is also pay-per-use, so all costs are operating costs. The scheme of the solution is given below.
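
To see why the restriction on the main-backup scenario matters, here is a rough back-of-the-envelope estimate. The 2 TB database and the 0.1% daily change come from the example earlier in the text; the 100 Mbit/s link and the 80% link efficiency are assumptions picked purely for illustration.

```python
def transfer_hours(size_tb: float, link_mbit_s: float, efficiency: float = 0.8) -> float:
    """Hours needed to move size_tb (decimal terabytes) over a link of link_mbit_s Mbit/s.

    efficiency is an assumed share of the nominal bandwidth that is really usable.
    """
    size_bits = size_tb * 1e12 * 8
    usable_bits_per_s = link_mbit_s * 1e6 * efficiency
    return size_bits / usable_bits_per_s / 3600


full_restore = transfer_hours(2.0, 100)      # the whole 2 TB database travels back
daily_backup = transfer_hours(0.002, 100)    # about 0.1% of it after deduplication

print(f"full restore: {full_restore:.1f} h")
print(f"daily backup: {daily_backup * 60:.1f} min")
```

Under these assumptions the daily backup takes minutes, while a full restore over the same link takes more than two days. That gap is exactly what the local "mini-Avamar" closes: the freshest copies are restored from inside your own data center instead of over the Internet link.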

As for the cost of the entire solution at our site: yes, it is really high. But the cost is shared among many customers, and thanks to this "communal" model the price for each individual customer goes down. Customers' data is completely isolated: you see only your own backups.

Figure: screenshot of the customer interface.

Who uses it, and how?

We have some interesting cases. Unfortunately, I can't name the companies, so for now it looks like this:
  • For a commercial company that frequently performs financial transactions, we back up Oracle.
  • A large state-owned company stores backups of virtual machines from its "cloud."
  • An insurance company is testing the solution as its main backup.

Why is it reliable and convenient?

  1. Remote storage. The system sits far from the main infrastructure: in a different machine room or, if your infrastructure is connected to our data center, in a different data center. In any case, it is not where the production machines are deployed. This significantly reduces the risk of losing all your data at once.
  2. Data is stored on disks. EMC Avamar is not the traditional tape library used in such cases but a disk array, so the data is safer.
  3. It is sold as a pay-per-use service, that is, billed per gigabyte stored. At the end of the month, a report on the amount of data in the customer's account is generated, and the amount payable is calculated from it. This is the "cloud approach," and it is convenient: capital expenditures become operating expenses.
  4. Technical support is outsourced. And it is proper integrator support: not a "FAQ line," but working each issue through to a result.

So, if you need a reliable backup, come to us: we have EMC Avamar and cookies.