Backup - From a SysAdmin Perspective

Backup is big topic that brings a lot of opinions and emotion with it. I hope to explore some of these in the remainder of this publication. Before I start, lets get a preview of what is coming:

What is a backup/backing up?
Management's Perspective
Customer's Perspective
SysAdmin's Perspective
Attributes of a Useful Backup System/Plan
Some Different Approaches
Rsync incremental Backup
Dirvish
Future Enhancements
Links

Things to add:

scalability (if you need more than 4 x 4TB), you probably will run out of bandwidth/backup window. Scale by adding servers
sample yaml files

What is a backup/backing up?

Basically everybody knows what a backup is. Specifically is an entirely different matter. One definition might be:

The ability to reproduce data/information/system/service/environment regardless of state of the existing data/information/system/service/environment.

That is a pretty good definition that avoids many of the pitfalls of most backup definitions. First, it tries to exhaustively list all possible things that you might need to backup. Secondly, it carefully avoids describing the mechanism of reproduction (which would imply the nature of the actual reproduction process/data).

This avoidance was intentional. Most people think of a backup as a copy. This is not necessary. Many things can be reproduced without making a physical copy. You can produce a clone of a Linux system by performing an appropriate software. If the system has any customization since install, you can document or automate the steps to reproduce the changes. Finally, the data that is truly unique must come from somewhere. That somewhere could be a copy. It could also be a process/service that recreates it (example: a process could query an ldap server with a list of ids and reproduce an addressbook).

Based on this approach, backing up is the process that ensures the ability to reproduce data and or an environment as it was. If that environment was a web server, you would minimally need the following:

A way to get the base OS on hardware
A way to add the necessary software
A way to configure the machine (configure apache)
A way to reproduce the data the web server was serving

This could be simple. You could backup the machine by dding the entire hard drive to another drive. Recovery would be dding the backup image to the new machine.

It can be more complicated. It can include customized installation (say kickstart install). The next step would be to install additional software and to configure that software for this specific instance. The last step would be restoring content via an appropriate process. This last step brings to light some interesting questions that will be be addressed later.

Management's Perspective

Management's perspective will start out that backing up is what you do so that you can recover in the case of catastrophic failure. Over time it will be refined.

Sometimes the refinement will take the constraint form: Can you reduce backups? This form come about for reasons such as the following:

The downtime for backup is impacting business
The cost of backup (media/systems/personnel/...) is becoming expensive
Keeping lots of copies is just a waste
Simply backup changes -- quit doing wasteful full backups

At other times, management will want an increase in backups:

We need a year's worth of backups -- not just 3 months
We need hourly backups -- not just daily
We need complete recoverability

Regardless of the current direction, the management always assumes that the SysAdmin should be able to restore a lost process/data/server on demand in a short period of time. Management will assume all necessary hardware, software and expertise exists and can be used at a moments notice.

Customer's Perspective

Most likely, the customer will be oblivious to backup. At least until the system crashes or until they delete something they should not have. Suddenly, the user expects the magic backup to reproduce the document he was just editing and accidentally deleted. And he wants it like it was a few minutes ago -- not how it was yesterday.

SysAdmin's Perspective

The SysAdmin's perspective will be a lot like the management's perspective -- sort of. The admin will not be any happier than the boss about the time, effort, and resources that backup consumes. The admin will want a simple backup system that can reproduce any needed item without needing a lot care and feeding.

In particular, the admin wants an easy to use system that just does what is necessary. The backing up side is okay and almost any system will work.

The recovery side is the one that often causes an admin to get frustrated. You use the backup side fairly frequently (every time you add a new system or change a system). If you are lucky, you rarely have to recover data. The down side is trying to properly recover data (dealing with arcane commands in a syntax that is often very different form what you are used to).

Many SysAdmins have a little pit of fear (or at least a little bit of uncomfortableness) when facing the prospect of recovering information. Some of this comes from the stress of the situation. Some of it comes from the buried realization that a little more time and effort in backing up woudl have made recovery a lot easier. Trying to figure out a cryptic recovery process in times of stress can be overwhelming.

Attributes of a Useful Backup System/Plan

A backup system/plan should have many attributes. Here are a few that come to mind:

Easy to configure
Easy to backup with
Minimally disruptive
Adaptable to different conditions
Low overhead
Easy to monitor
VERY EASY TO USE WHEN RECOVERY TIME COMES

If a backup system is hard to configure it will be misconfigured, or maybe not even used. Your backup system needs to be as similar to your day to day life as possible so that you can use it stress free.

The harder it is to do a backup, the less likely that you will have the backup you need.

If the backup process causes undue downtime or disrupts the business flow, it will not be done properly. Shortcuts will be taken that will undermine its effectiveness.

If the backup system is not flexible, it will not support all of the various things that you need backed up. Ideally it should backup database servers just as easily and happily as it does user desktops. Lack of flexibility leads to complexity. Complexity leads to mistakes and inadequate backups.

Most SysAdmins do not have the luxury of having a completely duplicate hardware environment for backup (not that this is easy to do). Economics matter. The backup system needs to be able to adjust to different economies without becoming inadequate. If the cost of backup is high, you will backup less. Backup needs to be inexpensive enough that you can backup up everything. Every time you decide what not to backup you risk not backing something important up.

Monitoring of backups is often forgotten. If you are not checking every backup every day you are asking to discover that your backup failed at the worst possible time (when you are trying to recover).

Recovery is when a backup system proves its worth. You should be able to easy locate the file(s) you want and easily restore them to a system. To me, this means not having to use a recover command I am unfamiliar with.

Some Different Approaches

How would you replace a dead server? One solution might be the following:

Acquire hardware
Install base OS (maybe using kickstart to get close)
Add necessary additional packages
Customize
Restore data

Step one is beyond the scope of this talk.

Step two might be an installation CD/DVD or a cobbler server for CentOS.

Adding additional packages can be a pain. Knowing what to install is very important. Note: make sure backup procedure includes package list. Good config management can help here (I think some Salt may be in order).

Customizing a machine can be killer. Configuring authentication, configuring iptables, configuring databases, configuring apache, tuning, ... Once again, good configuration management is your friend.

Now it is time to get the actual data back. How do you restore just the data and not mess up all that config you just did? How do you confirm your config is correct?

Wouldn't it be nice if your backup was just a big directory tree and you could wander through parallel trees until you could find what you wanted? Wouldn't be nice if restoring was just an rsync or copy a tarball away?

Wouldn't be nice if your backups were all in Time Machine and you could simply find what you want and then drag it back?

Rsync incremental Backup

Rsync is a tool for synchronizing two things (files, directory trees, ...). Simple operation of rsync causes a source to be examined and compared to a destination. New files that are discovered are replicated. If you choose the delete option, files in the destination that are not in the source are removed. Rsync is that simple.

Unfortunately, to accomplish the simple idea illustrated above, rsync has a LOT of options and flags. An rsync guru almost looks like a magician. The good news is we do not need to worry about this aspect.

Rsync is being brought up for a couple of reasons. It is the basis of the magical Time Machine backup process Apple has implemented (the sort of dream alluded to earlier). Rsync is the underlying tool Dirvish uses to perform its backup tasks.

Most importantly, we need to talk about a special feature that rsync has. Normally, rsync merely clones a tree. If you made a complete copy of your system every time as your backup procedure you would need a lot of storage, and a lot of bandwidth.

Rsync has the concept of a base tree. When you clone source to destination, you can refer to a base tree. Rsync will compare the source to the base. Any files on the source that are not in the base will be cloned as you would expect. Any file that is in both and that has not been changed, Rsync will make a hard link between the base tree and the destination tree. This means that if you have 90 days of backup and /home/isaac/resume.txt has not changed in 90 days, there will be copy of the file in the base tree and 89 hard links pointing to that copy. Non-changing files only need a single copy (beware, you need to increase inode count on your destination backup filesystems). Imagine just one copy of every unique file and a complete directory tree that looks like the original for every single backup instance.

That is the promise of rsync. Unfortunately, a lot of work is necessary to achieve this dream.

Dirvish

At this time, I would like to introduce you to Dirvish. Dirvish is Time Machine without the GUI (at least that is how I think of it). Dirvish has many benefits:

It is free, open source
Dirvish is simple (relatively)
Dirvish is based on rsync
Dirvish does not use a database
Dirvish is stable (1.1 released in 2003)
Dirvish can be easy to configure
Dirvish can be easy to maintain
Dirvish can be easy to monitor
Dirvish backups are easy to inspect
Dirvish is relatively fast and lightweight (moving data takes time)

Dirvish Introduction

Free is free. You can look at the source and do whatever you would like to it.

Relatively simple sounds a little like relatively cheap -- it depends on what you are comparing it to. Dirvish is built on top of rsync. Basically you make a clone of the data you want to backup with Dirvish to get started (initializing your bank/vault). Each future backup uses the latest backup as a base image and builds the differential directory tree with hard links for unchanged files.

Since Dirvish is based on rsync, its basic operation is easy to understand. It also means that no magic, complicated code is needed. Dirvish is basically a wrapper for rsync (Poof! -- all the magic smoke just leaked out).

Since the backup is a complete directory tree, the filesystem replaces the database that most backup systems use to keep track of files that have been backed up. Hard links replace the links in the databases to keep track of where the master copy of a file is.

Rsync is a known commodity. It is in heavy use and rarely has a serious bug. Dirvish is old and sees very few changes (it does what it needs to and needs little updating). Mature products are nice.

Normal configuring of Dirvish is pretty simple. You add a vault for a new file system (or directory tree) that you plan to backup. You define any special pre or post scripts (such as a database dump in your pre). Then you install a backup SSH key on the client. I will provide more config info later.

As before, monitoring the operation of a backup system is critical. Dirvish produces a log file of what was backed up for every backup. It also produces a summary listing anything that might have gone awry (such as a file vanishing before rsync has time to copy it or running out disk space or inodes, ...). Monitoring can be as easy as greping all the summary files for keywords and sending yourself a daily e-mail.

Since Dirvish creates a new tree for every backup (think of it as a snapshot), examining a backup is identical to examining a system. You use cd, ls, find, ... You can even let updatedb create file databases so locate will work.

Because Dirvish only copies changed files, it performs at the level the server, the client and the network will allow. Since there is no database, no delays are incurred during backup beyond the overhead of file/hardlink creation in a file system. Dirvish does not need a lot of memory. Lower cost disk can be raided to provide performance gains.

Dirvish Overview

Dirvish has a number of basic concepts that you should understand before you configure it. Here are some of those ideas:

bank
vault
exclude
pre
post
expire

A bank is a collection of backups. It might correspond to a machine or a group of machines. A bank is also the path to where the backup will be written (the first part). A vault is an individual backup (say a file system or a directory tree). A bank will contain one or more vaults. Backups are done one vault at a time. Each vault belongs to a single bank and its snapshot trees live below the bank path.

Exclude is just what you think it is. It is a list of things to not back up. You can have a global exclude and a separate exclude for each bank. Preand post are also reasonably logical. Each bank can have a action/script that is run before (pre) the backup and another action/script that is run after post the backup completes. Pre might be used to do a mysqldump before a backup starts.

Expire is also logical. When you create a bank, you define an expiration time period. Expire may have a system wide default as well as a per bank setting. Expire is done via a cron entry that runs dirvish-expire (a script). dirvish-expire will examine each bank and detect if a snapshot is older than the expiration time limit. If it is, it will delete that snapshot (tree). Any file that only exists in that snapshot is deleted.

Backups are run via cron also. dirvish-runall is a script that performs the snapshot backups for each bank.

Dirvish pits its configuration files in /etc/dirvish. Here are the normal config files:

master.conf - main config file
bank.conf - definitions of each bank
vault.conf - definitions of each vault
scripts - directory where pre and post scripts live

After all the config is setup, you must run a single initial backup for each bank manually. This creates a first base tree for rsync to hardlink from. Once the initial backup is done, the dirvish-runall script in cron will do nightly (or whatever you set) backups.

My Dirvish Implementation

One of the strengths of Dirvish is its simplicity. Simplicity also often means incomplete or not robust. I have found some things lacking in Dirvish. Here are some of the ideas/additions we have done:

Hardware design
Software design
Yaml config file and python autoconfig script
Disk usage script: ndf.bash
File system growth script: increasefs.bash
Hardlink script: hardlink.bash
Monitor backup success/failure: Xymon plugin
Finding files and locate database customization

Here are the specs on my primary Dirvish backup server:

Dual Xeon processors running at 3 GHz
2 GB ram
80 GB IDE system disk
Promise TX4 Sata controller (4 sata 300 ports)
4 Seagate ST3000DM001 3 TB hard drives in a software raid 5 configuration

The Dirvish backup server runs CentOS 6.4. The 4 3 TB drives are joined into a single software raid 5 set. This is then used as the pv for the lvm volume group dataVG which has just over 8 TBs of space). I then create a new LVM for each bank (insuring that one backup cannot prevent another from having disk resources).

One of the drawbacks of Dirvish is the work to add a new backup (it is not hard, but it is a little bit tedious). To ease this process, all the config has been condensed to a single yaml config file. We then have a python script (dirvish-configure) that decodes the yaml and finds any differences between it and the actual Dirvish config and does the updates. If the necessary file system is missing, it will ask you for size and create the file system, update /etc/fstab and mount the file system. Finally, it will print out the Dirvish init command for any defined bank that does not have any snapshots (really useful).

We have a script called ndf.bash that basically does a df. It actually does a space and an inode df and merges them together in one line. It highlights file systems over 80% full in yellow and over 90% full in red. If you pass over as an argument, it only shows file systems that are 80% or more full. Useful when looking for which ones filled up after backups.

We have a script called increasefs.bash that expects two arguments: size and lvm-name. The size is in GBs (decimals allowed). This is the amount the filesystem will be grown. lvm-name is the full path of the logical volume (/dev/dataVG/exampleLV). It then grows the indicated filesystem the requested amount while online.

Dirvish automatically hard links two identical files during the rsync process. But if you move a directory with 100s of files, Dirvish will not know that the files are still the same. We have the script hardlink.bash (needs the rpm hardlink-1.0-10.el6.i686, or some such) that expects a bank/file system path as an argument. It will then look through the entire file system for duplicate files. Each time it finds a dup it will delete one of them and create a hardlink to other copy. Basically it does de-duplication within a single file system.

I use Xymon to monitor systems and services (a future talk in the making). I have a script (check-dirvish.bash) that examines the summary files of each backup and produces a synopsis of the summary into a series of files. The Xymon client plugin looks at the files and produces a single web page summary that highlights any failures.

When Dirvish creates a backup, it will optionally create an index file. This is basically a ls -l of everything that was backed up. Dirvish provides a dirvish-locate script that can use this file to find files (much faster than the find command). This was not good enough for me. At first, I simply let mlocate index everything. The mlocate database got very large and it took forever to rebuild each day. We are currently working on a script (that will be croned) that will update a separate mlocate database for each bank (file system). This will allow locate --database path-to-database to be used for each bank. The script we will cron is called mlocate-db-update.bash.

Future Enhancements

Since a SysAdmin's work is never done, there must be more things to do. Here are a few I can think of:

Daily usage report
Filesystem growth trends (have increasefs.bash log every increase and graph that)
Backup duration graphs
Add a pre script that does the following:
- makes an updated rpm list
- dumps network config
- dds the boot sector of the hard drive
- anything else I can think of that might be useful in restoring a system
Maybe a web frontend for partial restore