Best Practices For Backups: Restore Regularly

Hello!

Here’s one of my Golden Rules of Infrastructure:

If you haven’t restored from your backups, you don’t have backups.

It’s common to set up backups, check that the files or images exist and have sizes and contents that look right, then call the job done and move on. I’ve seen this leave plenty of folks with backups they can’t realistically restore from.

The Australian Cyber Security Centre says basically the same thing in their Essential Eight Maturity Model. Their top maturity level for backups has this requirement:

Full restoration of backups is tested at least once when initially implemented and each time fundamental information technology infrastructure changes occur.

100% agree. That’s the best practice I follow.

Here are some examples of failure cases that I’ve seen go undetected in environments with untested backups:

  • Filesystem tools like tar don’t follow symbolic links by default. They archive the link itself, not the file it points to. That means things like this:
    tar -cf backup.tar ./important_folder/

    won’t give you all the files you expect if important_folder contains symlinks to files in other directories. I’ve seen plenty of backups missing critical files that were hidden behind symlinks.

  • Restoring from AWS RDS snapshots creates a new database instance. The new instance won’t be managed by your automation; you have to connect it yourself. If config like endpoints and setup like network access are managed dynamically, which of course they should be, it can take a bunch of hacking to integrate the new DB in a way that isn’t reset the next time deploy automation runs and looks up the dynamic values for the old DB again. Today’s apps usually have tight SLAs, and you won’t have time to develop this process in the middle of an incident. You’ll end up compensating customers for too much downtime.
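If you want to see the symlink case from the first example for yourself, here’s a minimal sketch that reproduces it in a scratch directory. It assumes GNU tar or bsdtar, where -h tells tar to archive what a link points to instead of the link. The elsewhere directory and secret.txt are made-up names for illustration.

```shell
#!/bin/sh
set -eu

# Scratch area with hypothetical names; nothing here touches real data.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# important_folder holds only a symlink to a file that lives elsewhere.
mkdir important_folder elsewhere
echo "critical data" > elsewhere/secret.txt
ln -s ../elsewhere/secret.txt important_folder/secret.txt

# Default behavior: the symlink itself is archived, not its target.
tar -cf backup.tar ./important_folder/
# With -h, tar follows the link and archives the target's contents.
tar -chf backup_deref.tar ./important_folder/

# Restore each archive into its own empty directory, like a real restore.
mkdir restore_default restore_deref
tar -xf backup.tar -C restore_default
tar -xf backup_deref.tar -C restore_deref

# The default restore contains a link whose target was never archived.
restored=restore_default/important_folder/secret.txt
if [ -L "$restored" ] && [ ! -e "$restored" ]; then
    default_result="dangling symlink"
else
    default_result="file present"
fi

# The -h restore contains a regular file with the real contents.
deref_result=$(cat restore_deref/important_folder/secret.txt)

echo "default archive: $default_result"
echo "archive made with -h: $deref_result"
```

Both archives exist and have plausible sizes, which is why this kind of gap is easy to miss until you actually restore. Whether -h is the right fix depends on your data; sometimes the better answer is to back up the link targets directly.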

It’s not enough to examine backup files. You have to go through the exercise of restoring from them and validating that the restored system passes all the same tests you’d run before deploying a new production environment.
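Here’s a rough sketch of what a tiny restore drill can look like for a file-based backup. The restore_and_verify helper and the app_data files are hypothetical, and the diff against the source directory is only a stand-in for your real pre-deploy tests. The shape is what matters: restore into a clean location, then run checks against what was restored.

```shell
#!/bin/sh
set -eu

# restore_and_verify ARCHIVE SOURCE_DIR
# Hypothetical helper: extracts ARCHIVE into a fresh scratch directory and
# compares the restored tree with SOURCE_DIR. Returns non-zero if the
# extraction fails or the trees differ.
restore_and_verify() {
    archive=$1
    source_dir=$2
    scratch=$(mktemp -d)
    rc=0
    tar -xf "$archive" -C "$scratch" || rc=1
    if [ "$rc" -eq 0 ]; then
        # Stand-in check; a real drill would run your deployment tests here.
        diff -r "$source_dir" "$scratch/$(basename "$source_dir")" >/dev/null || rc=1
    fi
    rm -rf "$scratch"
    return "$rc"
}

# Demo data in a throwaway directory.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir "$demo/app_data"
echo "orders" > "$demo/app_data/orders.csv"

# A backup taken before customers.csv existed, so it is incomplete.
tar -cf "$demo/stale.tar" -C "$demo" app_data
echo "customers" > "$demo/app_data/customers.csv"
# A backup taken after, containing everything.
tar -cf "$demo/fresh.tar" -C "$demo" app_data

if restore_and_verify "$demo/fresh.tar" "$demo/app_data"; then fresh_ok=yes; else fresh_ok=no; fi
if restore_and_verify "$demo/stale.tar" "$demo/app_data"; then stale_ok=yes; else stale_ok=no; fi

echo "fresh backup passes the drill: $fresh_ok"
echo "stale backup passes the drill: $stale_ok"
```

Both archives would pass a “does the file exist and look about the right size” check. Only the restore tells them apart.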

Happy automating!

Adam

Need more than just this article? I’m available to consult.
