Creating a Backup and Disaster Recovery Plan for Your IT System
By EO|DC Member Luke Chung, FMS
As a small business owner, it’s easy to be consumed by day-to-day needs and neglect the importance of disaster recovery and business process continuation (BPC). Not only is it awful to think about, it also requires detailed planning and thought to implement properly.
Over the years, we’ve seen organizations run into problems because of inadequate planning and testing, so here are some pointers we’ve learned:
Disasters Happen
Unfortunately, bad things happen. You may have business insurance to pay for replacing your hardware, but for most organizations the value is in the data. Customer lists and purchase histories, accounting, inventory, operational reports and processes, management decision-making systems, and even simple Word and Excel files can be critical to the success of your organization. Making sure these remain accessible in an emergency is worth planning for before a disaster strikes.
Disasters happen on many levels. Certainly, there’s the chance of your whole building burning down, but the more common situation is hardware failure or human error. All hardware eventually dies, and a hard disk containing critical corporate information could simply fail. Similarly, someone could make a mistake and accidentally delete data or files that then need to be recovered. Such an event may not threaten the whole organization, but it could be expensive and time-consuming to recover from.
Overview
This blog can’t cover all the issues of disaster recovery, but here are some simple steps we’ve implemented in our organization and recommend for others:
- Create an inventory of critical hardware, software and files
- Establish a process for disaster recovery and business continuation
- Define who’s responsible for what
- Test the plan
- Review it annually to make sure changes are incorporated
How much of a disaster you want to prepare for determines the amount of time and effort to expend now. Think of it as an insurance policy.
Document What You Have: Inventory
The first part of a disaster recovery plan is understanding what you have. What hardware do you have, what software is installed on it, and which files need to be where? Dependencies between machines (e.g. a program linked to data on another machine) need to be noted. Additionally, security information such as user names and passwords needs to be documented at the operating system, application and file/database levels.
Locations of original setup programs also need to be identified. Basically, if you need to replace an existing machine, what would you do? This needs to be done for every critical piece of hardware.
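An inventory like this can live in a spreadsheet, but a small structured record makes the gaps easier to spot. The following is a minimal Python sketch, not a prescribed format: the Machine class, its field names, and the two helper functions are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """One critical piece of hardware and what replacing it would require.

    All field names here are illustrative assumptions, not a standard.
    """
    name: str
    software: list = field(default_factory=list)             # installed programs
    installer_locations: dict = field(default_factory=dict)  # program -> where its setup files live
    critical_files: list = field(default_factory=list)       # files that must exist on this machine
    depends_on: list = field(default_factory=list)           # names of machines this one needs


def missing_installers(machine):
    """Return the installed programs that have no recorded setup location."""
    return [p for p in machine.software if p not in machine.installer_locations]


def unknown_dependencies(inventory):
    """Return (machine, dependency) pairs whose dependency is not in the inventory."""
    known = {m.name for m in inventory}
    return [(m.name, d) for m in inventory for d in m.depends_on if d not in known]
```

Running both checks over the full list answers part of the question "if you need to replace this machine, what would you do?" before the machine actually fails.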
Who is Responsible?
As the owner, you may have delegated all the technology issues to others in your organization, but are you sure they understand they’re responsible for disaster recovery? Some people are focused only on hardware, others on application development, others on accounting, etc. It’s important to establish who’s responsible for what should things go wrong.
Making Backups
A disaster recovery process implies recovering from something. That means backups have to exist. There are some files that don’t change or can be found again (commercial software programs) but there are others which may only exist in your organization. It’s critical to have copies of these.
Most organizations will have people save their files on network drives and have an automated backup system take care of that. If that’s what you have, people need to know they must save their files in those locations. Backups can then be run on a regular schedule.
There are also live backups of files as they change. Some people use check-in/check-out version control for files and platforms like SharePoint to do this. We use a hardware appliance from SonicWall that manages all files on the network and makes multiple copies of each file as it changes, without the user needing to do anything differently. This has proven extremely helpful when people mistakenly change a spreadsheet or PowerPoint presentation and want a previous copy. We’ve set it to give us 5 copies of each file.
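The idea of keeping a fixed number of previous copies of a file can be sketched in a few lines of Python. This is only an illustration of the concept and says nothing about how the appliance works internally. The function name save_version, the ".vN" naming scheme, and the versions folder are assumptions made for the example.

```python
import shutil
from pathlib import Path


def save_version(source, versions_dir, keep=5):
    """Copy source into versions_dir as '<name>.vN' and keep only the newest `keep` copies.

    The '.vN' suffix is an illustrative naming choice, not a standard.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    source, versions_dir = Path(source), Path(versions_dir)
    versions_dir.mkdir(parents=True, exist_ok=True)
    prefix = source.name + ".v"

    # Collect the version numbers already stored for this file, oldest first.
    existing = sorted(
        int(p.name[len(prefix):])
        for p in versions_dir.iterdir()
        if p.name.startswith(prefix) and p.name[len(prefix):].isdigit()
    )
    next_number = existing[-1] + 1 if existing else 1
    target = versions_dir / f"{prefix}{next_number}"
    shutil.copy2(source, target)  # copy2 also preserves the modification time

    # Delete everything except the newest `keep` versions.
    for number in (existing + [next_number])[:-keep]:
        (versions_dir / f"{prefix}{number}").unlink()
    return target
```

Calling save_version each time a file is saved leaves at most five earlier states to fall back on, which is usually enough to undo a mistaken edit.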
Backup Locations
Where you store your backup may be as critical as creating the backup.
A backup on the same machine (hard disk) as the original does not count!
If the machine or hard disk fails, the backup may be lost too. Store the backup somewhere else such as another network drive or a separate hard disk. Having it only on the same machine is definitely an unacceptable disaster recovery plan.
We use a hard drive docking station (under $50) connected to a PC that runs the backups. We simply insert internal hard disks (now under $80 for 2 TB drives) into it. Every night, a routine runs incremental backups and stores the files on that hard disk. A complete backup is run weekly.
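The difference between a complete and an incremental backup comes down to which files get copied. Here is a minimal Python sketch under stated assumptions: the backup function and its since parameter are hypothetical, "changed" is judged only by file modification time, and the destination must sit outside the source folder. It is not the routine we run, and real backup tools also handle deleted files, open files, and permissions.

```python
import shutil
from pathlib import Path
from typing import Optional


def backup(source, dest, since: Optional[float] = None):
    """Copy files from source to dest, preserving the folder layout.

    since=None copies everything (a complete backup). Otherwise only files
    modified after `since` (seconds since the epoch) are copied (an
    incremental backup). Returns the list of files written.
    """
    source, dest = Path(source), Path(dest)
    copied = []
    for path in sorted(source.rglob("*")):
        if not path.is_file():
            continue
        if since is not None and path.stat().st_mtime <= since:
            continue  # unchanged since the last run, so skip it
        target = dest / path.relative_to(source)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)
        copied.append(target)
    return copied
```

A nightly job would pass the time of the previous run as since, while the weekly job would leave it out to copy everything.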
Have an Off-Site Backup
It’s also important to have your backups available away from your production site. In case you can’t get into your building, or if there’s a fire that destroys everything, you need to recover and run your system elsewhere. This may be as simple as taking a copy of your backups and storing it at home (assuming that doesn’t violate security protocols). Make sure you have all the pieces that you’ve listed in your application inventory offsite.
Consider an Online Backup
You can also take advantage of online storage for backups. This can make it much easier than manually moving disks off-site and may let you keep more current copies offsite. There may be security concerns about another entity hosting your files, so make sure you don’t violate any privacy and security rules by doing this.
If the files aren’t too big, you can zip them up and email them to a free Hotmail, Google or Yahoo account. You can also create a Microsoft hosted Office Live / SkyDrive account where you can store up to 25GB for free. You can even synchronize your desktop to the web site. There are other paid storage services but they seem unnecessary given SkyDrive.
Test the Plan
There are countless stories of disaster recovery plans that were never actually tested. So when a disaster actually struck, people discovered that pieces were missing or backups weren’t being made after all. Recovery instructions can easily get out of date, so it’s important to avoid an unrecoverable failure by periodically testing the plan.
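One small, automatable piece of such a test is confirming that every file in a source folder has an identical copy in the backup. The Python sketch below compares SHA-256 checksums. The function names are hypothetical, and the check is no substitute for actually restoring a system from the backup, which is the real test of the plan.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source, backup_dir):
    """Return a list of problems; an empty list means every source file
    has a byte-identical copy under backup_dir."""
    source, backup_dir = Path(source), Path(backup_dir)
    problems = []
    for path in sorted(source.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(source)
        copy = backup_dir / relative
        if not copy.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif sha256_of(copy) != sha256_of(path):
            problems.append(f"differs: {relative.as_posix()}")
    return problems
```

A file that changed after the backup ran will be reported as differing, so this check is most meaningful right after a backup completes.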
A critical part of a good disaster recovery plan is recording the tasks that are performed and sending notifications when something goes wrong.
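Recording and notification can be wrapped around any maintenance task. The Python sketch below uses the standard logging module for the record and a caller-supplied function for the alert. The name run_recorded and the notify callback are assumptions for illustration; in practice notify might send an email or a text message.

```python
import logging


def run_recorded(name, task, notify, logger=None):
    """Run task(), log its start and outcome, and call notify(message) on failure.

    Returns True if the task completed and False if it raised an exception.
    """
    logger = logger or logging.getLogger("maintenance")
    logger.info("starting %s", name)
    try:
        task()
    except Exception as error:
        # logger.exception records the full traceback alongside the message.
        logger.exception("%s failed", name)
        notify(f"{name} failed: {error}")
        return False
    logger.info("finished %s", name)
    return True
```

The log gives you the history of what ran and when, while the notification makes sure a failure is noticed the same day rather than when the backup is finally needed.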
Maintenance Chores Evolve Over Time
Over time, applications and databases grow, and tasks that once took seconds or minutes can take much longer. A midnight backup task that finishes in an hour today could grow until it is still running when your office opens in the morning, or it could simply fail. It’s important to monitor the processes you’ve established to make sure they are always working, and to review them from time to time.
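A simple way to catch this drift early is to compare when the task finished with when the office opens, and to raise a warning well before the two collide. This Python sketch is only one possible check. The function name, the three status strings, and the two-hour default margin are assumptions chosen for the example.

```python
from datetime import datetime, timedelta


def backup_window_status(finished, office_opens, margin=timedelta(hours=2)):
    """Classify a finished backup run against the start of the workday.

    Returns "overrun" if it ended at or after opening time, "warning" if it
    ended with less than `margin` to spare, and "ok" otherwise.
    """
    if finished >= office_opens:
        return "overrun"
    if office_opens - finished < margin:
        return "warning"
    return "ok"
```

For example, with an 8:00 opening time, a run that ends at 1:00 is "ok", one that ends at 6:30 is a "warning", and one that ends at 8:15 is an "overrun".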
Conclusions
Make sure you take the time to review and consider what would happen if disaster strikes. With some planning, both small and large disasters can be mitigated.
Hope this helps. Good luck!
