In reviewing the attached report, “Many Firms are Overconfident in their Disaster Recovery Ability” from Forrester Research, one of the responses really stood out to me. When asked “What was the cause of your most significant disaster declaration or major business disruption?” 17 percent cited “Not knowing when to ‘declare’ a disaster and execute a recovery.” This answer ranked second in total responses, with 28 percent listing it as one of their top three reasons, but it ranked first as the most significant reason.
The implication is that you can have the most detail-oriented, well-thought-out disaster recovery plan possible, but if you don’t know when to put it into action, it does little good.
At first glance, this sounds easy enough. You execute your disaster recovery plan when you… dramatic pause… have a disaster. And if your organization falls victim to a major disaster, be it an act of nature or a human error, such as the old “fiber-seeking backhoe” cutting connectivity to a major data center, you will clearly need to crack open your DR plan.
But the reality is, these types of epic, force majeure disasters are far less common than the more subtle enemies of business continuity, such as hardware failure, general power loss, data corruption, human error and even malicious insider activity.
[Side note: I blame a lot of this confusion on the early days of backup and recovery and the overuse of the word “disaster.” I have no doubt it was originally chosen to stir up fear and urgency in order to sell equipment, rather than to instill a sense of importance in the data backup and recovery concept, which was then in its infancy. But that’s just my opinion.]
So when do you execute your DR plan? As with most topics in this industry, there is no cut-and-dried rule to cover every organization and scenario, but there are some things to consider.
Understand the types of “disasters”
Again, in the world of data protection, not every “disaster” is something covered by the Weather Channel. Power failures and equipment (hardware, software, network) failures account for the majority of downtime, according to our own disaster recovery research. While proper attention, maintenance and planning can limit these types of failures to begin with, identifying them when they do happen is crucial.
Define what failure means to you
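As part of your DR plan, you will surely cover what fault tolerances you have for specific applications, sites and data and how they will all be protected. Apply those same tolerances to the “when” aspect of the plan as well: once an outage exceeds, or is clearly about to exceed, what a given application can tolerate, that is your cue to declare.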
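To make the “when” concrete, here is a minimal sketch of a declaration rule, not a prescribed method. It assumes each application has a single tolerance expressed as a maximum outage in minutes, and that someone can give a rough estimate of the remaining repair time. The names `Tolerance` and `should_declare` and the figures used are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tolerance:
    """Hypothetical per-application fault tolerance taken from a DR plan."""
    app: str
    max_outage_minutes: float  # longest outage the business can absorb


def should_declare(tolerance: Tolerance,
                   outage_minutes: float,
                   estimated_fix_minutes: float) -> bool:
    """Return True when the outage so far plus the estimated repair time
    would exceed the application's tolerance (an assumed rule of thumb)."""
    projected = outage_minutes + estimated_fix_minutes
    return projected > tolerance.max_outage_minutes


# Illustrative figures only: a 60-minute tolerance, 20 minutes already down.
billing = Tolerance(app="billing", max_outage_minutes=60)
print(should_declare(billing, outage_minutes=20, estimated_fix_minutes=30))
print(should_declare(billing, outage_minutes=20, estimated_fix_minutes=50))
```

The first call stays under the tolerance and the second goes over it. The value of writing the rule down is that the decision to declare is agreed in advance rather than debated while the clock is running.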
Verify data integrity
One of the most common causes of data loss and failure is data corruption. A data corruption outage occurs when a faulty hardware or software component causes corrupt data to be read from or written to the database. Corruption can take many forms, and it can be widespread or localized. If you check your data frequently to ensure integrity, you may be able to contain corruption before it becomes a disaster and a trigger for your DR plan.
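As one illustration of a routine integrity check, the sketch below compares files against a previously recorded manifest of SHA-256 checksums. It assumes the protected data is a set of files that should not change between checks. The manifest layout and the function names `sha256_of` and `find_corrupt` are hypothetical. A check like this catches file-level changes only. Corruption inside a live database generally needs the database's own verification tools.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_corrupt(manifest: dict[str, str], root: Path) -> list[str]:
    """Return the relative paths that are missing or whose current hash
    differs from the hash recorded in the manifest."""
    bad = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            bad.append(rel_path)
    return bad
```

Run on a schedule, a non-empty result from `find_corrupt` is an early warning: it points at the affected files while the problem may still be localized.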
Lastly, and this can be a matter of personal opinion, try to limit hesitation. In most cases, it is better to trigger your disaster recovery plan too early than too late. Some may argue that triggering your plan too early could exacerbate a problem, such as in the case of corrupted data, and send resources scrambling unnecessarily. But the alternative, namely waiting too long, can result in longer downtime and greater disruption.
Disasters happen, be they big or small. But your reaction is what will determine the impact on your organization. In most cases, no one can blame you for events that are out of your control, but the response is up to you, and so is whether or not you are the hero.