Most CIOs at enterprise-level companies are in on the dirty little secret of disaster recovery (DR) testing: The traditional DR test method is outliving its usefulness. The complexity of today's environments makes true simulation of recovery from a disaster quite difficult.
CIOs aren't abandoning the method -- there are as yet few alternatives -- but analysts say they would be wise to incrementally increase the scope of testing and look to tools to monitor software configuration changes to increase effectiveness.
"I wouldn't say the classic model is becoming obsolete quickly, but it is becoming obsolete [due to the complexity issue]," said Mike Summers, managing director for the disaster recovery practice at Computer Sciences Corp.
The classic disaster recovery testing model doesn't fit today's complex IT environments with multiple operating systems, diverse Web applications and multiple business partners, said John Morency, vice president and research director at Gartner Inc., a Stamford, Conn.-based IT consulting firm.
A truly comprehensive disaster recovery test plan would include business partners' response times -- for example, how quickly those in your supply chain must get their virtual private network and shared applications back. It should also include Software as a Service (SaaS) suppliers' contingency plans to recover their applications, if affected.
But it's unlikely that a company could perform an effective test touching all of those entities. "For example, a financial services organization might have 10,000 partners, which far outstrips the ability of most companies for DR testing," Morency said. There are too many related applications and too many applications, period, he added.
"The question is, can you test every business process and every application and all of the related data in the time frame of 24, 48 or 72 hours?" For most companies the answer is no, Morency said.
So to keep testing relevant, organizations must expand the scope of their testing on their own. "Each time you do a test, you add on additional scope, additional applications," Summers said.
Data dependencies create data recovery issues
With complexity comes interdependencies, posing a whole other set of test challenges. "The environments are getting more and more complex. It's a lot trickier to ensure that every application will come back up the way you want it to," said Bob Laliberte, an analyst at The Enterprise Strategy Group Inc., an IT consulting firm in Milford, Mass.
The concern about being able to bring applications back up and have them work the way they did before either a DR test or an actual event is very real, Summers agreed.
"What worries me is that companies with a hundred applications are building disaster recovery plans for 20% to 30% of them. These plans often call for the rest of the applications to be recovered on a 'best effort' basis --perhaps in a few weeks -- and this is where the issue of data dependencies can become very important," he said. "The problem is you could have one of these applications out there that seems unimportant but maybe it has an important link to a mission-critical application that you don't know about."
These links between applications are known as data dependencies, and classic disaster recovery test planning isn't designed to evaluate these underlying data connections. The best way to keep track of data dependencies is to keep up with application configuration changes in real time. "Every time there's a change, you get an email," Laliberte said.
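The risk Summers describes, a seemingly minor application that quietly feeds a mission-critical one, amounts to a reachability question over a dependency graph. The following is a minimal sketch of that check; all application names and the `DATA_FEEDS` inventory are invented for illustration, not drawn from any real environment or product.

```python
from collections import deque

# Hypothetical inventory: each app maps to the apps it feeds data to.
DATA_FEEDS = {
    "legacy-rates-db": ["pricing-engine"],  # looks minor, but feeds downstream
    "pricing-engine": ["order-entry"],
    "hr-portal": [],
    "order-entry": [],
}

CRITICAL = {"order-entry"}                       # covered by the formal DR plan
BEST_EFFORT = {"legacy-rates-db", "hr-portal"}   # recovered "best effort"

def hidden_critical_dependencies(feeds, critical, best_effort):
    """Return best-effort apps whose data reaches any critical app."""
    flagged = set()
    for app in best_effort:
        # Breadth-first walk downstream from this app.
        queue = deque(feeds.get(app, []))
        seen = set()
        while queue:
            nxt = queue.popleft()
            if nxt in seen:
                continue
            seen.add(nxt)
            if nxt in critical:
                # This "unimportant" app belongs in the DR plan after all.
                flagged.add(app)
                break
            queue.extend(feeds.get(nxt, []))
    return flagged

print(hidden_critical_dependencies(DATA_FEEDS, CRITICAL, BEST_EFFORT))
# legacy-rates-db is flagged: it reaches order-entry via pricing-engine
```

In this toy inventory, `hr-portal` feeds nothing and is safely best-effort, while `legacy-rates-db` turns out to sit two hops upstream of a critical system, exactly the hidden link classic test planning tends to miss.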
This sort of product has been available in the storage arena for a while, he noted. But there are also emerging products aimed at tracking system changes that will affect a disaster recovery test. These "dashboard" monitoring tools use remote procedure calls to monitor applications for configuration changes. These configuration changes are then documented for disaster recovery purposes.
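At its core, the monitoring these dashboard tools perform is a diff between configuration snapshots taken over time. The sketch below shows that idea in miniature, assuming in-memory snapshots; real products poll live systems (for example, via remote procedure calls), and the setting names here are made up.

```python
def diff_config(baseline, current):
    """Return (added, removed, changed) settings between two snapshots."""
    added = {k: current[k] for k in current.keys() - baseline.keys()}
    removed = {k: baseline[k] for k in baseline.keys() - current.keys()}
    changed = {k: (baseline[k], current[k])
               for k in baseline.keys() & current.keys()
               if baseline[k] != current[k]}
    return added, removed, changed

# Hypothetical snapshots: the DR replica mapping has silently disappeared.
baseline = {"db_host": "prod-db-1", "dr_replica": "dr-db-1", "timeout": 30}
current = {"db_host": "prod-db-2", "timeout": 30}

added, removed, changed = diff_config(baseline, current)
for key in removed:
    # A real tool would send the email alert Laliberte describes.
    print(f"ALERT: setting '{key}' removed since last DR snapshot")
for key, (old, new) in changed.items():
    print(f"ALERT: '{key}' changed from {old!r} to {new!r}")
```

The removed `dr_replica` entry is the interesting case: production keeps working, so nothing obvious breaks, but the next DR test (or disaster) would reveal the gap.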
For large companies such as Northeast Utilities, an electric power supplier with 1.7 million customers in New England, this type of tool has been quite valuable. The company has 6,000 employees, including an IT force of 300.
"We began to accelerate our disaster recovery planning in the late 1990s, and we began by getting all of our processes in line -- putting the infrastructure in place to do disaster recovery," said Ed Goldberg, business continuity and disaster recovery coordinator at Berlin, Conn.-based Northeast Utilities. "We realized as we went that there were quite a few things that we can't test. We can't take this down, because if we do it affects this and it affects that, too.
"Or maybe you've changed something on a database that no longer relates to disaster recovery. It works in the production environment, but that data is no longer related correctly to the disaster recovery system," Goldberg added.
To keep up with changes, the company uses Continuity Software Inc.'s RecoverGuard product, licensed as SaaS, to track system changes that might trip up disaster recovery testing. "Once a month we get a report and every night they check to see if anything has changed that might be critical. We can either tell them, 'We didn't know that and we care,' or 'We don't care and just ignore that,'" Goldberg said.
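The "we care" versus "ignore" triage Goldberg describes can be sketched as a simple filter over nightly findings. This is an illustrative mock-up of that workflow, not how RecoverGuard itself works; the setting names and ignore list are invented.

```python
# Settings whose changes this (hypothetical) team has decided are harmless.
IGNORE_LIST = {"log_level", "banner_text"}

def triage(findings, ignore_list):
    """Split detected changes into ones to review and ones to ignore."""
    review = [f for f in findings if f["setting"] not in ignore_list]
    ignored = [f for f in findings if f["setting"] in ignore_list]
    return review, ignored

nightly = [
    {"setting": "replica_target", "detail": "DR replica unmapped"},
    {"setting": "log_level", "detail": "DEBUG -> INFO"},
]

review, ignored = triage(nightly, IGNORE_LIST)
print([f["setting"] for f in review])   # ['replica_target']
```

Each "we don't care" decision grows the ignore list, so over time the nightly report narrows to changes that genuinely threaten a recovery.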
Keeping up with application configuration changes for disaster recovery testing is definitely a step in the right direction, Laliberte said. "Continuity [Software] is somewhat unique in this space," he added, though there are similar products that concentrate on specific storage-focused change monitoring. Symantec Corp.'s Fire Drill storage backup change monitoring software is one example. "There's definitely a lot of attention being paid to this [product] space. The bottom line is, how do you protect your investment?" he said.
Let us know what you think about the story; email: Sarah Varney, Technology Editor