Of all the tasks associated with disaster recovery (DR) planning, one of the most important is the classification of a company's applications.
Classification is part of the business impact analysis (BIA), which is the foundation of any DR plan. Questions such as "Which applications should be taken down first, considering underlying data dependencies?" and "How many DR dollars should we spend to keep this data safe and/or easily accessible?" must be answered as part of the planning process.
Midmarket companies often use a BIA questionnaire to get started. These are mainly "yes" and "no" checklists that specify the hardware and software in operation and also capture information on who uses which applications. They can be a good starting place, but they have limitations, said Mike Summers, disaster recovery practice manager at Computer Sciences Corp.
"Using BIA questionnaires is a balancing act between art and science. Everybody thinks their application is the most important. We interview them and ask, 'The application went down a few weeks ago and what happened?' The answer is usually quite revealing in that it may not have been that big a deal," he noted.
Whichever applications are tagged as critical, planning is everything, Summers stressed. "You do your planning up front. You don't want to have to think about what applications to bring down in what order. You need a footprint to work from," he said. At most companies, business and IT managers draw up a list of critical applications together, sometimes using a questionnaire as a starting point.
Ideally, IT and business managers have similar views on which applications are truly critical, and many do, according to a 2007 Gartner Inc. study of nearly 600 IT managers and business managers.
In the study, both groups listed email, customer service, back-office, telecommunications, websites, order entry and submission, and customer relationship management (CRM) among the top applications affecting revenue during unplanned interruptions.
In the case of email, 52% of IT managers placed it among their top five, and 48% of business managers did the same. Forty-nine percent of business managers named customer service among their top five, compared with 44% of IT managers. The only real divergence occurred with CRM applications: 29% of IT managers put them in their top five, while 40% of the business managers surveyed did.
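To make the size of each gap concrete, the short Python sketch below computes the difference in percentage points between the two groups for the three applications quoted above. The dictionary layout and the function name are illustrative choices, not part of the Gartner study.

```python
# Share of respondents (percent) who ranked each application in their top five,
# taken from the survey figures quoted above.
survey = {
    "email": {"it": 52, "business": 48},
    "customer service": {"it": 44, "business": 49},
    "CRM": {"it": 29, "business": 40},
}


def gaps(data):
    """Return {application: absolute gap between the groups, in percentage points}."""
    return {app: abs(share["it"] - share["business"]) for app, share in data.items()}


result = gaps(survey)

# List the applications from the widest disagreement to the narrowest.
for app, gap in sorted(result.items(), key=lambda item: item[1], reverse=True):
    print(f"{app}: {gap} percentage points")
```

CRM stands out with an 11-point gap, against 5 points for customer service and 4 for email.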
The alignment within the study was quite surprising, said John Morency, research director for the disaster recovery practice at Gartner, an IT consultancy based in Stamford, Conn. "Within those critical applications listed by the IT and business groups, it turns out that for 80 to 85% of those apps, the consensus is that they must be restored within 24 hours or less," Morency said.
Midmarket DR: Easier or harder?
Agreement between IT and business managers over which applications are truly critical is great, but there are still the underlying technical issues to consider. In general, disaster recovery planning can be particularly difficult at midmarket companies, said Gil Hecht, CEO at New York-based Continuity Software Inc., which makes DR software for enterprise-sized companies.
"It's nearly impossible for small companies to get disaster recovery right," Hecht said. They may not be equipped to discover the underlying data dependencies that can affect operations between two critical applications. For example, data coming in to a database from a business intelligence (BI) application may increase by a magnitude that necessitates configuration changes to hardware or storage, which might not be reflected right away in a DR plan. Even in a testing scenario, these configuration changes may lengthen restoration times for both the BI app and the database that processes the data.
People do sometimes have misguided ideas about which applications are truly critical, said Haner, echoing CSC's Summers. "Email is a good example. Many people go through daily problems with email, but it has no direct impact on a company's bottom line," Haner noted. "In general, critical apps are those that have the most effect on a company's bottom line."
At Monster, Haner and the company's business managers carefully weighed the costs of having applications out of commission and the direct revenue effects of an outage. "The first half a day, it costs hundreds of thousands of dollars to be completely without systems, but two days later, the effect is exponential and it's costing millions," Haner explained. In light of these costs, Monster's disaster recovery plan calls for system restoration ideally within 12 hours.
The most critical application at Monster is ERP. However, all of the critical applications -- ERP, the warehouse management system and email -- are set up with asynchronous data dependencies. This strategy nicely avoids any underlying data dependency problems. Haner explained the setup: The CRM system depends on customer records in the ERP system. If the CRM system stops working, the data is still pushed to the ERP system separately. When the CRM system comes up again, the ERP system updates it and the data is synchronized. "All of our systems are set up this way," he noted.
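The pattern Haner describes can be sketched in a few lines of Python. This is not Monster's implementation: the class names, the in-memory queue and the record format are all assumptions, chosen only to show how a system of record can keep accepting data while a dependent system is down and then bring it up to date.

```python
from collections import deque


class Crm:
    """Hypothetical downstream system that holds a copy of customer records."""

    def __init__(self):
        self.online = True
        self.records = {}

    def apply(self, customer_id, record):
        if not self.online:
            raise ConnectionError("CRM is offline")
        self.records[customer_id] = record


class Erp:
    """Hypothetical system of record that forwards changes asynchronously."""

    def __init__(self, crm):
        self.crm = crm
        self.records = {}
        self.pending = deque()  # updates not yet delivered to the CRM

    def save(self, customer_id, record):
        # The ERP write succeeds whether or not the CRM is reachable.
        self.records[customer_id] = record
        self.pending.append((customer_id, record))
        self.flush()

    def flush(self):
        # Deliver queued updates in order; stop at the first failure
        # and leave the rest queued for a later attempt.
        while self.pending:
            customer_id, record = self.pending[0]
            try:
                self.crm.apply(customer_id, record)
            except ConnectionError:
                return
            self.pending.popleft()


crm = Crm()
erp = Erp(crm)
erp.save("c1", {"name": "Acme"})
crm.online = False                  # simulate a CRM outage
erp.save("c2", {"name": "Globex"})  # stored in the ERP, queued for the CRM
crm.online = True                   # CRM comes back up
erp.flush()                         # ERP replays the queued update
```

Because the ERP never waits on the CRM, an outage in one does not block the other, which is the property that lets each system be restored on its own schedule.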
The order of restoration is ERP, the warehouse management system and then email and business intelligence. "BI is not part of my disaster recovery plan. It's really used for strategy and forward-thinking goals. It has no immediate impact on revenue," Haner said. He noted that there's too much data being fed through the app to back up anyway. "I've yet to find a good disaster recovery product for business intelligence," he said.
Which applications are considered critical in a disaster recovery context varies greatly from company to company, Summers stressed, adding that it can even be seasonal. "We have a whole set of clients for whom it would depend on the time of year [for a disaster to occur]. For one giant school testing client, in September, out of hundreds of applications, they might recover one application set in a certain order. In January, they might have a plan to recover a different set with a longer recovery time," he said. Who knew that even disaster planning could depend on a class schedule?
Let us know what you think about the story; email: Sarah Varney, Technology Editor