Roadmap to a Disaster Recovery and Business Continuity Plan

Midmarket CIOs are as likely as not to have overlooked drawing up a business continuity plan. Before it's too late, get started with these six steps.

Almost half of midmarket firms have poor disaster recovery plans, according to one estimate. Follow these steps to be sure you're not one of them.

The House of LaRose, a beer distributor in Brecksville, Ohio, should have been a festive place on New Year's Eve a few years ago. The midmarket, family-owned company had just moved into a new $30 million, 300,000-square-foot facility, where a state-of-the-art warehouse management system ran on an Oracle 10g database, helping a fleet of trucks deliver tens of thousands of cases of beer a day across the state, 12 million cases a year.

Then the motherboard on the company's Novell NetWare system burned out, triggering a cascade of server failures that brought business to a halt. LaRose had one of the more sophisticated IT environments among beverage distributors in the U.S. -- but it didn't have a disaster recovery (DR) or business continuity (BC) plan.

IT administrator Dan Brinegar called Novell, which said it would take three days to get a new motherboard. So he called around Ohio until he found one he could borrow, allowing the company to get back online in a matter of hours. "It was really a wake-up call," Brinegar says. "The whole company comes to a screeching halt if the ERP system goes down."

Disaster recovery used to be reserved for large enterprises, but in the increasingly 24/7 business world, more and more midmarket firms are finding they can't afford not to keep things running. And high-availability requirements are growing all the time. Forrester Research Inc. in Cambridge, Mass., estimates that enterprises have doubled the number of mission-critical database applications in the past five years. Yet many firms remain poorly prepared. A Gartner Inc. survey found that almost half of midmarket and large enterprises had relatively weak DR plans.

"Your business is only as good as the integrity of your data," says Ricky Bajaj, a business division manager who is in charge of disaster recovery at Telco Solutions, a midsized electronics manufacturer in Franklin, Tenn. Telco recently overhauled its BC/DR strategy, deciding to outsource the function, a common solution in the middle market.

More and more companies don't have a choice. Publicly traded companies face Sarbanes-Oxley Act mandates for data retention, while private companies in industries as different as the wine business and finance must meet government regulations for record-keeping and service continuity.

"Even when regulations don't explicitly mandate BC/DR planning, regulatory bodies require that companies comply with their audits and requests for information no matter what," Forrester analysts Rüdiger Krojnewski and Bill Nagel noted in a recent report called "Planning Your Next Disaster." "Failure to do so due to a disaster or other business disruption is not an excuse -- and regulators can add insult to injury by levying fines for noncompliance exposed via the inability to recover from a disaster."

While DR planning may be more challenging for resource-strapped midmarket businesses than large enterprises, there still are basic ways to ensure a timely recovery and maximal continuity. Here's a guide to crafting a DR/BC plan.

Step 1: Assess

The first step is to conduct a detailed review of the vulnerabilities that IT and the overall business face by performing a business impact analysis (BIA). This should cover what threats are likely (power outage, natural disaster, terrorism) and the possible consequences in terms of lost revenue, productivity and reputation. Identifying the threats likely to prevail in a particular geographic region will help determine data center location, data center site separation, and the most cost-effective technologies for DR. Also establish recovery time objectives (RTOs), or the time to full resumption, and recovery point objectives (RPOs), which specify the amount of data loss acceptable in terms of minutes or hours.

Gartner analyst Donna Scott says a BIA needs to be a joint project between business and IT. "You have to understand what the most critical things in your business are that you need to protect," she says. "You really have to understand your business. IT can't do it by itself."

A BIA, she notes, is different from a security assessment. A BIA is focused on assessing the criticality of business processes and the applications used within them, as well as the impact when the applications and infrastructure are not available for varying periods of time.

Sometimes it takes a real disaster to wake up a company. When Hurricane Rita was headed for Houston a couple of years ago, the LifeGift Organ Donation Center implemented its DR plan: The IT staff loaded equipment on a truck and drove it to Dallas. Management realized it needed a better plan and contracted with CompuCom, a Dallas-based IT services firm, to take over DR planning.

CompuCom solutions architect Charley Ballmer created a BIA for LifeGift, assessing hurricanes and flooding as the most likely disasters, followed by a terrorist attack on the local petroleum industry. The BIA ranked LifeGift's most critical business processes as organ tracking and patient communication, establishing a 15-minute RTO for those apps, while accounting was given a 24-hour failover period.

"When they went into DR mode [during Rita] their plans were for naught," Ballmer says, noting that a bigger disaster could have had fatal consequences for the organization. "They hadn't made the investment in hardware and strategy."

Now CompuCom runs LifeGift's IT environment from its own operations center, fully testing the DR failover systems every six months. LifeGift, Ballmer adds, is prepared to weather a major storm with only minimal disruptions.
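One way to record what a BIA produces is a short list of business processes with their recovery objectives. The sketch below is illustrative only: the class and field names are hypothetical, the RTO values echo the LifeGift example above, and the RPO is left unset because no figure is given for it.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class RecoveryObjective:
    """One row of a business impact analysis (hypothetical structure)."""
    process: str
    rto: timedelta                   # maximum time until full resumption
    rpo: Optional[timedelta] = None  # maximum acceptable data loss, if defined


def recovery_order(objectives):
    """Return the rows sorted so the tightest RTO is restored first."""
    return sorted(objectives, key=lambda o: o.rto)


# RTO values taken from the LifeGift example; RPOs are not stated, so omitted.
bia = [
    RecoveryObjective("accounting", rto=timedelta(hours=24)),
    RecoveryObjective("organ tracking", rto=timedelta(minutes=15)),
    RecoveryObjective("patient communication", rto=timedelta(minutes=15)),
]

ordered = [o.process for o in recovery_order(bia)]
print(ordered)
```

Sorting by RTO gives a first-pass restoration order that business and IT can then argue over together.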

Step 2: Define

The second step is to establish realistic and specific business recovery objectives. RTO and RPO requirements need to be defined in terms of risk/reward. That is, how much protection does the company really need, and how much is it willing to pay for?

CIOs should adopt a structured, formal approach, drawing on published methodologies -- including IT Infrastructure Library, COBIT and ISO standards 17799 and 27001 -- that define risk, threats and controls.

But actually defining how robust a DR plan a company needs is a business decision. Forrester's Krojnewski says executives sometimes ask for unrealistic or overly expensive continuity. And IT needs to provide a reality check.

"The key problem is people from IT and business don't talk the same language," he says. "The role of the challenger has to be an internal IT person. But the scope is solely a business decision."

CIOs should document business processes arranged by tier of criticality, usually on a scale of one to five (see "How to Classify Assets for Recovery"). Customer- and partner-facing apps tend to fall into tier one importance, while back-office operations are deemed less critical. But some apps may need to be moved up in importance because of the way they interact with mission-critical systems.
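The point about moving apps up in importance can be sketched as a small dependency check. This is a minimal illustration, not a prescribed method: the function name, the app names and the starting tiers are all hypothetical, and tier 1 is assumed to be the most critical.

```python
def effective_tiers(base_tiers, depends_on):
    """Promote each app to the most critical tier of anything that relies on it.

    base_tiers: app name -> tier (1 = most critical, 5 = least critical).
    depends_on: app name -> list of apps it needs in order to run.
    """
    tiers = dict(base_tiers)
    changed = True
    while changed:  # repeat until a full pass makes no promotion (handles chains)
        changed = False
        for app, deps in depends_on.items():
            for dep in deps:
                if tiers[dep] > tiers[app]:
                    tiers[dep] = tiers[app]  # dependency inherits the stricter tier
                    changed = True
    return tiers


# Hypothetical inventory: a customer-facing portal relies on two back-office systems.
base = {"order portal": 1, "inventory db": 3, "auth service": 4, "payroll": 5}
deps = {"order portal": ["inventory db"], "inventory db": ["auth service"]}

result = effective_tiers(base, deps)
print(result)
```

Here the inventory database and the authentication service both end up in tier one because a tier-one app cannot run without them, while payroll stays where it was.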

When Chris Formes became IT manager at Brookfield Homes, the $888 million public company didn't have a DR plan, so he hired a contractor to perform a threat assessment and design a recovery strategy. At the highest level of failover continuity, the plan would have required a $200,000 hardware investment and $90,000 the first year in service costs.

"I felt it was important to have a plan in place," he says. "But when I looked at the numbers and sat down with the company president, it didn't make a lot of sense for the kind of work we do. In the event of a disaster like an earthquake, we're not doing any work anyway."

Instead, the Del Mar, Calif.-based company opted to go with an eight-hour RTO and tape backup, hiring Arizona-based Insight Enterprises to deploy a tape system to take a snapshot of the storage area network every hour. The company also rolled out Mimosa Systems' NearPoint email solution to archive its Microsoft Exchange server data. The cost of business continuity dropped to $1,200 a month.

"DR is simply an insurance policy," says Formes. "It's risk and reward. In our case it wasn't worth the risk [for total business continuity]. We wouldn't lose any revenue. We'd be hampered, but the process of building homes wouldn't stop."
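The arithmetic behind that decision is easy to reproduce. The sketch below uses only the figures quoted above and assumes, for the sake of comparison, that the $1,200 monthly figure covers the whole first year of the tape-based approach; the variable names are illustrative.

```python
# Figures quoted for Brookfield Homes; a first-year comparison only.
full_failover_first_year = 200_000 + 90_000  # hardware plus first-year service costs
tape_backup_first_year = 1_200 * 12          # assumed: $1,200 a month for 12 months

savings = full_failover_first_year - tape_backup_first_year
ratio = full_failover_first_year / tape_backup_first_year

print(f"Full failover, year one: ${full_failover_first_year:,}")
print(f"Tape backup, year one:   ${tape_backup_first_year:,}")
print(f"Difference:              ${savings:,} (about {ratio:.0f}x)")
```

On these numbers the full-failover option costs roughly 20 times as much in the first year, which is the gap the company weighed against an eight-hour RTO.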

Step 3: Ensure buy-in

Getting money for something you hope you never have to use can be difficult. DR can be expensive and doesn't generate revenue. ROI can be hard to gauge.

IT should advise and execute, but overall responsibility for DR should be vested in line-of-business owners. CIOs should make a case for DR investment so that the business owners can go after the funding. Finding metrics to measure DR can be hard, but IT should at least measure the effectiveness of any solution during a test.

Krojnewski suggests building standard BC/DR service classes and measuring the disaster recovery spending per service as a percentage of operational costs. Evaluate the technologies in each class every other year.
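As a rough illustration of that metric, the sketch below computes DR spending as a percentage of operational cost for a few service classes. The class names and dollar amounts are invented for the example and do not come from Forrester.

```python
def dr_spend_percent(dr_cost, operational_cost):
    """DR spending for a service as a percentage of its operational cost."""
    if operational_cost <= 0:
        raise ValueError("operational cost must be positive")
    return 100.0 * dr_cost / operational_cost


# Hypothetical service classes: (annual DR cost, annual operational cost).
service_classes = {
    "gold": (30_000, 400_000),
    "silver": (10_000, 250_000),
    "bronze": (2_000, 100_000),
}

report = {
    name: dr_spend_percent(dr, ops)
    for name, (dr, ops) in service_classes.items()
}

for name, pct in report.items():
    print(f"{name}: {pct:.1f}% of operational cost goes to DR")
```

Tracking the same percentage from one review to the next gives business owners a simple number to question when a service class looks over- or under-protected.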

"Buy-in has become easier at those firms where the business executives know that they're dependent on IT," says Krojnewski. "But some businesses just don't have a good relationship with IT, and it's difficult to get sponsorship for a DR budget. Where IT is looked at as a cost center, it can be hard to get business executives interested in DR unless they're pushed by regulatory agencies."

At the House of LaRose, Brinegar went to management after the New Year's Eve outage and campaigned for a DR plan. "We were down for what could have been a couple of days," says Brinegar. "It was only six hours, but [management] gave me a check so I could do what I needed to do. My position was, 'Look at what could have happened and how many millions we could have lost.' It was the first time I was told 'Buy something' before I even left the room. Usually it's months of debating and no decision. Management understood. We were very vulnerable."

Step 4: Outsource?

Once the plan is in place, one of the most crucial decisions is whether IT has the expertise and resources to implement the project or if outside help is needed. Forrester reports that most of the enterprises surveyed found that implementing a DR plan required more work than expected. Gartner's Scott says a quarter to a third of large enterprises outsource DR, while three-quarters of midmarket firms do.

"It may not make sense to invest in a full-blown plan," says Scott. "When does the risk become high enough? For the midmarket, outsourcing is very appealing. You can reduce overall costs and still protect the business."

If the decision is to seek help, consider using systems integrators or complete outsourcing services. Some firms deal with preferred partners, while others use an RFI/RFP process.

At Telco Solutions, Bajaj says the company's decision to switch from doing its own tape backup to outsourcing DR was obvious. The company hired Toronto-based managed service provider Asigra.

"It was a no-brainer," Bajaj says. "We're a manufacturer. Our competency is not IT. We want to do what we're good at. I don't want to hire guys, paying $60,000-$70,000, to manage this stuff when I can outsource it for 50% of the cost. I don't want to depend on an employee; I'd rather rely on a company."

Step 5: Update

Business changes constantly, and so must any DR/BC plan. Periodic reviews are necessary to make sure changes to the IT infrastructure don't make plans moot. Are new business lines covered by the plan? Have discontinued services been removed from the DR scenario?

Business leaders and IT need to consult regularly to keep things current. They should review existing technology annually and new technology whenever it is introduced. Reassess threats on a periodic basis, both what they are and what their potential impact is.

Disaster recovery should be considered part of change management, Krojnewski says.

At LifeGift, updating the DR plan as technology changes is easy; the company was pleased enough with CompuCom that it outsourced all of its IT to the firm.

"Disaster recovery turned out to be only 7% or 8% of the planned investment," says Ballmer, who has effectively become the company's CIO. "We did it for a lot less money than they [business executives] expected, and we cut the recovery time objectives."

Step 6: Test

Testing carries some risk, requiring scheduled downtime and potential business consequences. But not testing is even riskier. Effective DR requires full testing once a year and after any changes have been made that affect the plan. Make sure the failover scenario really works before you need it to.

It's not just the technology but also the process that needs a dry run so that everybody knows what to do in the event of an actual disaster. "Companies don't test enough," says Scott. "If you don't test, you don't have a plan; you have a false sense of security."
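The testing rule in this step (a full test once a year, plus one after any change that affects the plan) is simple enough to encode as a reminder check. The function below is a hypothetical sketch; its name, parameters and the 365-day default are assumptions made for illustration.

```python
from datetime import date, timedelta
from typing import Optional


def dr_test_due(
    last_test: date,
    today: date,
    last_plan_change: Optional[date] = None,
    max_age: timedelta = timedelta(days=365),
) -> bool:
    """Return True if a full DR test is due.

    A test is due when the plan changed after the last test, or when the
    last test is at least max_age old (one year by default).
    """
    if last_plan_change is not None and last_plan_change > last_test:
        return True
    return today - last_test >= max_age


today = date(2024, 6, 1)
recent = dr_test_due(date(2024, 1, 10), today)
after_change = dr_test_due(date(2024, 1, 10), today, last_plan_change=date(2024, 3, 1))
stale = dr_test_due(date(2023, 1, 10), today)
print(recent, after_change, stale)
```

A check like this only flags that a test is overdue; it does not replace the dry run of people and process that the step calls for.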

Michael Ybarra is a contributing writer.
