Trends in high availability and fault tolerance

In this podcast, hear how high-availability and fault-tolerance solutions are helping midmarket companies achieve greater availability and performance on mission-critical systems.

Companies that are concerned with performance, availability and reliability are investing in high-availability clusters and complex fault-tolerance servers, according to a recent survey from Information Technology Intelligence Corp. (ITIC). The survey also revealed that one out of 10 companies said they need greater than 99.999% availability. Meeting that requirement takes investment in high-availability and fault-tolerance solutions.

In this podcast, Laura DiDio, author of the survey and founder of ITIC, discusses the highlights from the fall survey and how the focus on availability and reliability is also helping companies better determine ROI and total cost of ownership. In addition, DiDio shares solutions many midmarket users are adopting to address these issues, including high-availability clusters and fault-tolerance servers.

BIOGRAPHY: DiDio is a high-tech industry analyst and consultant, professional writer and former reporter. She is the founder and principal at Information Technology Intelligence Corp. Previously, DiDio spent more than six years at Yankee Group Research Inc., a Boston consultancy, where she held the title of research fellow. She has expertise in a wide range of topics, including virtualization, desktops, server operating systems, OS security, hardware and business intelligence.

Hello. My name is Karen Guglielmo, the executive editor for and I'd like to welcome you to today's expert podcast to discuss availability concerns for midmarket CIOs. We'll be discussing the results of a fall 2009 survey by Information Technology Intelligence Corp. on high availability and fault tolerance. Joining us today to review the survey findings is Laura DiDio.

Laura DiDio is a high-tech industry analyst and consultant, a professional writer and a former reporter. She is a principal at Information Technology Intelligence Corp., a company she founded. Prior to this, Laura spent more than six years at Yankee Group, a Boston consultancy, where she held the title of research fellow. She has expertise in a wide range of topics including virtualization, desktop, server operating systems, OS security, hardware and business intelligence.

Welcome, Laura.

DiDio: Hi Karen. It's a pleasure to be here.

As I mentioned earlier, we're here today to talk about availability concerns for midmarket CIOs. We'll spend the next 10-plus minutes discussing the results of the survey and trends for 2010.

So, Laura, can you first give us an overview of what your survey was about and highlights from the findings?
DiDio: Sure. No. 1, we wanted to track general industry trends on TCO and ROI, and in order to really determine total cost of ownership and return on investment we have to ask specific questions relating to availability, the reliability of hardware and software systems, and how much uptime customers are getting. Now this is particularly important in light of the ongoing economic downturn, which is now two years old. So we queried customers on the impact of the downturn on their IT departments and daily operations. What we get out of this is that it's more important than ever to have reliable systems when you've had budget cuts, staff cuts, cuts in training and just about everything else.

What are some trends you're seeing in midmarket technology adoption as it relates to availability and fault tolerance?
DiDio: Well, not surprisingly, the economic downturn has had an adverse impact on 75% -- that's three out of four corporations -- and their IT departments. What we found is that nearly 50% of all of the survey respondents have experienced budget cuts and 42% across the board -- and this goes from the smallest SMBs all the way up to the largest enterprises -- have hiring freezes. Now, specific to the midmarket, these folks are really caught in the middle and when we see companies anywhere from 300 employees up to thousands, the impact of the economic downturn becomes worse. We have seen in this case 64%, nearly two-thirds have experienced hiring freezes and 84% of midmarket and large corporations say that their IT staff has simply responded by picking up the slack and working longer hours.

Now, we also asked the survey respondents to tell us about high availability and fault tolerance. An interesting statistic from this comes, again, in light of the ongoing economic downturn, which has seen IT departments decimated, budgets slashed, and budgets for training and resources cut significantly. At the same time, almost 60% -- that's six out of 10 businesses -- require 99.99% or 99.999% -- that's four nines or five nines -- of availability and uptime for their mission-critical line-of-business applications.

And 9% of the respondents, so that's almost one out of 10 companies, say that they need greater than five nines of uptime. So what that means is, no downtime. In other words, you have got to really have bulletproof, bombproof applications and hardware systems. So you know, what do you use? Well, for one thing, you have high-availability clusters, or you have the more expensive and more complex fault-tolerance servers.

Now, interestingly, what we found was that corporations, particularly midsized and large enterprises, say they are willing to pay extra dollars, a premium of $20,000 or more, for both high-availability clusters and fault-tolerance servers to guarantee maximum uptime. You can get entry-level pricing. You can get a fault-tolerance server, for example, for under $15,000. OK. But a lot of companies say, "You know what? We'll spend $20,000, $25,000, $35,000 or more just to make sure that we have the premium."

Now we saw a 76% majority tell us that the TCO and ROI value of fault-tolerance servers is excellent or very good. By contrast, only 6% of the respondents said that high-availability clusters, which are less expensive and very pervasive as well, provide excellent TCO/ROI value, while 44% said it was very good. We did find that, again, almost 60% of businesses said that virtualization, which we all know is a very hot technology and will continue to be very strong, increases the need for fault tolerance. And there's a very good reason for that. We all know that virtualization, for example, helps people consolidate physical space into a single server. You know, you can reduce the number of servers that you have and, when optimally configured, reduce the amount of time it takes to manage and deploy applications.

However, the other part of that equation when it comes to virtualization is that you now have all of these applications, server-based applications, multiple applications, in one physical server. So if it's not properly configured, or you don't have robust enough hardware, guess what? You know, you can take out four or five applications. So, that's scary. So that means when you're virtualizing your environment, you have to pay close attention to availability and uptime.

What we also found was that high-availability clusters and fault-tolerant servers are used in equal numbers, so it's a 50/50 split and many companies have both. There is some confusion, with firms using the terms high availability and fault tolerance interchangeably. But at the end of the day what we found is, again, companies need higher reliability and uptime. Ninety-nine percent uptime used to be the gold standard 10 or 15 years ago in reliability. But if you ask people, "What is 99% uptime?" most people will be astounded to realize that 99% uptime can equate to 87 and a half hours of downtime per server, per year. And that's just not acceptable. When you go up to three nines of uptime, that goes down to just under nine hours of downtime per server, per year.

But what's really optimal? Companies say they need the four nines or the five nines of uptime. So those are all interesting trends. And I expect that trend to continue regardless of whether or not we weather the economic downturn and return to a robust economy. Because people just can't afford downtime at all.
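The relationship between an availability percentage and yearly downtime is simple arithmetic, and a short sketch can make it concrete. The Python below is illustrative only: the function name `annual_downtime_hours` and the constant `HOURS_PER_YEAR` are hypothetical, and it assumes a 365-day year of round-the-clock operation with no allowance for planned maintenance windows.

```python
# Assumption: a non-leap year of 24x7 operation (365 * 24 = 8,760 hours).
HOURS_PER_YEAR = 365 * 24


def annual_downtime_hours(availability_percent: float) -> float:
    """Return the hours of downtime per year implied by an availability percentage."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


if __name__ == "__main__":
    levels = [
        ("two nines", 99.0),
        ("three nines", 99.9),
        ("four nines", 99.99),
        ("five nines", 99.999),
    ]
    for label, pct in levels:
        hours = annual_downtime_hours(pct)
        # Report both hours and minutes, since the higher tiers fall below one hour.
        print(f"{label} ({pct}%): {hours:.2f} hours ({hours * 60:.1f} minutes) per year")
```

Under these assumptions, two nines works out to roughly 87.6 hours of downtime per server per year, three nines to roughly 8.8 hours, four nines to about 53 minutes and five nines to a little over five minutes.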

Also, we're going to see more virtualization, more cloud computing coming on and that's also going to increase the need for this.

One of the other questions we also asked in support of TCO, ROI and availability was polling customers on how exactly the economic downturn has impacted their IT department. And in this regard we gave people about 10 different options and told them to select all that applied.

By far the most tangible impact has been budget cuts. Only 25% of businesses said that there has been no impact -- that it's been business as usual. And I would even question that number because if we really started talking to people and drilling down, I think we could find that there was some impact.

We did see that 42% across all segments -- SMBs, midmarket and enterprises -- have had hiring freezes. Thirty percent, nearly one third, have delayed new equipment purchases. We had another one third, 33%, delay or cancel operating system or application upgrades, which in turn, by the way, means that they also delayed hardware refreshes, particularly on the desktop hardware side.

IT training has been reduced or cut in 32% of organizations. We had one quarter of the respondents say their budgets and salaries have remained the same; however, there's no money for anything extra. And an 11% minority did find that they were forced to cut or cancel their service and support contracts -- and that's dangerous. And overall, one third of businesses have said they've had layoffs.

Now when we dug down into that a little more deeply, we did segment the respondents on that question by size. Not surprisingly, the midmarket firms, which I categorized as anything with probably 250 to 300 users and up, saw deeper cuts and more problems. For example, as I said earlier, 64% of midmarket and large enterprise firms have experienced hiring freezes, compared to 42% across the overall landscape, which includes the SMBs.

And among midmarket and enterprise firms, only 11% said that there was no impact on their business from the economic downturn. So that's half as many, if you will, as the overarching 25% that said there was no impact whatsoever.

Now we asked folks, "If your firm has cut IT staff and training, how has that impacted daily operations?" And again we gave the respondents 10 different options. Again, we saw the biggest response rate among all companies, 47%, say, "We pick up the slack and work longer hours." Again, that number nearly doubled to 84% among midmarket and enterprise firms. So that tells you that not only is a company running lean, mean and in some cases skeletal, but you can imagine the stress levels in these midmarket and high-end enterprises where there is just more work.

Ten percent said there were more configuration errors. We saw 27%, nearly three out of 10 businesses, respond that it takes them more time to complete upgrades of any type -- whether it's hardware, software or infrastructure. And 20% are behind on applying patches and service packs, and 20% said that their testing and quality assurance has suffered.

We saw 5% say that it was difficult for them to meet their service-level agreements, both internally to their corporate end users and externally to their business partners and customers. Only 4% said that server reliability has suffered and only 8% said that security is outdated.

Now, by the way, one of the areas that has remained sacrosanct in this, and in any of the surveys that ITIC has done, has been security. We've seen only a 3% minority of businesses across SMBs, midmarket and enterprises say that their security budgets have been cut or have suffered. I think everyone knows they can't afford not to have very, very strong security policies, procedures and products in place.

One of the other questions that we asked with respect to business and the economic downturn was how companies view IT investments that improve reliability and uptime. And interestingly, 45% said that it was among their key selection criteria, and another 17% said it was a very worthwhile investment. Only 15% said it was not considered at all, while another 24% said it was a low priority.

What all of this means, going forward, is that companies have to invest. I think what we've seen in the last three years is that companies have delayed or deferred upgrades for as long as they can. They've cut back and done cost-cutting to minimize the impact on their budgets, but at a certain point you have to start investing again.

And we're seeing that happen in key areas like virtualization and cloud computing, particularly application and desktop virtualization, and private cloud computing. Those adoption rates will almost double from what we saw in 2009 to the 2010-2011 time frame.

On that note, that does conclude today's podcast. Thanks to Laura DiDio for speaking with us today, and thank you for listening. Have a great day.
