IT organizations are responsible for storing critical data for the company. That job gets even harder when you have to keep track of data in a large or mixed environment that includes remote offices and portable devices, like laptops and PDAs. A backup plan is not enough. Staffing, resource allocation and training are just some of the important issues CIOs need to consider in advance. This expert podcast will offer best practices CIOs should know to manage critical data and create a strategic backup and recovery plan in a mixed and remote environment.
SPEAKER: W. Curtis Preston, vice president, data protection services, GlassHouse Technologies Inc.
BIOGRAPHY: Preston is the author of Using SANs and NAS and a new book, Practical Backup and Recovery. He has designed systems for environments ranging from backup systems for small businesses to enterprise storage systems for Fortune 100 companies. Widely recognized as the thought leader in the field of backup, Preston is the former president and CEO of The Storage Group. He joined GlassHouse in July 2004.
Read the full transcript from this podcast below:
Karen Guglielmo: Hello. My name is Karen Guglielmo, the editor of SearchCIO.com, and I'd like to welcome you to today's expert podcast on "Creating a Strategic Backup and Recovery Plan for Remote Employees."
I'm joined today by W. Curtis Preston, vice president of data protection services, with GlassHouse Technologies Inc. He's the author of Using SANs and NAS and the recently released book Backup & Recovery.
Curtis has been designing storage systems for more than a decade and has designed systems for environments ranging from backup systems for small businesses to enterprise storage systems for Fortune 100 companies. Widely recognized as the thought leader in the field of backup and the former president and CEO of The Storage Group, he joined GlassHouse in July 2004.
Today, Curtis is joining us to talk about creating a strategic backup and recovery plan for remote employees.
As I did mention earlier, we're here today to talk about backup and recovery for remote employees. Curtis will spend the next 10 to 15 minutes offering advice on how to create a strategic backup and recovery plan for your remote employees. Curtis.
W. Curtis Preston: Thanks. So, the first thing I just want to do is define what I mean when I talk about "remote employees." There are two types. There are the obvious employees, where they're actually using a laptop or a PC, and that PC is in no way connected to any other corporate infrastructure. They're basically using the Internet and perhaps a virtual private network to be able to access the corporate network, but they're not in any sort of data center. So, that's the first type of remote employee.
Then, the second type of remote employee is, actually, typically, servers or computers that are in some sort of remote office, but that office is considered remote by the rest of the IT infrastructure. Quite often these offices are as small as one server or perhaps not even a server at all, just, again, a computer or laptop that just happens to be part of the corporate network, but it's really remote from the rest of the data center.
The problem with all of these types of users is that they're not within that typical data center, where we're typically backing up the data. Most traditional backup systems, even if we're talking about an incremental-forever system, back up too much data to be able to back up that remote user across the wire or across the LAN or across the Internet.
So, the first thing - and there are some solutions that are out today, some new software that's come out in the last several years that can handle these users and can protect them in a reasonable way, and that's one of the reasons we're having this conversation. So, let's first talk about what we need to do before we start going down that road, and the first thing that I'd say is that we have to define what the requirements are, the recovery requirements are, for these users, because if we begin down this road without those requirements, we're going to be presented with a number of choices, number of design choices and software choices, that we're not going to know which way to go if we don't have those requirements.
So, we need to decide, in the event that this user's data, the user's laptop or the server in a remote site, has lost its disk drive, what recovery time objective, or RTO, are we expecting to have to be able to restore this data? Is it acceptable that we just restore it, period?
Because, typically, this data isn't being backed up at all, and just having it being able to be restored, period, is considered a joy by many of these people, and if it takes you two to three days, they don't care, they just want to get the data back. So, that's acceptable for many applications.
Or, perhaps, is this remote office the type of remote office that, if that computer or server is down for a day or more, we're actually losing revenue, whether this is the salesperson's laptop or servers that are actually driving some part of your infrastructure, that if those servers are down, then your infrastructure is actually going to not be doing anything, and, as a result, your business is halted. So, you'll have to take a look at your business requirements, what the servers do and how they accomplish those requirements, and decide, "Well, for the server, we need an RTO of eight hours. For that server, three days is just fine," or however long you decide.
So, once you make that decision, then we take a look at the various ways to accomplish that. So, without looking at the mobile users - let's just talk about the remote sites - there are ways to do this, traditional ways, where we can put backup servers, tape drives and a whole bunch of infrastructure out there to back up these remote sites, and then somebody has to manage that tape.
The servers can be managed remotely, and then you can actually contract with an off-site media vaulting company, who will actually come out and pick up those tapes for you. This would be the traditional answer to that solution, and there are a lot of negative things that come with that traditional answer, and, of course, that traditional answer doesn't answer what to do with these mobile users, because their data can't be connected or backed up in that way.
So, there is a new area of technology that is being used to handle these remote users. There's this concept called "deduplication," and, without going into too much detail, let's just say it's a technology that can identify that data has already been backed up, and, if it's already been backed up, then we don't need to back it up again. It can back up data at a continuous incremental level and at a level that actually looks inside the files and notices that a portion of the file has been backed up, but another portion of the file has been changed in the last 24 hours, since the last backup, and it backs up just that portion.
What this all translates to is, we can actually back up very large amounts of data with these disk drives that are available, and if you look at . . . It's not uncommon to get a 100 GB disk drive in a laptop these days, and our ability to fill up those disk drives never seems to go away, either.
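Editor's illustration: the deduplication idea described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not how any particular backup product works: the fixed 4 KB block size, the in-memory `store` dictionary standing in for the central site, and the function names `backup` and `restore` are all hypothetical. Commercial products typically use more sophisticated chunking and storage.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical fixed block size, chosen only for illustration


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Cut the data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def backup(data: bytes, store: dict) -> tuple:
    """Back up `data` into `store`, sending only blocks the store lacks.

    Returns (recipe, bytes_sent): the recipe is the ordered list of block
    fingerprints needed to rebuild the data; bytes_sent counts only the
    blocks that actually had to cross the wire.
    """
    recipe = []
    bytes_sent = 0
    for block in split_blocks(data):
        fingerprint = hashlib.sha256(block).hexdigest()
        if fingerprint not in store:
            # New or changed block: this is the only data transferred.
            store[fingerprint] = block
            bytes_sent += len(block)
        recipe.append(fingerprint)
    return recipe, bytes_sent


def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original data from its recipe."""
    return b"".join(store[fingerprint] for fingerprint in recipe)


if __name__ == "__main__":
    central_store = {}
    monday = b"A" * 4096 + b"B" * 4096 + b"C" * 4096
    tuesday = b"A" * 4096 + b"Z" * 4096 + b"C" * 4096  # middle block changed

    _, sent_monday = backup(monday, central_store)
    recipe_tuesday, sent_tuesday = backup(tuesday, central_store)
    print("Monday sent:", sent_monday, "bytes")
    print("Tuesday sent:", sent_tuesday, "bytes")
    assert restore(recipe_tuesday, central_store) == tuesday
```

In the example, the second backup transfers only the one block that changed, which is the effect the speaker describes: unchanged portions of a file are recognized and skipped.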
Then, of course, the servers at the remote site, we can actually back up hundreds of gigabytes if not even in the terabyte land, incrementally, every day, on a block level, so that we can protect a pretty good amount of data, from a backup perspective, back to the central site, so that we now have our backups in a central site. The types of products, they're called deduplication - so, if you want to talk to somebody about the types of products, you say you're interested in remote backup products or deduplication backup products - and the idea is that they can back all this data up to a remote site, using this modern technology that reduces the amount of data that needs to be transferred across the wire, anywhere from 20 to 50 to 100 to 1. So, the amount of data that needs to be transferred across the wire is very, very small.
Then, what you then have to go back to is that original requirement that we talked about, with the recovery time objective, and say, "Based on this recovery time objective, what type of infrastructure do we need to place at the remote site, in order to meet this RTO?" and the type of infrastructure that you're going to put is going to range, anywhere from absolutely no infrastructure, simply placing a piece of software on the computer at the remote site, to installing an actual backup server at that remote site, in order to provide faster recoveries, if you have an RTO of something less than a day.
So, basically, you have to take a look at those requirements, take a look at this software, take a look at the various pieces that you can put together to do the various levels of RTO, and then make the decision, based on that.
The important thing to understand is that the software is here, the software is real. It's available from major backup software companies that you would be familiar with, and this stuff works. It's being used by corporations around the world, and it's just a matter, at this point, of starting to use this technology and then assigning the appropriate hardware or software to your environment, based on the RTOs that you have decided upon.
So, for example, on one end of the spectrum, if you've got a user that has an RTO of 24 to 36 hours or maybe 36 or even more hours, it would be perfectly reasonable, since you've got all the data back at your central site, to recover that user's disk drive and then FedEx that user's disk drive to them, or, if it's a laptop, FedEx an entire laptop, or whatever. That would be reasonable for many types of recoveries.
If that's not acceptable, you would have to do the restore across the wire. That would be the next level of service. And if you can do the restore across the wire, then, of course, since we now have to send all of the data back, because there's a disk drive that's been lost, or something like that, then the user can - we can only restore a certain amount of data. The wire is only so big. We have laws of physics involved, here. So, we can only restore so many gigabytes, based on the size of the network pipe that this user has available and that we have available to that user. So, that can then restore a certain amount of data remotely.
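Editor's illustration: the "laws of physics" limit mentioned above comes down to simple arithmetic. The sketch below estimates how long a full restore would take over a given link. The function name `restore_hours` and the `efficiency` factor (the fraction of nominal bandwidth actually usable, set to 0.7 here purely as an assumption) are not from the podcast, and real-world throughput will vary.

```python
def restore_hours(data_gb: float, link_mbps: float, efficiency: float = 0.7) -> float:
    """Estimate hours needed to restore `data_gb` gigabytes over the wire.

    data_gb    -- amount of data to restore, in decimal gigabytes
    link_mbps  -- nominal link speed in megabits per second
    efficiency -- assumed usable fraction of the nominal speed (0 < e <= 1)
    """
    if data_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("data_gb must be >= 0, link_mbps > 0, 0 < efficiency <= 1")
    total_bits = data_gb * 8 * 1000**3
    usable_bits_per_second = link_mbps * 1000**2 * efficiency
    return total_bits / usable_bits_per_second / 3600


if __name__ == "__main__":
    # A full 100 GB laptop drive over a 10 Mbps link.
    print(f"{restore_hours(100, 10):.1f} hours at 70% efficiency")
    print(f"{restore_hours(100, 10, efficiency=1.0):.1f} hours at the nominal rate")
```

Even at the full nominal rate, 100 GB over a 10 Mbps link takes roughly 22 hours, which is why a restore across the wire only suits some combinations of data size, pipe size and RTO.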
Then, if you get too much data to restore across the wire in a reasonable amount of time, or you have a pipe that is very small, or you have a very aggressive RTO, such as a few hours, then you can install a backup server at that remote site, in order to back up that user.
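Editor's illustration: the three levels of service just described (ship a restored drive, restore across the wire, or place backup infrastructure at the remote site) can be expressed as a simple decision rule driven by the RTO. This is only one possible ordering of the checks, written as a sketch. The 36-hour shipping threshold, the 0.7 link efficiency, and the names `Site` and `choose_recovery_tier` are assumptions made for the example and should be replaced with figures from your own requirements.

```python
from dataclasses import dataclass

SHIPPING_HOURS = 36.0   # assumed time to restore centrally and ship a drive
LINK_EFFICIENCY = 0.7   # assumed usable fraction of nominal bandwidth


@dataclass
class Site:
    name: str
    rto_hours: float   # recovery time objective agreed with the business
    data_gb: float     # data that would have to be restored
    link_mbps: float   # nominal link speed to the central site


def wire_restore_hours(data_gb: float, link_mbps: float) -> float:
    """Rough hours to send `data_gb` decimal gigabytes over the link."""
    usable_bits_per_second = link_mbps * 1000**2 * LINK_EFFICIENCY
    return data_gb * 8 * 1000**3 / usable_bits_per_second / 3600


def choose_recovery_tier(site: Site, shipping_hours: float = SHIPPING_HOURS) -> str:
    """Pick the lightest option that still meets the site's RTO."""
    if site.rto_hours >= shipping_hours:
        return "ship a restored drive or laptop"
    if wire_restore_hours(site.data_gb, site.link_mbps) <= site.rto_hours:
        return "restore across the wire"
    return "place a backup server at the remote site"


if __name__ == "__main__":
    for site in (
        Site("sales laptop", rto_hours=48, data_gb=100, link_mbps=10),
        Site("branch office", rto_hours=8, data_gb=20, link_mbps=50),
        Site("plant server", rto_hours=4, data_gb=500, link_mbps=10),
    ):
        print(f"{site.name}: {choose_recovery_tier(site)}")
```

The point of the sketch is the order of operations the speaker stresses: the RTO is an input decided first, and the amount of remote infrastructure falls out of it.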
So, in summary, again, what we have to do is we have to start with those initial requirements. The number one mistake I see people making is beginning this process without having first decided what their recovery requirements would be of those remote sites. So, first, start on those requirements. Then, investigate the various different packages that can be made available to do this.
They come in two flavors, that I forgot to mention before, two flavors. One is basically backup software that uses deduplication, and then there's also hardware, such as - there's this concept called "the virtual tape library." Sometimes, in larger infrastructures, this might be more appropriate.
So, you investigate these various options and then apply those original recovery requirements that we mentioned to this overall design, to then decide how much of an infrastructure investment you have to make along the way, in order to meet these requirements.
Karen Guglielmo: Okay. And, on that note, that concludes today's podcast. Thanks again to W. Curtis Preston for speaking with us today, and thank you all for listening. Have a great day.
This was first published in February 2008