Deduplicate, compress and defray costs of data storage management

Enterprises are looking for cost-effective data storage management techniques. Deduplication is the most popular solution, but not always the best. Learn more about some options.

As we enter the age of metadata -- in which reams of information are collected about application, network and business performance -- corporate data growth is measured in double digits. Add to that the increasing amount of unstructured data -- audio, image and video files -- and it's no wonder that CIOs are looking for cost-effective data storage management.

It isn't just about backup anymore: storage requirements are driving demand for software tools and techniques that enable businesses to parse out what's really worth saving. Alongside such tried-and-true data storage management methods as archiving and compression, data deduplication has become an industry darling, and using snapshot technology for continuous data protection (CDP) is coming on strong.

"Backup is all about keeping a copy for a period of time in case you need it," said Lauren Whitehouse, an analyst at The Enterprise Strategy Group Inc. in Milford, Mass. "How you make that copy is up for grabs right now."

Deduplication for data storage management deletes copies

Of course, darlings always have their detractors. "The industry is drunk on dedupe," said Greg Schulz, founder and senior adviser to The Server and StorageIO Group, an IT advisory consultancy in Stillwater, Minn. "It's not a bad thing -- I'm a big dedupe supporter. But when you take a step back, dedupe becomes one of the tools in the toolbox," he said. Used with other tools, it can achieve even greater savings in storage and backup costs.

"CIOs want to look at the bigger picture, but what they're being told by network admins is, 'we have to dedupe,'" said Schulz, the author of Resilient Storage Networks and The Green and Virtual Data Center.

For one thing, Schulz said, "not all data dedupes. It needs to be a recurring, repeating pattern of data. Deduplication is optimized for text; video and audio files generally don't dedupe well." MP3 and MP4 files, for example, are already compressed, which leaves few repeating patterns for deduplication to find.
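To make Schulz's point concrete, here is a minimal sketch of fixed-size chunk deduplication in Python. It is an illustration only, not any vendor's method: the function name `dedupe_ratio`, the 4 KB chunk size and the two sample data sets are all assumptions, and random bytes merely stand in for already-compressed media.

```python
import hashlib
import os


def dedupe_ratio(data: bytes, chunk_size: int = 4096) -> float:
    """Return logical size divided by the size of the unique chunks.

    Hypothetical helper: splits data into fixed-size chunks and stores
    each distinct chunk once, keyed by its SHA-256 digest.
    """
    unique = {}
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        unique.setdefault(hashlib.sha256(chunk).digest(), len(chunk))
    stored = sum(unique.values())
    return len(data) / stored if stored else 1.0


# One 4 KB block repeated 100 times: a recurring, repeating pattern.
repetitive = (bytes(range(256)) * 16) * 100

# Random bytes of the same size: no chunk repeats, as with compressed media.
non_repeating = os.urandom(len(repetitive))

print(dedupe_ratio(repetitive))     # 100.0
print(dedupe_ratio(non_repeating))  # 1.0
```

The repeated block is stored once, while the random data gains nothing, which is the behavior Schulz describes for audio and video files.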

Compression: A turtle to dedupe's hare?

Deduplication for data storage management is hot because of its 10-to-1 data reduction ratio -- so, if you have 100 terabytes of data, you'll back up 10 terabytes, whereas "compression is a boring 2-to-1 ratio," Schulz said. However, even compression's smaller improvement -- cutting that same 100 TB to 50 TB -- has a big benefit when it is realized on a recurring basis.
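The ratios quoted here reduce to simple division. As a rough sketch, assuming each ratio applies uniformly to the whole data set (real results vary by data type), the hypothetical helper `stored_tb` below works through the 100 TB example.

```python
def stored_tb(logical_tb: float, reduction_ratio: float) -> float:
    """Terabytes actually stored after applying an N-to-1 reduction ratio."""
    if reduction_ratio <= 0:
        raise ValueError("reduction ratio must be positive")
    return logical_tb / reduction_ratio


logical = 100.0  # TB, the figure used in the article

dedupe = stored_tb(logical, 10)       # 10-to-1 deduplication
compression = stored_tb(logical, 2)   # 2-to-1 compression

print(dedupe)                 # 10.0
print(compression)            # 50.0
print(logical - compression)  # 50.0 TB saved by compression alone
```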

Traditional Zip files (offline, compressed files) require complete data decompression before they're modified. Online compression, however, allows for reading or writing to a compressed file without full decompression or delay, Schulz said. Online compression is well suited to databases, online transaction processing, email, home directories, websites and video streaming, he said.
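One common way to get that kind of access is to compress data in independent blocks, so a read touches only the blocks it needs. The Python sketch below illustrates the idea with the standard `zlib` module; it is not a description of any product Schulz mentioned, and the names `compress_blocks` and `read_range` and the 64 KB block size are assumptions.

```python
import zlib

BLOCK_SIZE = 64 * 1024  # hypothetical block size; real products choose their own


def compress_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Compress each fixed-size block independently of the others."""
    return [
        zlib.compress(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]


def read_range(blocks: list, offset: int, length: int,
               block_size: int = BLOCK_SIZE) -> bytes:
    """Return `length` bytes at `offset`, inflating only the blocks touched."""
    if length <= 0 or not blocks:
        return b""
    first = offset // block_size
    last = min((offset + length - 1) // block_size, len(blocks) - 1)
    touched = b"".join(zlib.decompress(blocks[i]) for i in range(first, last + 1))
    start = offset - first * block_size
    return touched[start:start + length]


data = b"order-record;" * 50_000
blocks = compress_blocks(data)

# This read spans the boundary between the first and second blocks,
# so only those two blocks are decompressed.
assert read_range(blocks, 65_530, 20) == data[65_530:65_550]
```

The trade-off in this sketch is that smaller blocks make reads cheaper but usually compress less well, since each block is compressed without reference to its neighbors.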

Archiving for data storage management: A strategy that works

Along with such technologies as dedupe and online compression, plain old archiving is still an excellent method for reducing a large data footprint.

"We see archiving as a major benefit to reducing the load," said Michael Osterman, founder of messaging system consultancy Osterman Research Inc. in Black Diamond, Wash. "Archiving is great for single-use storage."

IT departments should "start archiving data to free up space. Put it on the shelf," Schulz advised.

A proper archiving strategy begins with an understanding of what data exists. Next, rules and policies are needed to determine what can be archived, for how long, in how many copies, and how it may eventually be retired or deleted. Archiving requires a combination of hardware, software and people to implement business rules, Schulz said.
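Rules like these can be written down as data before any tooling is chosen. The Python sketch below is one hypothetical way to express them: the `ArchivePolicy` fields, the `action_for` function and the day counts are illustrative assumptions, and measuring retention from the date of last use is a simplification.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ArchivePolicy:
    archive_after_days: int  # idle time before data moves off primary storage
    retain_days: int         # idle time after which data may be retired
    copies: int              # number of archive copies to keep


def action_for(last_used: date, today: date, policy: ArchivePolicy) -> str:
    """Decide what the policy says to do with one item of data."""
    idle_days = (today - last_used).days
    if idle_days >= policy.retain_days:
        return "retire"
    if idle_days >= policy.archive_after_days:
        return f"archive ({policy.copies} copies)"
    return "keep online"


# Example values only; actual periods depend on business and legal rules.
policy = ArchivePolicy(archive_after_days=90, retain_days=2555, copies=2)
today = date(2024, 1, 1)

print(action_for(date(2023, 12, 1), today, policy))  # keep online
print(action_for(date(2023, 6, 1), today, policy))   # archive (2 copies)
print(action_for(date(2010, 1, 1), today, policy))   # retire
```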

Continuous data protection in a 24/7 world

Backup software agents traditionally have communicated with servers to note changes in a file system. It's a tedious process that involves policy checking, ESG's Whitehouse said, and it's constrained by the window of time in which to execute the backup. "We live in a 24/7 world," she said. Companies need to make sure backup doesn't interrupt production, so they're doing it in small slivers of time during the night. "They're trying to control time, even as double-digit data growth forces them to back up more."

Technologies suited for virtualized environments are eliminating the need for laborious backups, however. CDP, for example, involves point-in-time capture: a rolling series of snapshots, much like a videotape of the data. "It alleviates the backup window because with CDP, you're always backing up," Whitehouse said. In this instance, disaster recovery becomes business continuity.
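The "always backing up" idea can be pictured as a journal of changes that can be replayed to any moment inside a rolling window. The Python sketch below is a toy model under stated assumptions, not how any CDP product is implemented: the class name `RollingJournal`, its methods and the in-memory key-value data are all hypothetical, and writes are assumed to arrive in timestamp order.

```python
from collections import deque


class RollingJournal:
    """Toy change journal with a bounded, rolling recovery window."""

    def __init__(self, max_entries: int):
        self._baseline = {}   # state as of the oldest point still recoverable
        self._log = deque()   # (timestamp, key, value) in arrival order
        self._max_entries = max_entries

    def write(self, timestamp: int, key: str, value: str) -> None:
        """Record a change; fold the oldest change into the baseline if full."""
        self._log.append((timestamp, key, value))
        if len(self._log) > self._max_entries:
            _, old_key, old_value = self._log.popleft()
            self._baseline[old_key] = old_value

    def state_at(self, timestamp: int) -> dict:
        """Rebuild the data as it stood at `timestamp` within the window."""
        state = dict(self._baseline)
        for ts, key, value in self._log:
            if ts > timestamp:
                break
            state[key] = value
        return state


journal = RollingJournal(max_entries=1000)
journal.write(1, "invoice-42", "draft")
journal.write(2, "invoice-42", "approved")
journal.write(3, "invoice-43", "draft")

print(journal.state_at(1))  # {'invoice-42': 'draft'}
print(journal.state_at(3))  # {'invoice-42': 'approved', 'invoice-43': 'draft'}
```

Because every write is captured as it happens, there is no separate backup job to schedule, which is the point Whitehouse makes about the backup window.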

Let us know what you think about the story; email Laura Smith, Features Writer.
