Data classification becomes end users' job



Alex Barrett, Trends Editor, Storage

Forget about automated information lifecycle management (ILM). Before one Fortune 500 biosciences firm ventures into ILM or tiered storage, end users will first have to classify all of the files they've generated over the years, says Michael Masterson, information systems architect at the firm. Without a proper data classification, he said, "ILM is putting the cart before the horse.

"IT can't classify those files," Masterson added. "They're not the information owners." To that end, the firm is training users on Abrevity Inc.'s FileData Manager, which gives them an interface for tagging files, much as Apple Computer Inc.'s iTunes lets you create playlists. Once data is tagged appropriately, "ILM is downhill," he noted, and becomes a simple question of data movement.

The Abrevity tool is the first one Masterson has seen that can capture a file's context. "It's unique in its ability to capture [file] attributes in a loosely structured way," he said, including the information that can be gathered about a file from the folder it sits in. "Somebody has already automatically classified that file by putting it in a certain folder." The other data classification tools Masterson considered either look at coarse file metadata or rely on extensive keyword indexing.
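To make the folder-as-classification idea concrete, here is a minimal Python sketch of deriving loose attributes from where a file lives. It is an illustration only and does not reflect how FileData Manager works internally; the function name `tags_from_path` and the example path are hypothetical.

```python
from pathlib import PurePosixPath


def tags_from_path(path: str) -> dict:
    """Derive loose attributes for a file from its folder location and name.

    Hypothetical helper for illustration; not part of any vendor product.
    """
    p = PurePosixPath(path)
    return {
        # Each enclosing folder is treated as a tag the user already chose
        "folders": [part for part in p.parent.parts if part != "/"],
        # Normalized extension, without the leading dot
        "extension": p.suffix.lower().lstrip("."),
        # File name without its extension
        "name": p.stem,
    }


# Example path is made up for this sketch
example = tags_from_path("/shares/regulatory/fda/2005/submission_draft.DOC")
```

In this sketch the folder names carry the classification a user made simply by filing the document, which is the point Masterson is making.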

Abrevity's ability to contextualize files stems from its SliceBase data model, which it uses in place of a generic SQL database. "With SQL," Masterson explained, "a designer has to come in and predefine the columns. It doesn't have the flexibility to just go out there and find whatever is there."
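The contrast Masterson draws can be sketched in a few lines of Python: a store that accepts whatever attributes turn up, with no columns defined in advance. This is not SliceBase, whose internals the article does not describe; the class `LooseAttributeStore`, its methods, and the sample paths and attribute names are all assumptions made for illustration.

```python
from collections import defaultdict


class LooseAttributeStore:
    """Toy attribute index with no predefined columns (illustrative only)."""

    def __init__(self):
        # Maps (attribute name, value) -> set of file paths carrying that tag
        self._index = defaultdict(set)

    def tag(self, path, **attributes):
        # Any attribute name is accepted; nothing has to be declared up front
        for name, value in attributes.items():
            self._index[(name, value)].add(path)

    def find(self, **attributes):
        # Return the files that carry every requested attribute/value pair
        sets = [self._index.get((n, v), set()) for n, v in attributes.items()]
        return set.intersection(*sets) if sets else set()


store = LooseAttributeStore()
# Sample files and attributes are made up for this sketch
store.tag("/shares/fda/report.doc", regulator="FDA", project="device-a")
store.tag("/shares/sec/10k.xls", regulator="SEC", fiscal_year=2005)
fda_files = store.find(regulator="FDA")
```

Note that the second file introduces a `fiscal_year` attribute the first never had, and no schema change is needed. In a relational table, that column would have to be added by a designer beforehand.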

Masterson's company has good reason to embark on a data classification effort. As a public company developing medical equipment, the firm is regulated by both the Securities and Exchange Commission and the Food and Drug Administration. "That's before we get to the classic ILM cost-of-infrastructure concerns," he said.

Luckily, the data volumes Masterson's division needs to classify are relatively small, so it doesn't need to implement a nearline tier of storage right away. "We have under 10 terabytes of free files, so we have the luxury of perhaps using more disk than we need to," he noted.

Data classification has become a hot topic of late, and a number of startups are jockeying for attention. Other vendors offering some form of data classification include Index Engines Inc., Kazeon Systems Inc., Njini, Scentric and StoredIQ. Long term, analysts expect server and storage vendors to license data classification software directly.

This article originally appeared on
