For all the time that storage managers spend worrying about safeguarding their organization's data, many of them know little about the objects of their obsession. They usually know how much total data there is -- too much -- and how much available storage capacity they have -- never enough. But they often don't know which data is critical, which is less important and which belongs in the trash.
Helping storage managers recognize those differences and do something about them is the idea behind several new products now reaching the market. The latest arrival, Kazeon Systems' Information Server IS1200, started shipping last month.
It joins about a half-dozen other data-classification tools now on the market.
Such tools would not have attracted much interest a few years ago when enterprise storage infrastructures were still mostly one-horse towns. All data, regardless of value, went in the one big, expensive storage bucket.
But now storage managers can more easily build tiered storage systems of varying cost and performance, thanks to products like Serial Advanced Technology Attachment-based disk arrays, which are less expensive than traditional enterprise arrays. And with new storage networking gear, managers can move data around more efficiently and across greater distances than ever before.
Those new classification tools can help managers sort their data according to its value to the organization, among other criteria. Then managers can place the data in an appropriate storage tier.
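To make the sort-then-place idea concrete, here is a minimal sketch in Python. It does not reflect how any vendor's product works: the value labels, tier names and age thresholds are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    path: str
    business_value: str      # assumed labels: "critical", "normal", "disposable"
    days_since_access: int   # age signal; the thresholds below are illustrative


def choose_tier(record: FileRecord) -> str:
    """Map a classified file to a hypothetical storage tier."""
    # Data with no business value is flagged for deletion rather than stored.
    if record.business_value == "disposable":
        return "delete-candidate"
    # Critical, recently used data stays on the expensive primary array.
    if record.business_value == "critical" and record.days_since_access <= 90:
        return "primary"
    # Anything touched within the past year goes to cheaper SATA disk.
    if record.days_since_access <= 365:
        return "sata"
    # Everything else is moved to the archive tier.
    return "archive"


if __name__ == "__main__":
    samples = [
        FileRecord("exec-memo.eml", "critical", 3),
        FileRecord("intern-song.mp3", "disposable", 10),
        FileRecord("old-report.doc", "normal", 800),
    ]
    for rec in samples:
        print(rec.path, "->", choose_tier(rec))
```

In practice the value label would come from the classification tool itself, and the placement policy would be set by the organization rather than hard-coded.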
"We see data classification as being the foundation for a range of applications that address governance, information discovery and information life cycle management," said Troy Toman, vice president of marketing at Kazeon. "This is what we think is driving people to want more visibility into their content."
Consider the problem
Most storage managers know little about most of their electronic content. Industry estimates peg unstructured information at about 80 percent of a typical organization's total data. Unlike the structured data that goes into a single database management system, unstructured information is stored in a variety of forms, such as e-mail messages, word processing documents, spreadsheets, electronic images, presentations and so on.
Because practically any employee can create this unstructured content using a wide range of applications, storage managers typically deal with the glut of new data by essentially not dealing with it at all. They simply dump all the files into the organization's primary storage systems and give them all the same five-star treatment. An important e-mail message from one executive to another and an MP3 music file downloaded by a summer intern have the same safeguards -- regular backup and long-term retention.