How to Solve the Big Challenge of Big Data: Part 2

by on June 30, 2011

Recently, I talked about how Big Data is a challenge for companies to manage and store simply because it’s, well, big. We’re talking millions or billions of files, not thousands. Terabytes or petabytes of data, not gigabytes. So, how do you manage such a large volume of data in a way that lets you extract the full value from it?

The good news is that not all of your Big Data has to be backed up. Information residing on systems that are already backed up can be excluded from a Big Data backup. The bad news is unique data: data that is fed into the environment through sensors and other devices. Because this data can’t be regenerated, copies of it are often made within the Big Data environment so that it can be analyzed without putting the original at risk. This means dealing with lots of redundant information.

Combining disk and tape into a single active archive can keep all of these ones and zeroes secure while controlling capacity costs. Disk is well suited for the small-file transfers and deduplication involved in Big Data backups. But with the advent of LTO-5 (and soon, LTO-6), tape will play an increasingly important role in managing Big Data. Primary disk can be used to take in and store data requiring ready access or analysis, optimized disk can be used for storing near-term data that is not currently being analyzed, and tape can be used for storing long-term, infrequently accessed data. Tape also protects Big Data and streamlines its management: by copying data to disk and tape simultaneously as the data pours in, you can avoid the arduous task of conducting nightly or weekly backups. Then, effective tape management systems step in to organize backups and facilitate easy access to archived data.
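To make the three-tier idea concrete, here is a minimal sketch of a placement rule in Python. It is an illustration only, not a description of any particular product: the `DataSet` fields, the tier names, and the 90-day near-term window are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical threshold: how long data counts as "near-term" after last access.
NEAR_TERM_WINDOW = timedelta(days=90)


@dataclass
class DataSet:
    """Hypothetical record describing one chunk of Big Data."""
    name: str
    last_accessed: datetime
    under_analysis: bool


def choose_tier(item: DataSet, now: datetime) -> str:
    """Pick a storage tier following the disk/disk/tape split described above."""
    if item.under_analysis:
        # Data needing ready access or analysis stays on primary disk.
        return "primary_disk"
    if now - item.last_accessed <= NEAR_TERM_WINDOW:
        # Near-term data that is not currently being analyzed.
        return "optimized_disk"
    # Long-term, infrequently accessed data.
    return "tape"


if __name__ == "__main__":
    now = datetime(2011, 6, 30)
    samples = [
        DataSet("sensor-feed", now, True),
        DataSet("last-month-logs", now - timedelta(days=30), False),
        DataSet("2009-archive", now - timedelta(days=700), False),
    ]
    for item in samples:
        print(item.name, "->", choose_tier(item, now))
```

In practice the rule would probably weigh more than recency, such as file size or retention requirements, but the shape of the decision stays the same: active data on primary disk, idle but recent data on optimized disk, everything else on tape.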

In addition to the chore of backing up and securely storing Big Data, businesses are struggling to find a platform that is not only fast enough to let them mine it efficiently, but also reliable and cost-effective. In the world of storage, speed, reliability, and low cost tend to pull against one another. Given that we are at once overwhelmed by and addicted to data, it’s a conflict that will need to be resolved with innovation.
