IBM & Google Prepare for Digital Overload

Researchers and workers in technical fields like biotechnology, astronomy and computer science will soon be overwhelmed with information. Who’s to blame for this upcoming data overload? Technical advances, from better telescopes and genome sequencers to faster computers and bigger hard drives, are all to blame.

While consumers are just starting to understand the importance of home external hard drives capable of storing a terabyte of data, computer scientists will soon have to deal with data sets thousands of times as large, and those data sets will keep growing. For example, Facebook uses more than one petabyte of storage space to manage its users’ forty billion photos. Google processes about twenty times that amount of information every single day just running data analysis jobs. (One terabyte equals one thousand gigabytes and could store about one thousand copies of the Encyclopedia Britannica. One petabyte is one thousand times as large as a terabyte and could store about 500 billion pages of text.)
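To make those units concrete, here is a rough back-of-the-envelope sketch in Python. It assumes decimal units (one terabyte is exactly 1,000 gigabytes) and treats "more than one petabyte" as exactly one petabyte, so the results are lower bounds rather than precise figures. The variable names are illustrative only.

```python
# Decimal (SI) storage units, matching the definitions in the text.
GB = 10**9          # bytes in one gigabyte
TB = 1000 * GB      # one terabyte = one thousand gigabytes
PB = 1000 * TB      # one petabyte = one thousand terabytes

# Assumption: "more than one petabyte" is rounded down to exactly 1 PB.
facebook_photo_bytes = 1 * PB
facebook_photo_count = 40 * 10**9   # forty billion photos

# "About twenty times that amount" processed by Google each day.
google_daily_bytes = 20 * facebook_photo_bytes

# How many one-terabyte consumer drives would one day of Google's jobs fill?
google_daily_in_tb = google_daily_bytes // TB

# Implied average storage per photo (a lower bound under the assumption above).
avg_photo_bytes = facebook_photo_bytes / facebook_photo_count

print(f"Google, per day: {google_daily_in_tb:,} TB")
print(f"Average per photo: at least {avg_photo_bytes / 1000:.0f} KB")
```

Under these assumptions, a single day of Google's analysis jobs would fill about twenty thousand one-terabyte drives, and the Facebook figure works out to at least 25 kilobytes per photo.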

Jimmy Lin, associate professor at the University of Maryland, says, “It sounds like science fiction, but soon enough, you’ll hand a machine a strand of hair and a DNA sequence will come out on the other side.”  Technology companies like IBM and Google believe that we are going to run into the problem of having too much information and not being able to do anything with it.

Two years ago, IBM and Google gave university students access to some of the largest computers on the planet.  The computers had software that Internet companies use to perform data analysis jobs.  Prior to IBM and Google providing these technical services, students had to use modest computing systems to support their studies.  The machines that students were using weren’t capable of processing enough data to really challenge and train their minds to process the mega-scale problems of tomorrow.

This year, the National Science Foundation, a federal government agency, awarded $5 million among 14 universities that wanted to teach students how to handle big data questions. For example, the University of Washington used the technology to study the evolution of galaxies. The students interpreted the data gathered by large telescopes that inch their way across the sky taking pictures of various objects. The students had been using the Sloan Digital Sky Survey, which is the largest public database, with about eighty terabytes of data. Thanks to IBM and Google, they’re now using a new system called the Large Synoptic Survey Telescope, which takes detailed images of larger pieces of the sky and produces about thirty terabytes of data each night.
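The jump in scale between the two surveys is easier to see with a little arithmetic. The sketch below uses only the two figures quoted above (about eighty terabytes in total, about thirty terabytes per night). The yearly estimate additionally assumes the telescope collects data every night of the year, which is a simplification made here for illustration and not a figure from the article.

```python
import math

sloan_total_tb = 80        # Sloan Digital Sky Survey: about eighty terabytes in total
lsst_per_night_tb = 30     # Large Synoptic Survey Telescope: about thirty terabytes a night

# Nights needed for the new telescope to match the entire older database.
nights_to_match = math.ceil(sloan_total_tb / lsst_per_night_tb)

# Hypothetical upper bound: assumes observations on all 365 nights of a year.
nights_per_year = 365
lsst_per_year_tb = lsst_per_night_tb * nights_per_year
lsst_per_year_pb = lsst_per_year_tb / 1000   # 1 petabyte = 1,000 terabytes

print(f"Nights to match Sloan: {nights_to_match}")
print(f"One year of nightly data: {lsst_per_year_tb:,} TB ({lsst_per_year_pb} PB)")
```

In other words, the new telescope would produce as much data in three nights as the older survey holds in total, and on the order of ten petabytes in a year under the every-night assumption.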

By donating their computer software to the universities, IBM and Google hope to train a new breed of engineers and scientists to handle large sets of data. IBM is even looking for big data experts who can complement its consulting areas like healthcare and financial services.
