Aug 23 2012
 
Big data is generated by our online activity, shopping and social media use, and it is the business challenge of our time

“How much server space do companies like Google, Amazon, or YouTube, or for that matter Hotmail and Facebook need to run their sites?” is the question I’ve been asked to answer on ABC Radio National Drive this evening.

This isn’t a simple question to answer as the details of data storage are kept secret by most online services.

Figuring out how much data is saved in computer systems is a daunting task in itself. In 2011, scientists estimated that in 2007 there were 295 exabytes stored on the Internet, desktop hard drives, tape backups and other systems.

An exabyte is the equivalent of 50,000 years’ worth of DVD video. A typical new computer comes with a terabyte hard drive, so one exabyte is the equivalent of a million new computers.

The numbers involved in this topic are so great that petabytes are probably the best way of measuring data; a thousand of these make up an exabyte. A petabyte is the equivalent of filling up the hard drives of a thousand new computers.
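The unit comparisons above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes decimal (SI) units and the one terabyte drive mentioned in the text; the function name is only for illustration.

```python
# Decimal (SI) storage units, expressed in bytes.
TERABYTE = 10**12
PETABYTE = 10**15
EXABYTE = 10**18


def drives_needed(total_bytes, drive_bytes=TERABYTE):
    """Number of drives of the given size needed to hold total_bytes (rounded up)."""
    return -(-total_bytes // drive_bytes)


print(drives_needed(PETABYTE))      # 1000 one-terabyte computers per petabyte
print(drives_needed(EXABYTE))       # 1000000 one-terabyte computers per exabyte
print(EXABYTE // PETABYTE)          # 1000 petabytes per exabyte
print(295 * EXABYTE // PETABYTE)    # 295000 petabytes in the 2007 estimate
```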

Given that cloud computing and data centres have grown exponentially since 2007, it’s possible the 295 exabyte figure has doubled in the last five years.
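To put that guess in perspective, here is a rough sketch of the yearly growth rate that a doubling over five years would imply. The doubling itself is only the possibility raised above, not a measured figure, and steady compound growth is an assumption.

```python
def implied_annual_growth(factor, years):
    """Constant yearly growth rate that multiplies a quantity by `factor` over `years`."""
    return factor ** (1 / years) - 1


rate = implied_annual_growth(2, 5)
print(f"{rate:.1%}")    # 14.9% per year

# Doubling the 2007 estimate of 295 exabytes.
print(295 * 2)          # 590 exabytes
```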

In 2009 it was reported Google was planning to have ten million servers and an exabyte of information. It’s almost certain that point has been passed, particularly given the volume of data being uploaded to YouTube, which alone receives 72 hours’ worth of video every minute.

Facebook is struggling with similar growth and it’s reported that the social media service is having to rewrite its database. Last year it was reported Facebook users were uploading six billion photos a month, and at the time of the float on the US stock market the company claimed to have over 100 petabytes of photos and video.

According to one of Microsoft’s blogs, Hotmail has over a billion mailboxes and “hundreds of petabytes of data”.

For Amazon, details are harder to find. In June 2012 Amazon’s founder Jeff Bezos announced their S3 cloud storage service was now hosting a trillion ‘objects’. If we assume the ‘objects’ – which could be anything from a picture to a database running on Amazon’s service – have an average size of a megabyte, then that’s an exabyte of storage.
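The estimate above is simply object count multiplied by average object size. The sketch below assumes decimal units and keeps the one megabyte average, which is a guess rather than a published figure. It also shows how sensitive the result is to the object count: at a megabyte each, a billion objects come to a petabyte, and it takes a trillion to reach an exabyte.

```python
MEGABYTE = 10**6
PETABYTE = 10**15
EXABYTE = 10**18


def estimated_storage_bytes(object_count, average_object_bytes=MEGABYTE):
    """Back-of-envelope storage estimate: number of objects times assumed average size."""
    return object_count * average_object_bytes


billion = 10**9
trillion = 10**12

print(estimated_storage_bytes(billion) / PETABYTE)    # 1.0 petabyte
print(estimated_storage_bytes(trillion) / EXABYTE)    # 1.0 exabyte
```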

The amount of storage is only one part of the equation. We have to be able to do something with the data we’ve collected, so we also have to look at processing power. This comes down to the number of computer chips or CPUs – Central Processing Units – being used to crunch the information.

Probably the most impressive data cruncher of all is the Google search engine, which processes phenomenal amounts of data every time somebody does a search on the web. Google have put together an infographic that illustrates how they manage to answer over a billion queries a day in an average time of less than a quarter of a second.
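A billion queries a day is easier to picture as a per-second rate. This is a rough sketch that assumes the queries are spread evenly across the day, which real traffic is not, and treats a quarter of a second as an upper bound on the average response time.

```python
SECONDS_PER_DAY = 24 * 60 * 60    # 86,400 seconds


def average_per_second(per_day):
    """Average rate per second for a daily total, assuming an even spread."""
    return per_day / SECONDS_PER_DAY


queries_per_second = average_per_second(10**9)
print(round(queries_per_second))    # 11574

# Rate times response time gives the number of queries being answered at once.
# With 0.25 seconds as the upper bound, this is a ceiling on the average.
in_flight = queries_per_second * 0.25
print(round(in_flight))             # 2894
```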

Google is reported to own 2% of the world’s servers and is very secretive about the numbers. Estimates based on power usage in 2011 put the number of servers the company uses at around 900,000. Given Google invests about 2.5 billion US dollars a year in new data centres, it’s safe to say they have probably passed the one million mark.

How much electricity all of this equipment uses is a valid question. According to Jonathan Koomey of Stanford University, US data centres use around 2% of the nation’s power supply and globally these facilities use around 1.5%.

The numbers involved in answering the question of how much data is stored by web services are mind-boggling and they are growing exponentially. One of the problems with researching a topic like this is how quickly the source data becomes outdated.

It’s easy to overlook the complexity and size of the technologies that run social media, cloud computing or web searches. Asking questions on how these services work is essential to understanding the things we now take for granted.

  8 Responses to “How much server space do Internet companies need to run their sites?”

  1. As I got older, there were times when I moved house, upsized and downsized for my changing needs. Through this process, I learnt to let go of the physical load of memories and ‘junk’ that weighed me down. So instead of coining the next “wecan’tfitanythingmoreinhere”-byte term, when will the concept of culling be applied to the digital world? When do we stop packing everything into the shed that we all share now (i.e. the internet)? Let’s stop, breathe and clean out the junk that we will never look at again and relieve us and the people around us of this burden. When will we stop building bigger places in our physical, mental, emotional and digital states to store more stuff rather than cleaning out our cupboards, thoughts, lives and the internet to make our lives cleaner and more simple?

    • Kim, I don’t think we’ll see that culling soon. If anything it’s going to get worse as digital storage increases. The real challenge, as much for individuals as businesses, is in managing these masses of data.

  2. On the other hand, finding an obscure video on YouTube that reminds you of something you experienced in your childhood 20 years ago is an amazing feeling. The Internet is like an unlimited Library of Alexandria. For almost every piece of information there is somebody somewhere who will be extremely happy that it was preserved. After all, if we start judging what information is “worthy” of preserving, who gets to decide what’s “worthy” information and what is not? Is your opinion that something is “junk” more valuable than that of somebody who will cry when they see it twenty years from now?

  3. […] engines of our informational and social world (Google, Facebook, etc.) could currently or soon have at least exabytes (1000^6) of information on us. This means that we have to know much more about them: secrecy in corporations as powerful as these […]

  4. […] engines of our informational and social world (Google, Facebook, etc.) could currently or soon have at least exabytes (1000^6) of information on us. This means that we have to know much more about them: secrecy in corporations as powerful as these […]

  5. […] to pour their data into some bucket or other. The cloud, of course, is perfect for this. All that ridiculous amount of disk storage available in the monster data centres of the […]

  6. The amount of computing power is mind-blowing. My website would be a speck of dust right next to Google’s datacenters.
