Jul 01 2012

On the eve of the US Independence Day holiday weekend, the last thing you need is a storm taking out your services. Unfortunately a storm across Virginia did exactly that to one of Amazon’s key data centres, taking with it popular social media sites like Instagram and Pinterest along with the Netflix movie service.

A key data centre going down and knocking out the services that rely on it surely exposes one of cloud computing’s greatest weaknesses – or does it?

Last week I spoke to Eran Feigenbaum, Director of Security for Google Apps, who made the point “not all cloud providers are created equal”. The Virginia outage illustrates this.

Netflix, Pinterest and Instagram all chose to rely solely on one data centre for key parts of their services, leaving them exposed should a storm, earthquake or tsunami affect that location.

Eran introduced me to a term I hadn’t heard before – “shared fate zones”. A good example would be putting all your servers in Virginia, where they can be knocked out by a storm; in California or Japan, where an earthquake can disable them; or solely around the Indian Ocean, where a tsunami like that of 2004 could knock them all out.

All the major cloud providers have the facility to spread loads across the globe for exactly this situation. The services affected by the Virginia storm chose not to do this, and they were eventually caught out.
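To make the idea concrete, here is a minimal sketch of what avoiding a shared fate zone could look like in code. It is an illustration only, not how Amazon, Google or any of the affected services actually route traffic; the `Replica` class, the zone labels and both function names are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """One copy of a service. All fields are hypothetical labels."""
    name: str
    fate_zone: str        # a group of sites the same disaster could take out
    healthy: bool = True


def shares_single_fate_zone(replicas):
    """True if every replica sits in the same fate zone (or there are none)."""
    return len({r.fate_zone for r in replicas}) <= 1


def pick_replica(replicas):
    """Return the first healthy replica, or None if all of them are down."""
    for replica in replicas:
        if replica.healthy:
            return replica
    return None


# A storm has taken out the Virginia copy, but the Oregon copy is untouched.
replicas = [
    Replica("virginia-1", fate_zone="us-east", healthy=False),
    Replica("oregon-1", fate_zone="us-west", healthy=True),
]

if shares_single_fate_zone(replicas):
    print("Warning: one disaster could take out every replica")

survivor = pick_replica(replicas)
print(survivor.name if survivor else "service is down")
```

If both replicas carried the same `fate_zone` label and the storm hit it, `pick_replica` would have nothing left to return, which is roughly the position the affected services found themselves in.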

Events like this aren’t just an issue with cloud computing, or even technology in general. Storms, earthquakes, fires and many other natural or man-made disasters are a fact of life and can disrupt any business. If an earthquake hits your town, the question is how quickly your business and customers can get back to normal.

Distributing services is actually the cloud’s strength. It means we’re not tied to one office or location, so we can get back to normal a lot quicker than those who have lost everything.

Computer networks being knocked out is nothing new. We’ve seen it plenty of times over the last fifty years as squirrels have chewed cables, technicians have pressed the wrong button or natural disasters have disabled data centres. By spreading the load, cloud computing services should make networks less prone to problems.
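A rough back-of-the-envelope calculation shows why spreading the load helps. The sketch below assumes each site is down independently of the others, which only holds when the sites sit in different shared fate zones, and the 1% figure is an invented number for illustration rather than a measured outage rate.

```python
import math


def chance_all_down(p_site_down, sites):
    """Probability that every site is down at once, assuming independent failures."""
    if not 0.0 <= p_site_down <= 1.0:
        raise ValueError("p_site_down must be between 0 and 1")
    if sites < 1:
        raise ValueError("sites must be at least 1")
    return p_site_down ** sites


# Hypothetical figure: each site is unavailable 1% of the time.
for sites in (1, 2, 3):
    print(sites, "site(s):", chance_all_down(0.01, sites))

# Sanity check: two independent sites are far less likely to fail together.
assert math.isclose(chance_all_down(0.01, 2), 0.0001)
```

Put two sites in the same storm's path and the independence assumption breaks: the chance of losing both is then close to the chance of losing one.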

A lot of people will say in the next few days that Amazon’s outage illustrates the unreliability of cloud computing. It actually shows the opposite.
