What would you do if the computer screen went dark?

Warrnambool’s phone problems remind businesses of the importance of continuity planning

What would you do if the computer went dark? originally appeared in Smart Company on November 29, 2012.

One of the truisms of business is that the more ways customers can pay, the more likely you are to make the sale.

This is particularly true when something goes wrong – the customer hasn’t any cash, the till is jammed or the EFTPOS system is down.

Exactly this happened to thousands of businesses across south-west Victoria last week when a fire burned down the Warrnambool telephone exchange.

Unfortunately for the people and businesses of the surrounding region, much of its telephone and internet traffic, along with Telstra’s mobile network, runs through the burned-out exchange, sending the district back to the pre-telephone days.

This presented real problems as customers couldn’t use EFTPOS or get cash out of ATMs, while businesses struggled to get payrolls done or place orders with suppliers who couldn’t comprehend that it wasn’t possible to place orders over the net or by fax.

A hundred kilometres north of Warrnambool in the Grampians town of Dunkeld, a cafe worker told the ABC, “suppliers say ‘send a fax’ and you’re like ‘we can’t’ and they’re like ‘oh, we don’t want to handwrite it’.”

Those suppliers are a good example of not having the systems or staff in place to deal with ‘out of the box’ situations.

Unexpected events like the phone network being down for a week, major floods, devastating bushfires or zombie invasions will test businesses and it’s why having a real Business Continuity Plan (BCP) is important for business.

A workable BCP is one that identifies all the critical failure points for the business such as not having the internet for a week, a flooded office or, as happened to one of my clients, their entire building collapsing into the construction site next door.

The various state business agencies have guides on what to consider in a Business Continuity Plan including a good one from the South Australian government.

Regardless of how comprehensive a plan your business has, the most important part is going to be your people. If your organisation is staffed or managed by people who like to say “computer says no,” then they are going to be particularly useless when the computer is stone dead.

As the Warrnambool outage shows, unexpected business disruptions can come from anywhere, so flexible thinking and initiative are what matter in a crisis. It’s something worth thinking about with your staff and systems.


Leaping seconds, new millennia

The leap second brings back memories of the Y2K frenzy

Along with a storm disrupting cloud computing services, last weekend also saw computer networks being disrupted by the leap second.

Servers needed to be rebooted, websites froze and – as usual whenever there’s a technical glitch – airline check-in systems fell over, causing chaos for thousands of travellers.

It’s all very reminiscent of what we thought would happen with the Y2K bug. While sensible people didn’t think planes would fall from the sky, dams collapse and the world financial system grind to a halt (we had to wait another eight years for that), we did think there would be a lot of dumb little things to irritate us over the first few days of the year 2000.

That no real disruption happened, that not even the airline check-in systems failed or tried to check people in for 1901, was a credit to the entire IT industry. It’s a shame that the success in dealing with the complex unknowns of what was called the Y2K “bug” – which wasn’t really a bug but a feature – ended up being portrayed as a scam by the entire IT sector.
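For readers wondering why the Y2K “bug” was really a feature, here is a minimal sketch in Python. It assumes a hypothetical system that stores years as two digits to save space and always adds 1900 when reading them back; the function names and the pivot value of 50 are illustrative, not taken from any real airline or banking system.

```python
def naive_year(two_digit_year):
    """Expand a stored two-digit year the way many older systems did.

    Assumption: the century is hard-coded as 1900, which was a sensible
    space-saving choice when the software was written.
    """
    return 1900 + two_digit_year


def windowed_year(two_digit_year, pivot=50):
    """One common style of fix: treat values below the pivot as 20xx.

    The pivot of 50 is an illustrative assumption; each system had to
    pick a value that suited its own data.
    """
    century = 1900 if two_digit_year >= pivot else 2000
    return century + two_digit_year


# A booking stored as year "01" reads back as 1901 under the old rule
# and as 2001 once a date window is applied.
print(naive_year(1), windowed_year(1))
```

Finding and changing every place that made the first assumption is roughly the work the industry got through before the clocks ticked over.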

A couple of years ago I was talking to a finance guy who claimed “the whole global financial crisis was a scam, just like Y2K.”

That view overlooks how the IT industry knew it had a problem and dealt with it, as opposed to the banksters and their friends in government who denied there was a problem right up to the moment it happened.

Of course it’s easy to ignore that your business or industry has a problem if you know your friends in government will make sure your bonuses, holiday homes and private school fees are guaranteed by the taxpayer, the taxpayers’ children and the taxpayers’ grandchildren.

Last weekend’s leap second and the cloud computing outage teach us that technology isn’t infallible and that things do go wrong.

For most of us when they do go wrong, we won’t have the government to bail us out.

This isn’t anything new. In any complex society, the unexpected can disrupt our comfortable way of living in ways we don’t expect. It’s something all of us should occasionally think about.


Is the Virginia storm outage bad news for cloud computing?

A disaster or an opportunity for the cloud computing industry?

On the eve of the US Independence Day holiday weekend, the last thing you need is a storm taking out your services. Unfortunately a storm across Virginia did exactly that to one of Amazon’s key data centres, taking with it popular social media sites like Instagram and Pinterest along with the Netflix movie service.

Having a key data centre go down and knock out the services that rely on it surely exposes one of cloud computing’s greatest weaknesses – or does it?

Last week I spoke to Eran Feigenbaum, Director of Security for Google Apps, who made the point “not all cloud providers are created equal”. The Virginia outage illustrates this.

Netflix, Pinterest and Instagram all made choices to solely rely on one data centre for key parts of their services leaving them exposed should a storm, earthquake or tsunami affect that location.

Eran introduced me to a term I hadn’t heard before – “shared fate zones” – a good example of which would be putting all your servers in Virginia where they can be knocked out by a storm, in California or Japan where an earthquake can disable them, or solely around the Indian Ocean where a tsunami like that of 2004 could knock them all out.

All the major cloud providers have the facility to spread loads across the globe for exactly this situation. The services affected by the Virginia storm chose not to do this and they eventually were caught out.
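To make the “shared fate zone” idea concrete, here is a rough sketch in Python of the check a business could run over its own hosting arrangements. Everything in it is hypothetical: the service names, the site names and the zone labels are made up for illustration, and no cloud provider’s API is involved.

```python
def exposed_services(deployments, zone_of):
    """Return the services whose every copy sits in one shared fate zone.

    deployments: dict mapping a service name to the list of sites it runs in.
    zone_of: dict mapping each site to the zone whose fate it shares.
    Both structures are illustrative assumptions, not a provider's format.
    """
    exposed = {}
    for service, sites in deployments.items():
        zones = {zone_of[site] for site in sites}
        if len(zones) == 1:
            # A single storm, earthquake or fire could take out every copy.
            exposed[service] = next(iter(zones))
    return exposed


# Hypothetical example: two sites in Virginia share the same storm risk.
zone_of = {
    "virginia-a": "virginia",
    "virginia-b": "virginia",
    "oregon-a": "oregon",
}
deployments = {
    "photo-app": ["virginia-a", "virginia-b"],
    "video-app": ["virginia-a", "oregon-a"],
}
print(exposed_services(deployments, zone_of))
```

Note that running in two sites is not enough on its own. The hypothetical photo-app above has two copies, but both would go dark in the same storm.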

Events like this aren’t just an issue with cloud computing, or even technology in general. Storms, earthquakes, fires and many other natural or man-made disasters are a fact of life and can disrupt any business. If an earthquake hits your town, the question is how quickly your business and customers can get back to normal.

Distributing services is actually the cloud’s strength. It means we’re not tied to one office or location, so we can get back to normal a lot quicker than those who have lost everything.

Computer networks being knocked out is nothing new. We’ve seen this plenty of times over the last fifty years as squirrels have chewed cables, technicians have pressed the wrong button or natural disasters have disabled data centres. By spreading the load, cloud computing services should make networks less prone to problems.

A lot of people will say in the next few days that Amazon’s outage illustrates the unreliability of cloud computing. It actually shows the opposite.
