A single point of failure

Where are the weak spots in your organisation?

Anyone with doubts about the importance of technology to the modern business only has to ask one of Virgin Blue’s staff or customers about the last three days of disruption.

“An external supplier’s hardware failure” is the given reason for the problems and it shows how we all need to be conscious of the key “choke points” in our business processes where a disruption will quickly bring operations to a crawl or stop.

For any organisation, risk arises when those choke points rely on one thing (it could be a person, a computer or a physical widget) for the system to keep running. Should that one item fail, the organisation stops. In Virgin’s case that thing appears to have been a router or server controlling their booking systems.

A single point of failure is the Achilles heel of any organisation. Any one item that can disrupt operations has to be identified and contingencies developed so that when a failure happens, and it will, the organisation can quickly move to a workaround.

In Virgin’s case it appears they were prepared for a disruption of up to three hours but when the booking system outage dragged on for 21 hours their fallback procedures were simply overwhelmed.

We often think of these things as technology related, but often it’s something more mundane, like a burst water main blocking access to your shop, or the only person with the keys to the fuse box driving along the Gunbarrel Highway for the next six weeks.

In fact those human points of failure, where only one person in the organisation knows the combination to the safe, the bank account PIN or the password to the company’s servers, are probably the riskiest points of failure of all.

Another common point of failure is relying on supplier contracts and service level agreements. Warranties and indemnities are nice to have, assuming they are enforceable when you need them, but they won’t fix the damage to a company’s reputation when a crisis on Virgin’s scale hits.

Even if you have a guaranteed response time, as it appears Virgin had, you need to have something in place to keep the business running in the meantime. Also, “response time” is how long it takes your supplier to start doing something about the problem, not how long it takes to fix it.

Regardless of how well we plan and how watertight our supplier contracts and SLAs are, crises happen, and that’s when the quality of a business and its management is tested. One sure indicator of a poorly run, bureaucratic organisation is when management hide at the first sign of trouble.

On that measure, the signs for Virgin are good. I reluctantly had to call them yesterday to deal with a problem and ended up with a good customer experience.

The very helpful Ruby not only called me back when the line dropped out, she also revealed that she was a PA, not a regular call centre worker, and that all the office staff, including managers, were manning the phones.

Ruby turned out to be a real gem, not only quickly fixing my problem but also wiping out the additional charges without prompting.

That at least is an encouraging sign about their organisation and I hope Ruby and her colleagues get a thank you from the man with the beard when the problems settle down.

Virgin’s problems, though, show us that as business owners and managers we need to understand where the points of failure are in our organisations and how we would deal with them should bad luck strike.

You might want to walk around your organisation, sit down with your staff and work through where its points of failure, both human and technological, may be.
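If it helps to make that walk-around concrete, here is a minimal sketch of one way to record what you find. It assumes you have listed each critical capability alongside everyone or everything that can currently provide it. The inventory entries, the names Sam and Alex, and the function name are all hypothetical, made up for illustration.

```python
# Hypothetical inventory: each critical capability mapped to the people or
# things that can currently provide it. All entries are illustrative only.
capabilities = {
    "booking system": ["supplier hardware"],
    "fuse box keys": ["Sam"],
    "server password": ["Sam", "Alex"],
    "safe combination": ["Alex"],
}


def single_points_of_failure(inventory):
    """Return, sorted by name, the capabilities with one provider or none."""
    return sorted(
        name
        for name, providers in inventory.items()
        if len(set(providers)) <= 1  # duplicates of one provider still count as one
    )


if __name__ == "__main__":
    for name in single_points_of_failure(capabilities):
        print(f"Single point of failure: {name}")
```

Anything this flags is a candidate for a backup: a second key holder, a spare device or a documented manual procedure.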

Author: Paul Wallbank

Paul Wallbank is a speaker and writer charting how technology is changing society and business. Paul has four regular technology advice radio programs on ABC, a weekly column on the smartcompany.com.au website and has published seven books.
