One of the big problems during and after Hurricane Sandy was how the cell phone network fell over.
As the Wall Street Journal describes, many parts of New York and New Jersey still didn’t have mobile phone services several days after the storm.
Yang Yeng, a shopkeeper selling batteries, candles, and flashlights on the street in front of his still darkened shop in the East Village, said his T-Mobile phone was useless in the area. The situation, he said, reminded him of the occasional cellphone-service outages where he used to live, on the outskirts of a small city in southern China.
What’s often overlooked is that mobile networks are a different product, from a different era, compared with the traditional landlines most of us grew up with.
The older landline phone systems used their own power and the batteries in most telephone exchanges had enough juice to supply the Plain Old Telephone Service (POTS). So in the event of a blackout most services kept running.
Of course POTS services could still be disrupted – a car could hit a pole on your street, those poles could burn down in a fire, your local exchange could be struck by lightning or a blackout could last longer than the telephone company’s batteries.
Most importantly, in times of major emergencies those exchanges would get overwhelmed by frantic callers trying to contact the authorities or their families.
All of the above would have happened during Hurricane Sandy, so it is somewhat unfair to single out the mobile networks for their ‘unreliability’.
There are, though, some differences with modern mobile and fibre-based networks that shouldn’t be overlooked when assessing how reliable these systems are in times of crisis or disaster.
A hunger for power
Modern communications networks need far more power than the POTS network. Fibre repeaters, cell towers and the handsets themselves can’t be sustained in the way low-powered rotary phones and mechanical telephone exchanges were.
The cost of providing and maintaining reliable batteries for these devices is a serious expense for telcos and it’s no surprise they lobbied against laws mandating their use in cell phone towers.
Even if they were installed, the fibre connections to the towers are also subject to the same problem of needing power to connect them to the rest of the network.
Of course the problem of keeping power to your handset then kicks in. Many smartphones or cordless landline handsets struggle to keep a charge for 24 hours, further reducing their effectiveness during any outage that lasts more than a day.
Even if your cellphone does keep its charge and the local tower remains running and connected to the backbone, there’s no guarantee you can get a line out.
In this respect, the modern systems suffer the same problem as the old phone networks – there’s a limit to the traffic you can stuff down the pipe.
This isn’t news if you’ve tried to make a call on your mobile at half time at a sporting event or at the end of a big concert. If there’s too much traffic, then the system starts rationing bandwidth; some people get a line out while others don’t.
Another way of managing demand during high traffic times is to ‘prioritize’ what passes over the network – voice comes first, SMS second and data a distant last.
This is why on New Year’s Eve you might be able to call your mum, but you can’t post a Facebook update from your smartphone and all your text messages come through at 5am the following morning.
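The ordering described above can be sketched as a simple priority queue. This is only an illustration of the idea, not how any particular carrier's equipment works: the `PRIORITY` table, the `serve` function and the `capacity` figure are all hypothetical names and numbers invented for the example.

```python
import heapq
from itertools import count

# Assumed ordering, taken from the text: a lower number is served first.
PRIORITY = {"voice": 0, "sms": 1, "data": 2}


def serve(requests, capacity):
    """Split requests into (served, deferred).

    requests: list of (kind, label) tuples in arrival order.
    capacity: hypothetical number of requests the cell can carry right now.
    """
    arrival = count()  # tie-breaker keeps arrival order within a traffic class
    heap = [(PRIORITY[kind], next(arrival), kind, label) for kind, label in requests]
    heapq.heapify(heap)

    served = []
    while heap and len(served) < capacity:
        _, _, kind, label = heapq.heappop(heap)
        served.append((kind, label))

    # Whatever is left waits, still ordered by class and then arrival.
    deferred = [(kind, label) for _, _, kind, label in sorted(heap)]
    return served, deferred


requests = [
    ("data", "Facebook update"),
    ("sms", "Happy New Year text"),
    ("voice", "call to mum"),
    ("data", "photo upload"),
    ("sms", "reply text"),
]
served, deferred = serve(requests, capacity=2)
print("served:", served)
print("deferred:", deferred)
```

With room for only two requests, the call and the first text get through, while the second text and both data requests wait in line, with data at the back.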
During emergencies it’s fair to assume that if the mobile network stays up, social networks won’t be the operators’ priority, and this is something not understood by those advocating reliance on social networks during disasters.
No best efforts
Probably most important to understand is the difference between the utility culture of the POTS operators and the ‘best effort’ services offered by ISPs and many mobile phone companies.
Under the ‘utility model’, the telco was run the same way as the power company and water board – largely by engineers with a focus on ensuring the network stays up 99.99% of the time.
That ‘four nines’ or ‘five nines’ reliability is expensive, and each extra decimal place means an exponential increase in costs and spare capacity.
Over the last three decades the utilities themselves have seen reliability decline, as the cost of maintaining a network that has a 24 hour outage once every three years (99.9%)* rather than three times a year (99%) interferes with a company’s ability to pay management bonuses.
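To make those percentages concrete, here is a rough sketch of the arithmetic. It assumes a 365 day year and treats every outage as a full 24 hour blackout, in line with the definition used in this post; the function names are made up for the example.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: leap years ignored


def downtime_hours_per_year(availability_percent):
    """Hours of outage per year permitted by a given uptime percentage."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


def availability_percent(outages_per_year, outage_hours=24):
    """Uptime percentage implied by a number of fixed-length outages per year."""
    return 100 * (1 - outages_per_year * outage_hours / HOURS_PER_YEAR)


for pct in (99, 99.9, 99.99, 99.999):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% uptime allows about {hours:.2f} hours of outage a year")

print(f"one 24 hour outage every three years: {availability_percent(1 / 3):.2f}%")
print(f"three 24 hour outages a year: {availability_percent(3):.2f}%")
```

On these assumptions, one day-long outage every three years works out to roughly 99.91% uptime and three a year to roughly 99.18%, which is where the rounded 99.9% and 99% figures come from. Each extra nine cuts the permitted downtime by a factor of ten, from about 87.6 hours a year at 99% to under an hour at 99.99%.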
ISPs and most cell phone networks never really had this problem as their services are based upon ‘best effort’. If you read your contract, user agreement or condition of sale you’ll find the provider doesn’t really guarantee anything except to do their best in getting you a service – if they fail, tough luck.
As we become more connected, we have to understand the limitations of our communications networks. The assumption that those systems will be around when we need them could bring us unstuck.
*The definition of uptime and what constitutes an outage varies; the definition I’ve used is a 24 hour blackout or suspension of supply in any given area.