Garbage in & out in big data & the internet of things | Paul Wallbank

UK tech site The Register reports that Google Flu Trends has been a dismal failure, with the service over-reporting the incidence of influenza by a factor of nearly 12.

The reason for this problem is that the algorithm used to detect a flu outbreak relies on people searching for the terms ‘flu’ or ‘influenza’, and it turns out we tend to over-react to a dose of the sniffles.

Google Flu Trends’ failure illustrates two important things about big data – the veracity of the data coming into the system and the validity of the assumptions underlying the algorithms processing the information.

In the case of Google Flu Trends both were flawed; the algorithm was based on incorrect assumptions while the incoming data was at best dubious.

The latter point is an important factor for the Internet of Machines. Instead of humans entering search terms, millions of sensors are pumping data into the system, so bad data from one sensor can have catastrophic effects on the rest of the network.
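To make the sensor point concrete, here is a minimal sketch of one way a network might screen incoming readings before trusting them. It is an illustration only: the sensor names, the sample values, the `split_readings` function and the 3.5 threshold are all assumptions of mine, and the technique (a median absolute deviation check) is just one common approach, not a description of how any particular system works.

```python
from statistics import mean, median


def split_readings(readings, threshold=3.5):
    """Split sensor readings into (trusted, suspect) dictionaries.

    A reading is suspect when its distance from the median of all readings
    is large relative to the median absolute deviation (MAD).
    The threshold of 3.5 is an assumed, commonly used rule of thumb.
    """
    values = list(readings.values())
    center = median(values)
    mad = median(abs(v - center) for v in values)

    trusted, suspect = {}, {}
    for sensor_id, value in readings.items():
        if mad == 0:
            # At least half the sensors agree exactly; anything different is suspect.
            is_suspect = value != center
        else:
            # 0.6745 rescales the MAD so the score is comparable to a z-score.
            is_suspect = 0.6745 * abs(value - center) / mad > threshold
        (suspect if is_suspect else trusted)[sensor_id] = value
    return trusted, suspect


# Hypothetical temperature readings; sensor "s4" is faulty.
readings = {"s1": 20.1, "s2": 19.8, "s3": 20.4, "s4": 85.0, "s5": 20.0}
trusted, suspect = split_readings(readings)

print("suspect sensors:", sorted(suspect))
print("average with bad sensor:   ", round(mean(readings.values()), 2))
print("average without bad sensor:", round(mean(trusted.values()), 2))
```

In this made-up example a single faulty sensor drags the network average from roughly 20 degrees to over 33, which is the kind of distortion that screening is meant to catch. A check like this only helps when most sensors are reporting sensibly; it does nothing about flawed assumptions in whatever consumes the data afterwards.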

As managing data becomes a greater task for businesses and governments, making sure that data is trustworthy will be essential, and the rules that govern how the information is used will have to be robust.

Hopefully the lessons of Google Flu Trends will save us from more serious mistakes as we come to depend on what algorithms tell us about the data.

Author: Paul Wallbank

Paul Wallbank is a speaker and writer charting how technology is changing society and business. Paul has four regular technology advice radio programs on ABC, a weekly column on the smartcompany.com.au website and has published seven books.

