Deep mining business data

Big data means big opportunities for businesses prepared to look at what their customers are doing

This post originally appeared as Mining Business Data: Get ready to drill and excavate, the June 28, 2012 post on SmartCompany.

The IT industry loves buzzwords and the phrase coming into fashion is “big data”. Forget “social media” or “cloud computing”; much of what you’ll be reading about in columns like this over the next few years will be about mining the information piling into our businesses.

Big data’s power is illustrated in yesterday’s report that Mac users will pay more for hotels than those on Windows systems. So travel site Orbitz now plans to headline more expensive options to visitors using Apple computers.

While social media and cloud computing are falling out of favour, we shouldn’t discount those old-fashioned terms as they are two of the main drivers of big data. Cloud computing makes it possible to crunch data cheaply, while social media is generating even more data for people to play with.

Sydney business Roamz is a good illustration of how social media, cloud computing and big data come together. Born out of founder Jonathan Barouch’s desire to find local activities for his young child, Roamz pulls together data from Facebook, Twitter and Foursquare to build a picture of what’s interesting in your neighbourhood.

It’s no coincidence direct marketing giant Salmat has invested in Roamz, as the data being gathered allows companies to paint a comprehensive picture of what their customers like.

Another business making sense of big data is Kaggle, set up by former Reserve Bank economist Anthony Goldbloom. Kaggle is a crowdsourcing service which runs data analysis competitions on business problems through to things like HIV research, chess ratings and dark matter exploration.

Like most things in the computing industry, these services were available only to those who could afford supercomputers a few years ago. Today they’re available to anyone with a credit card and an internet connection.

The era of big data might help us overcome Pareto’s Principle – otherwise known as the 80/20 rule – that 80% of your profits come from 20% of your customers. We can also be sure that 80% of our problems come from a different 20% of clients.

Having the ability to crunch numbers quickly means we can identify the good payers and problem customers before we do work for them. It might also mean we beat another business truism by finally being able to identify the 50% of our advertising budget that is being wasted.
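As a rough illustration of the 80/20 check described above, here is a minimal Python sketch. The customer names, profit figures and the `top_share` helper are all invented for this example; a real analysis would draw on your own accounting or CRM data.

```python
def top_share(profit_by_customer, top_fraction=0.2):
    """Return the share of total profit earned from the top slice of customers."""
    profits = sorted(profit_by_customer.values(), reverse=True)
    total = sum(profits)
    if not profits or total == 0:
        return 0.0
    # Always look at at least one customer, even for very small lists.
    top_n = max(1, round(len(profits) * top_fraction))
    return sum(profits[:top_n]) / total


# Made-up annual profit per customer, for illustration only.
profit_by_customer = {
    "A": 5000, "B": 3000, "C": 400, "D": 350, "E": 300,
    "F": 250, "G": 250, "H": 200, "I": 150, "J": 100,
}

share = top_share(profit_by_customer)
print(f"Top 20% of customers bring in {share:.0%} of profit")
```

Running the same calculation over support tickets or overdue invoices instead of profit would point to the "problem" 20% of clients.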

Even if you don’t think your business is ready to dive into the world of big data, it’s worthwhile at least having a look at your website analytics to see what your customers are looking at.

Who knows? Like Orbitz, you might identify those cashed-up Mac users who love spending money.

Connecting the data dots

The age of big data means big opportunities

One of the connected world’s weaknesses is that it’s fragmented, as various silos of data appear in the different social and cloud services.

Bringing those sources together in a way that’s useful and relevant is one big opportunity for entrepreneurs.

Sydney company Roamz is one of the businesses looking at this opportunity by bringing together a user’s Facebook, Twitter and Foursquare feeds to figure out what interesting stuff is happening locally.

Roamz’s CEO and founder Jonathan Barouch has a vision to “cut out the noise” from social media services by “curating and cleaning the data”.

The idea of curation isn’t new in the online world; it is probably one of the biggest challenges for everyone on the web as we find ourselves swamped with data. To date, much of the idea of ‘curation’ has been around news sources, where services like Google News try to deliver relevant current affairs to the user’s desktop.

Social media sites are particularly in need of curation, given your friends in Nevada aren’t much help when you’re looking for a good coffee shop in Melbourne.

This is the problem Roamz seeks to solve and we’re seeing this with various other services, not least the social media platforms themselves as Facebook tries to extend its reach and Google attempts to integrate its local services with the Zagat restaurant review system and Google+.

Some would dismiss these services as “first world problems”. After all, who cares about twittering hipsters trying to find a single origin, fair trade soy latte in Broadmeadows?

There’s a point in that view, although there is a much bigger problem for businesses in this fragmented data world in harnessing and validating various sources of market intelligence.

Businesses that get this right will be able to target advertising and marketing much more effectively while being able to tap into what their customers think and want.

It’s no accident therefore that one of Roamz’s major investors is consumer communications giant Salmat, which can deliver great value to its corporate customers through supplying this data and market intelligence.

The next IT buzzphrase is “Big Data”, where businesses deal with the flood of information that is swamping all of us. By understanding customers and their behaviour, businesses can become far more efficient and cost effective.

Bringing data together and making sense of the results is the big challenge of our times; those who can solve the problem will be among the next generation of business leaders.

Triangulating privacy out of our lives

Social media sites will have to deal with increased government regulation.

Lost among the noise of Facebook’s rumoured plans to launch a kids’ network, there’s a quiet pressure developing as consumers start to realise the value of their data – the pressure to regulate social media.

In his Rethinking Privacy in an Era of Big Data, New York Times writer Quentin Hardy raises some of the issues about the data which is being collected about us.

One of the big areas is triangulation – building a picture of somebody based upon seemingly unrelated data. Quentin explains it in the example of somebody who might be looking for a job.

There are other ways in which we can lose control of our privacy now. By triangulating different sets of data (you are suddenly asking lots of people on LinkedIn for endorsements on you as a worker, and on Foursquare you seem to be checking in at midday near a competitor’s location), people can now conclude things about you (you’re probably interviewing for a job there) that are radically different from either set of public information.

The key word of course is “conclude” – we base an assumption on what we think we know. It could turn out those LinkedIn endorsements are part of a performance review and the competitor’s location is right next door to a hot new lunch spot.
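To make the triangulation idea concrete, here is a small Python sketch. Every name, number and threshold in it is invented for illustration, and the `triangulate` function is a hypothetical helper rather than anything a real service is known to run. It simply flags people who show up in two unrelated sets of signals.

```python
# Hypothetical public signals, invented for this example.
endorsement_requests = {"alex": 12, "sam": 1, "kim": 9}          # requests sent this month
midday_checkins_near_competitor = {"alex": 4, "kim": 0, "lee": 5}  # check-ins this month


def triangulate(requests, checkins, min_requests=5, min_checkins=3):
    """Return names that cross both thresholds. This is a guess, not a fact."""
    return sorted(
        name
        for name in requests.keys() & checkins.keys()
        if requests[name] >= min_requests and checkins[name] >= min_checkins
    )


suspected = triangulate(endorsement_requests, midday_checkins_near_competitor)
print(suspected)
```

Neither data set says anything about job hunting on its own; the "conclusion" only appears when the two are joined, and it remains an assumption that could be flat wrong.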

We should also keep in mind that the value of this data is asymmetric: to a third party it is worth little, if anything, but to the individual its release could mean losing a job or other major consequences.

A good example of this is the story of how a UK hospital trust lost highly sensitive health records of thousands of patients, including those being treated for HIV.

The trust ended up being fined £325,000 but that fine is trivial compared to the massive individual cost from just one of those records being released.

Fines are a lousy way of enforcing privacy anyway, as the financial penalties are just passed onto shareholders or taxpayers.

The only meaningful sanction for failures like the Brighton General Hospital breach is holding individuals, particularly managers, personally responsible.

As we saw in the successive Sony security breaches last year, most organisations aren’t interested in holding their senior managers responsible for even the most egregious data failures.

This failure of the corporate sector to protect consumer data will almost certainly drive calls for government regulation and sanctions.

Microsoft researcher Danah Boyd flags this regulation issue in Quentin Hardy’s New York Times piece. “Regulation is coming,” she says. “You may not like it, you may close your eyes and hold your nose, but it is coming.”

Danah also makes an important point that users – particularly kids – have developed tactics to obscure their ‘digital footprints’.

For Danah, and others trying to understand what is happening online, this causes a problem: “When I started doing my fieldwork I could tell you what people were talking about. Now I can’t.”

These tactics of creating dummy social media profiles and using euphemisms are a huge threat to the business plans of social media services and the “identity services” desired by Google’s Eric Schmidt.

As data becomes less reliable, or more difficult to triangulate, the value of it to advertisers falls.

It may well be that regulation of social media and web services ends up not being necessary as users become more net savvy. For medical and other personal data though, it’s clear we have to rethink the way we use and store it.

Do you want to be the personal lubricant guy?

A reminder why you need to be careful with your Facebook likes.

Nick Bergas is a multimedia producer in Iowa City, but to Facebook he’s a live advertisement for personal lubricant.

As the New York Times reports, last Valentine’s Day Nick saw an Amazon listing for a 55-gallon drum of personal lubricant, ticked the product’s Facebook “Like” button and added a witty comment to his friends.

Shortly afterwards, Nick’s face started appearing in Facebook sponsored posts for big drums of personal lubricant.

Last year I wrote The Privacy Processors on how Facebook is using our personal data and Nick’s story is a good example of how every like, relationship or comment is potential fodder for Facebook’s marketing platform.

While Nick seems pretty chilled about his Facebook celebrity, for some it might not be so benign.

As we’ve seen for student teachers and others, an innocent or even funny posting may be a problem to those without perspective or a sense of humour.

For Facebook and other social media services, Nick’s story also illustrates a problem – that of “Garbage In, Garbage Out”.

While one of Facebook’s major assets is its huge user database, there’s no guarantee the data is accurate or useful.

Selling Nick’s details to a bulk medical lubricant wholesaler is pretty pointless, but that sort of intelligence is key to the future value of Facebook.

That much of the data gathered may be inaccurate or useless is the flaw at the heart of Facebook’s big data aspirations and Google’s hopes to become an identity engine with Google+.

For us mere individuals, the lesson is we need to be a little bit careful about pressing those “like” buttons; explaining your affinity with bulk lubricants could be a bit tricky with your mum or partner.

Eroding business silos

Knowledge is power, and the businesses who can share it are those who will define the 21st Century.

During our ABC radio discussion on politics and social media with Jeff Jarvis, we inevitably came around to the issue of sharing information.

We’ve covered the risks of personal sharing extensively and Jeff’s view is that our perceptions of privacy are evolving as we explore what is acceptable or tolerable in an information rich world.

Overlooked in this discussion is just how important sharing is for businesses – particularly in breaking down silos within an organisation.

As organisations grow, silos develop as various groups or departments grow to address specific functions. It’s a natural process.

However silos can damage businesses as valuable business knowledge is kept within the group rather than shared with the entire organisation.

This is the opportunity we see now in the various cloud computing, social media and big data tools that have developed to help people gather, curate and share information.

Today there is no excuse for critical customer information sitting in the call centre logs not being available to marketing, sales or management teams. That is just one example of thousands.

Over time we’ll see business owners and managers develop the skills and tools to use data more effectively. This is already happening as many IT people move from Information Technology to Knowledge Management.

Business silos won’t ever be fully eliminated; in many ways they are necessary as you can’t expect the company accountant to know everything the customer service or sales staff do.

The businesses that succeed will be those that overcome internal politics and resist the managerial urge to build little empires. Information is too important to be hoarded by middle management princelings.

In the 19th Century power came in the form of steam engines; today it comes in the form of knowledge. How well are you harnessing the power in your business?

Forget Plastics, today it’s Big Data

Big Data is the IT industry’s latest buzzword but it’s been sitting on our desktop all along

“Plastics” was the career advice to uni students in the 1967 movie The Graduate. Today the same advice to a smart young entrepreneur would be “big data”.

Big data is the current buzzword for the IT industry. We’re seeing start-ups with cool tools popping up and whole new job descriptions to manage it, while big and small businesses ponder how to use another technology in their operations.

At the end of the month, the third of the City of Sydney’s 2012 Let’s Talk Business series will see SmartCompany’s James Thomson among others discussing how data drives business.

How we use data in our business is something we’ve had to come to grips with for ages, but many of us haven’t really started to find those nuggets of value in our databases.

We’ve actually been in the era of big data for decades since computers were introduced in the workplace. One thing that PCs do very well is gather and store information.

Today computerised point-of-sales systems, database software, loyalty programs and web-tracking tools mean we have a massive amount of data about our clients at our fingertips.

As computers get more powerful and cloud-based services start making detailed data analysis more available, we’re going to see even more data pouring into our businesses.

Social media services add to the data deluge, giving us even more intelligence about our markets, individual customers and the performance of our businesses.

The problem is that many of us are already overwhelmed by what we have. The thought of even more data we can’t use causes many managers and business owners to hide under their desks and weep.

An article in MIT’s Technology Review about Peter Fader, co-director of the Wharton Customer Analytics Initiative at the University of Pennsylvania, looked at this problem.

Professor Fader’s view is that most businesses have enough data – the problem is managing what we have, along with the risk of trying to extrapolate too much from historical information.

To deal with this overload we’re seeing companies like Kaggle starting up to help us mine this data and get useful information about our businesses and customers.

What these data-mining companies are promising is the ability to see the patterns in what appears to be just a mass of confusing data.

Already we’re seeing businesses that can connect the dots get a head start on their slower competitors who don’t appreciate the value locked in their databases and CRMs.

Making sense of the data we’re accumulating is the real challenge. If we’re not paying attention to what we already have then there’s little point in gathering more.

Tickets for How Your Customer Data Can Drive New Business at the Sydney Town Hall on May 29 are still available.