Tag: big data

  • Hacking the hacks

    Hacks and Hackers is an informal global network of meetings discussing the intersection of technology and journalism. The inaugural Sydney Hacks and Hackers meetup recently looked at how journalists use data and showed the challenges the news media face in an age where information isn’t scarce.

    The panel in Sydney were Sharona Coutts, Investigative Reporter at Global Mail; Edmund Tadros, Data Journalist at Australian Financial Review; and Courtney Hohne, Director of Communications Google Australia.

    Courtney looked at some of the big data opportunities for journalists, a topic covered in the Closed Data Doors post. One of the areas she highlighted was emergency services sending out PDFs of updates during crises like bushfires and floods.

    Listening to Sharona and Edmund, it was clear they were two overworked but keen young journalists who had neither the resources nor the training to deal with the data flowing into their organisations.

    Because journalists in modern media organisations don’t have the skills or the resources to properly understand and use raw data, the public ends up with relatively trivial stories like league tables of school exam results or council building approvals. Both are important, but they are misread and used to confect outrage against incompetent public servants and duplicitous politicians.

    For the public servant, school teacher or even bus driver it’s understandable they don’t want their performance measured if the measure is going to be misused and possibly jeopardize their jobs.

    A deeper problem for journalism is the skills of the trade. Both Edmund and Sharona are smart young journos who will go far, but both admitted they had no training in statistics and mathematics.

    Even more worrying are the older journalists. When I mentioned the lack of older and more experienced journalists to the organiser, she said none would agree to come on the panel. One suspects this is because forty and fifty year old journalists have even fewer data skills than their young colleagues.

    This lack of skills or understanding of data is probably one of the biggest challenges facing the media. In a world awash with data, the role of journalists is to filter the feed, interpret and explain it.

    Pure reportage is being overwhelmed by the sheer quantity of news and information available; the 1980s model of opinion based journalism is also failing as the audience now realise they have a voice, and better informed opinions, than the experts and columnists.

    One of the notable themes that seemed to jump out of the evening was the divide between journalists and the wider community that always seems to appear when the future of journalism is discussed.

    Usually this is expressed in terms of those employed by major mastheads sneering at “citizen journalists”, but at Hacks and Hackers it was about “geeks and journos coming together.”

    In reality there is no divide – good analytic and technology skills should be as much a part of journalism as any other field in a modern economy.

    The fear from the Sydney Hacks and Hackers night is that the media industry is one of the sectors that’s failing to deal with technological change.

    It’s hard not to think that journalists wondering at the power of spreadsheets and pivot tables is like 18th Century blacksmiths trying to figure out how steam engines can make better horseshoes.

    For an industry that is so deeply challenged by technological change, it seems the news media is still unprepared for the changes that hit nearly a decade ago.

  • Deep mining business data

    This post originally appeared as Mining Business Data: Get ready to drill and excavate, the June 28, 2012 post on Smartcompany.

    The IT industry loves buzzwords and the phrase coming into fashion is “big data”. Forget “social media” or “cloud computing”, much of what you’ll be reading about in columns like this over the next few years will be about mining the information piling into our businesses.

    Big data’s power is illustrated in yesterday’s report that Mac users will pay more for hotels than those on Windows systems. So travel site Orbitz now plans to headline more expensive options to visitors using Apple computers.

    While social media and cloud computing are falling out of favour, we shouldn’t discount those old-fashioned terms as they are two of the main drivers of big data. Cloud computing makes it possible to crunch data cheaply, while social media is generating even more data for people to play with.

    Sydney business Roamz is a good illustration of how social media, cloud computing and big data come together. Born out of founder Jonathan Barouch’s desire to find local activities for his young child, Roamz pulls together data from Facebook, Twitter and Foursquare to build a picture of what’s interesting in your neighbourhood.

    It’s no coincidence direct marketing giant Salmat has invested in Roamz, as the data being gathered allows companies to paint a comprehensive picture of what their customers like.

    Another business making sense of big data is Kaggle, set up by former Reserve Bank economist Anthony Goldbloom. Kaggle is a crowdsourcing service which runs data analysis competitions on business problems through to things like HIV research, chess ratings and dark matter exploration.

    Like most things in the computing industry, these services were only available to those who could afford supercomputers a few years ago; today they’re available to anyone with a credit card and internet connection.

    The era of big data might help us overcome Pareto’s Principle – otherwise known as the 80/20 rule – that 80% of your profits come from 20% of your customers. We can also be sure that 80% of our problems come from a different 20% of clients.

    Having the ability to crunch numbers quickly means we can identify the good payers and problem customers before we do work for them. It might also mean we beat another business truism by finally being able to identify the 50% of our advertising budget that is being wasted.
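    As a rough illustration of the 80/20 idea, the sketch below ranks customers by profit and reports how much of the total comes from the top fifth. The customer names, the figures and the `top_share` helper are all invented for this example; real numbers would come from your own accounting or analytics data.

```python
def top_share(profits, fraction=0.2):
    """Share of total profit contributed by the top `fraction` of customers."""
    ranked = sorted(profits.values(), reverse=True)
    # Always look at at least one customer, even for tiny lists.
    top_n = max(1, round(len(ranked) * fraction))
    total = sum(ranked)
    return sum(ranked[:top_n]) / total if total else 0.0


# Hypothetical figures, made up purely for illustration.
profits = {
    "Customer A": 5200, "Customer B": 2800, "Customer C": 400,
    "Customer D": 350, "Customer E": 300, "Customer F": 250,
    "Customer G": 250, "Customer H": 200, "Customer I": 150,
    "Customer J": 100,
}

share = top_share(profits)
print(f"Top 20% of customers contribute {share:.0%} of profit")
```

    The same ranking, run over complaints or overdue invoices instead of profit, would point to the other 20% of clients.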

    Even if you don’t think your business is ready to dive into the world of big data, it’s worthwhile at least having a look at your website analytics to see what your customers are looking at.

    Who knows? Like Orbitz, you might identify those cashed up Mac users who love spending money.

  • Connecting the data dots

    One of the connected world’s weaknesses is that it’s fragmented, as various silos of data appear in the different social and cloud services.

    Bringing those sources together in a way that’s useful and relevant is one big opportunity for entrepreneurs.

    Sydney company Roamz is one of the businesses looking at this opportunity by bringing together a user’s Facebook, Twitter and Foursquare feeds to figure what interesting stuff is happening locally.

    Roamz’s CEO and founder Jonathan Barouch has a vision to “cut out the noise” from social media services by “curating and cleaning the data”.

    The idea of curation isn’t new in the online world; it is probably one of the biggest challenges for everyone on the web as we find ourselves swamped with data. To date, much of the idea of ‘curation’ has been around news sources where services like Google News try to deliver relevant current affairs to the user’s desktop.

    Social media sites are particularly in need of curation, given your friends in Nevada aren’t much help when you’re looking for a good coffee shop in Melbourne.

    This is the problem Roamz seeks to solve and we’re seeing this with various other services, not least the social media platforms themselves as Facebook tries to extend its reach and Google attempts to integrate their local services with the Zagat restaurant review system and Google+.

    Some would dismiss these services as “first world problems”, after all who cares about twittering hipsters trying to find a single origin, fair trade soy latte in Broadmeadows?

    There’s a point in that view, although there is a much bigger problem for businesses in this fragmented data world in harnessing and validating various sources of market intelligence.

    Businesses that get this right will be able to target advertising and marketing much more effectively while being able to tap into what their customers think and want.

    It’s no accident therefore that one of Roamz’s major investors is consumer communications giant Salmat, who can deliver great value to their corporate customers through supplying this data and market intelligence.

    The next IT buzzphrase is “Big Data”, where businesses deal with this flood of information that is swamping all of us. By understanding customers and their behaviour, businesses become far more efficient and cost effective.

    Bringing data together and making sense of the results is the big challenge of our times; those who can solve the problem will be among the next generation of business leaders.

  • Triangulating privacy out of our lives

    Lost among the noise of Facebook’s rumoured plans to launch a kids’ network, there are quiet pressures developing as consumers start to realise the value of their data – the pressure to regulate social media.

    In his Rethinking Privacy in an Era of Big Data, New York Times writer Quentin Hardy raises some of the issues about the data which is being collected about us.

    One of the big areas is triangulation – building a picture of somebody based upon seemingly unrelated data. Quentin explains it in the example of somebody who might be looking for a job.

    There are other ways in which we can lose control of our privacy now. By triangulating different sets of data (you are suddenly asking lots of people on LinkedIn for endorsements on you as a worker, and on Foursquare you seem to be checking in at midday near a competitor’s location), people can now conclude things about you (you’re probably interviewing for a job there) that are radically different from either set of public information.

    The key word of course is “conclude” – we base an assumption on what we think we know. It could turn out those LinkedIn endorsements are part of a performance review and the competitor’s location is right next door to a hot new lunch spot.

    We should also keep in mind the value of this data is asymmetric: to a third party it is worth little, if anything, but to the individual its release could mean losing a job or other major consequences.
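    To make the mechanics concrete, here is a minimal sketch of that kind of triangulation: two harmless-looking lists are matched on a person’s name to produce a guess. Everything in it is invented for illustration, including the names, the numbers, the threshold and the `possible_job_seekers` helper; no real LinkedIn or Foursquare data or API is involved.

```python
# Hypothetical public signals, invented for illustration only.
endorsement_requests = {"alex": 12, "sam": 1, "kim": 8}  # requests sent this month
midday_checkins = {
    "alex": ["Competitor HQ", "Competitor HQ"],
    "sam": ["Competitor HQ"],
    "kim": ["Corner Cafe"],
}


def possible_job_seekers(requests, checkins, place, min_requests=5):
    """Names that show up in both signals. This is a guess, not a fact."""
    return sorted(
        name
        for name, count in requests.items()
        if count >= min_requests and place in checkins.get(name, [])
    )


suspects = possible_job_seekers(endorsement_requests, midday_checkins, "Competitor HQ")
print(suspects)
```

    The code flags a name without knowing anything about performance reviews or lunch spots, which is exactly why the conclusion can be wrong.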

    A good example of this is the story of how a UK hospital trust lost highly sensitive health records of thousands of patients, including those being treated for HIV.

    The trust ended up being fined £325,000 but that fine is trivial compared to the massive individual cost from just one of those records being released.

    Fines are a lousy way of enforcing privacy anyway, as the financial penalties are just passed onto shareholders or taxpayers.

    The only meaningful sanction for failures like the Brighton General Hospital breach is holding individuals, particularly managers, personally responsible.

    As we saw in the successive Sony security breaches last year, most organisations aren’t interested in holding their senior managers responsible for even the most egregious data failures.

    This failure of the corporate sector to protect consumer data will almost certainly drive calls for government regulation and sanctions.

    Microsoft researcher Danah Boyd flags this regulation issue in Quentin Hardy’s New York Times piece. “Regulation is coming,” she says. “You may not like it, you may close your eyes and hold your nose, but it is coming.”

    Danah also makes an important point that users – particularly kids – have developed tactics to obscure their ‘digital footprints’.

    For Danah, and others trying to understand what is happening online, this causes a problem: “When I started doing my fieldwork I could tell you what people were talking about. Now I can’t.”

    These tactics of creating dummy social media profiles and using euphemisms are a huge threat to the business plans of social media services and the “identity services” desired by Google’s Eric Schmidt.

    As data becomes less reliable, or more difficult to triangulate, the value of it to advertisers falls.

    It may well be that regulation of social media and web services ends up not being necessary as users become more net savvy. For medical and other personal data though, it’s clear we have to rethink the way we use and store it.

  • Do you want to be the personal lubricant guy?

    Nick Bergas is a multimedia producer in Iowa City, but to Facebook he’s a live advertisement for personal lubricant.

    As the New York Times reports, last Valentine’s Day Nick saw an Amazon listing for a 55 gallon drum of personal lubricant, ticked the product’s Facebook “Like” button and added a witty comment to his friends.

    Shortly afterwards, Nick’s face started appearing in Facebook sponsored posts for big drums of personal lubricant.

    Last year I wrote The Privacy Processors on how Facebook is using our personal data and Nick’s story is a good example of how every like, relationship or comment is potential fodder for Facebook’s marketing platform.

    While Nick seems pretty chilled about his Facebook celebrity, for some it might not be so benign.

    As we’ve seen for student teachers and others, an innocent or even funny posting may be a problem to those without perspective or a sense of humour.

    For Facebook and other social media services, Nick’s story also illustrates a problem – that of “Garbage In, Garbage Out”.

    While one of Facebook’s major assets is its huge user database, there’s no guarantee the data is accurate or useful.

    Selling Nick’s details to a bulk medical lubricant wholesaler is pretty pointless, but that sort of intelligence is key to the future value of Facebook.

    That much of the data gathered may be inaccurate or useless is the flaw at the heart of Facebook’s big data aspirations and Google’s hopes to become an identity engine with Google+.

    For us mere individuals, the lesson is we need to be a little bit careful about pressing those “like” buttons; explaining your affinity with bulk lubricants could be a bit tricky with your mum or partner.
