Skills, data scientists and the decade’s big IT trends

As the amount of data flooding into our lives explodes, we’ll all need to think about how we can get the skills to manage and understand data.

As we all get buried under a tsunami of data, the challenge is managing it. The MIT Technology Review this week looks at the rise of the data scientist, a job title unknown a few years ago.

The problem for industry is that the skill sets required to become a data scientist are fairly esoteric.

Data scientist has become a popular job title partly because it has helped pull together a growing number of haphazardly defined and overlapping job roles, says Jake Klamka, who runs a six-week fellowship to place PhDs from fields like math, astrophysics, and even neuroscience in such jobs. “We have anyone who works with a lot of data in their research,” Klamka says. “They need to know how to program, but they also have to have strong communications skills and curiosity.”

Over the last twenty years we’ve done a pretty poor job of teaching maths and statistics, which is going to create a skills shortage as industry struggles to find people qualified to figure out what all of this data means.

While Big Data might be to this decade what plastics were to the 1960s, it’s not the only technology change affecting business; the McKinsey Quarterly describes ten IT trends for the decade ahead.

The thing that really stands out in McKinsey’s predictions is the degree of reskilling the workforce is going to need; today’s workers are going to need an understanding of programming, logic and statistics as much as the kids currently at school.

If you’re planning on being in the workforce at the end of this decade, right now may be the time to consider getting some of these skills.

Just as businesses will be separated by how they use Big Data, workers too may find those skills divide the winners from the losers.

Can maps change the way we work?

Big data and mobile computing are changing the way business operates as maps become an important part of our normal work and leisure time.

“Work the Way You Live” is Google’s motto for their enterprise maps service which the search engine giant hopes to make as ubiquitous in business as it is in the home.

At Google Atmosphere the company showed off its mapping technology and how it can be used by large organisations. It’s a compelling story.

The technology behind Google Maps is impressive – twenty petabytes of images, one billion active monthly users, 1.6 million map tiles served every second and a target of getting those tiles onto the user’s screen within ten milliseconds.

Maps are one of the Big Data applications that cheap computing makes possible; until a few years ago even desktop computers would have struggled with the sort of mapping technology that we take for granted on our smartphones today.

Rethinking products

Adding mapping technologies to products allows businesses to rethink their products. A good example of this is the internet connected treadmill.

Using the treadmill a jogger, or a walker, can map out a route anywhere in the world and the screen will show them the Google Street View as they travel along the route. The treadmill even adjusts to the changing gradients.

The Google Maps driven treadmill is a trivial example of the internet of machines, but it gives a hint of what’s possible.

The search for truth

The success of a map depends on whether it can be trusted – this is what caught Apple out with their mapping application, which was released before it was ready for prime time. Google, like most cartographers, takes errors and changes seriously.

In the early days of Google Maps, the company would pass errors and changes onto the private and government mapping providers they licensed the data from. It could take months to fix a problem.

“It was really hard, you have to get maps from all over the world to create the product,” says Louis Perrochon, the engineering director of Google Maps for Business.

“That’s a limitation if you work with third party data so we started a project called Ground Truth where we build our own maps.”

Google pulls together its Street View data, satellite images and information sent in from the public through their Map Maker site and the Maps Engine Lite to build an accurate map of an area.

Changing consumer behaviour

Having accurate and accessible maps has changed the way consumers behave; “this revolution hasn’t happened slowly,” says Google Enterprise Director Richard Suhr, “it’s happened really quickly.”

“Customers have become savvy about spatial. What this means is that businesses are starting to rethink the problem.”

“What are the exciting things I can do with maps, what else can I do with my data.”

That’s a big question for all businesses – how they use the massive amount of information in their organisation will separate the winners from the also-rans over the next decade. Maps are one way to visualise that data.

While Google Atmosphere was a marketing event for the company’s mapping technologies, the message is clear – mapping is changing the way we work and play, and it’s affecting business.

How is mapping changing the way your business works?

Sports cars, the cloud and the need for broadband

How the V8 Supercar races use the internet and networks shows why businesses need reliable communications and the way organisations are using cloud computing.

My relationship with sports cars is similar to horses – I have a vague idea of which end water goes in and where not to stand.

So Microsoft’s invite to the Launceston V8 Supercars to showcase their Office 365 cloud service as the race’s official sponsor wasn’t expected, but it was a good opportunity to see how a sports organisation uses modern technology.

Riding the cloud

V8 Supercars David Malone and Peter Trimble

At the opening media conference V8 Supercars CEO David Malone and Finance Director Peter Trimble described the IT problems the organisation had in the early days.

“We were penny wise and pound foolish,” said Peter about their small business system that couldn’t grow with the event.

To properly meet their needs V8 Supercars would have needed a bank of servers, cumbersome remote access software and a full time team of several IT staff for their scattered workforce and constantly changing locations.

With cloud services, they eliminated many IT costs while simplifying their systems.

That staff can now access documents regardless of location makes this a very good case study of where the cloud works well, and it’s understandable that Microsoft wanted to show off what their services can do.

Networking the cars

When challenged about the point of car racing, enthusiasts cite how the sport is a test bed for the motor industry.

The motor industry is one sector leading the internet of machines, with one car manufacturing executive recently describing modern motor vehicles as “computer platforms” on wheels.

Pit crews monitoring in car systems

Eventually we’ll see our cars connected to the net and reporting everything from the engine’s servicing needs to the driver’s musical tastes.

That’s reality in today’s high performance racing; both the drivers and the cars are in constant contact with the crews as sensors report everything from engine performance to the foot pressure the driver is putting on the accelerator pedal.

As continuous data feeds from the cars are essential to the teams, the event has its own trackside network with receivers located along the course that are used for both vehicle telemetry and the video feeds from car mounted and fixed cameras.

Owning the rights

In what’s becoming the future of sports broadcasting, the V8 Supercars organisers run their own camera crews and provide the feed to their broadcast partners and media outlets.

This allows them to control all the rights across TV, cable and online channels.

Having full control of the pictures also gives the V8 Supercars more revenue through signage and sponsorship by guaranteeing advertising placements which wouldn’t be available if they didn’t manage the feed.

Connectivity matters

Spaghetti Junction as the various feeds come together

Getting the images out to the media and broadcast partners along with delivering the in car data to the racing teams is a major challenge for organisers. The communications centres resemble a giant bowl of cable spaghetti as various groups plug into the network.

It’s no coincidence that part of the deals the V8 Supercar management strike with track owners and governments includes providing fibre and microwave links to the venue.

That single factor illustrates how vital communications links are to a modern sporting event.

Another important factor is that everything has to be packed up and taken away. Following Launceston, the entire show is packed up and moved on to Auckland, New Zealand. This in itself is a major logistical challenge which would fail without good connectivity and reliable systems.

the fleet of trucks ready to move on

It’s easy to dismiss the V8 Supercars as a bunch of testosterone driven rev-heads, but the challenges in staging these complex events fifteen times a year shouldn’t be underestimated.

We also shouldn’t underestimate how important communication links are to any business. It’s why debates about the need for high speed internet services are last century’s discussion.

Fifty trillion shades of grey

Something that’s missed when we talk about Big Data is the risk of false positives – if you dip into the stream, you can prove anything against a person.

“If you give me six lines written by the hand of the most honest of men, I will find something in them which will hang him,” said the 17th century French politician Cardinal Richelieu.

Today those six lines could be written on a social media site or be six disparate points drawn from a database. Without context those six lines could condemn us.

The world isn’t black or white; there are fifty trillion shades of grey, and that’s why it’s important to think before posting an image on the web, firing someone or calling the cops.

In an era where we’re quick to judge and condemn people, the stakes are very high.
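The false positive risk is easy to put rough numbers on. The sketch below applies Bayes’ rule to made-up figures; the prevalence, hit rate and false alarm rate are assumptions chosen for illustration, not measurements from any real system. It shows how a screen that sounds accurate can still be wrong about most of the people it flags.

```python
def chance_flag_is_right(prevalence: float, hit_rate: float,
                         false_alarm_rate: float) -> float:
    """Probability that a flagged person is a genuine match (Bayes' rule)."""
    true_flags = prevalence * hit_rate                # genuine matches that get flagged
    false_flags = (1 - prevalence) * false_alarm_rate  # innocent people that get flagged
    return true_flags / (true_flags + false_flags)


# Hypothetical figures: 1 person in 1,000 is a genuine match, the screen
# catches 99% of genuine matches and wrongly flags 1% of everyone else.
share_correct = chance_flag_is_right(prevalence=0.001, hit_rate=0.99,
                                     false_alarm_rate=0.01)
print(f"{share_correct:.1%} of flagged people are genuine matches")
```

With those assumed figures, only around one flag in eleven points at a genuine match; the rest are false positives.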

Recruiting big data

Software company Evolv is an example of how businesses can use big data

One of the predictions for 2020 is that the decade’s business successes will be those who use big data well.

A good example of a big data tool is recruitment software Evolv that helps businesses predict not only the best person to hire but also who is likely to leave the organisation.

For employee retention, Evolv looks at a range of variables which can include anything from gas prices and social media usage to local unemployment rates then pulls these together to predict which staff are most likely to leave.

“It’s hard to understand why it’s radically predictive, but it’s radically predictive,” VentureBeat quotes Jim Meyerle, Evolv’s cofounder.

There are some downsides to such software though – as some of the comments on the VentureBeat story point out, blind faith in an algorithm can destroy company morale and much more.

Recruiters as an industry don’t have a good track record of using data well; they’ve had candidate databases for two decades, and stories abound of poor use of keyword searches by lazy or incompetent headhunters. The same is now happening with agencies trawling LinkedIn for candidates.

Using these tools and data correctly is going to separate successful recruitment agencies and HR departments from the also-rans.

It’s the same in most businesses – the tools are available and knowing how to use them properly will be a key skill for this decade.

Job classifieds image courtesy of Markinpool through SXC.HU

You call that a graph?

A good chart can help tell a story; all too often, though, graphs are designed to mislead.

One way to illustrate a story is with charts. All too often though misleading graphs are used to make an incorrect point.

A Verge story on Groupon shows how to get graphs right – its charts are clear and simple, and they tell the story of how the group buying service’s valuation soared and then plunged while it has never really been profitable.

The vertical axis is the key to getting a graph right; cutting off most of the y-axis’ range is an easy way to mislead people with graphs. In this case you can see the full extent of Groupon’s valuation, profit and loss over the company’s short but troubled history.
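To see why a cut-off axis misleads, it helps to compare how tall two bars look when the axis starts at zero and when it starts just below the smaller value. This is a minimal sketch with invented numbers (they are not Groupon’s figures), and the function name is only for illustration.

```python
def apparent_ratio(smaller: float, larger: float, axis_start: float = 0.0) -> float:
    """How many times taller the larger bar looks when the axis starts at axis_start."""
    if axis_start >= smaller:
        raise ValueError("axis_start must be below the smaller value")
    return (larger - axis_start) / (smaller - axis_start)


# Hypothetical values: the second figure is 10% bigger than the first.
honest = apparent_ratio(100, 110)                    # axis starts at zero
truncated = apparent_ratio(100, 110, axis_start=95)  # axis starts at 95

print(f"Full axis: the larger bar looks {honest:.1f} times as tall")
print(f"Cut-off axis: the larger bar looks {truncated:.1f} times as tall")
```

A ten per cent difference drawn on an axis that starts at 95 looks like a threefold difference, which is the trick to watch for.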

Since its inception, The Verge has been showing other sites how to tell stories online; their Scamworld story exposing the world of affiliate internet marketing sets the bar.

Using graphs well is another area where The Verge is showing the rest of the media – including newspapers – how to do things well.

For Groupon, things don’t look so good. As The Verge story points out, the company’s income largely tracked its workforce, which grew from 126 at the start of 2010 to over 5,000 by April of 2011, illustrating how the business was tied to sales teams generating turnover.

The spectacular growth of Groupon and other copycat businesses couldn’t last and hasn’t. The challenge for Groupon’s managers is to now build a sustainable business.

For investors, those graphs of Groupon’s growth were a compelling story. Which is another reason why we all need to take care with what we think the charts tell us.

Graph image courtesy of Striker_72 on SXC.HU

Dealing with the data explosion

Supplying the mobile base stations for data hungry customers is one of the great challenges for telcos. How they resolve this will create some unusual alliances.

“Last year’s mobile data traffic was nearly twelve times the size of the entire global Internet in 2000.”

That little factoid from Cisco’s 2013 Visual Networking Index illustrates how the business world is evolving as various wireless, fibre and satellite communications technologies deliver faster access to businesses and households.

Mobile data growth isn’t slowing; Cisco estimates global mobile data traffic at 885 petabytes a month and expects it to grow fourteen-fold over the next five years.
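A fourteen-fold rise over five years is easier to grasp as a yearly rate. The sketch below works that out from the two figures quoted above, assuming steady compound growth; that even pace is an assumption for illustration, not Cisco’s year-by-year forecast.

```python
def implied_annual_growth(total_multiple: float, years: int) -> float:
    """Yearly growth rate that compounds to total_multiple over the given years."""
    return total_multiple ** (1 / years) - 1


def project(start: float, total_multiple: float, years: int) -> list[float]:
    """Traffic at the end of each year, assuming the same growth rate every year."""
    rate = implied_annual_growth(total_multiple, years)
    return [start * (1 + rate) ** year for year in range(years + 1)]


rate = implied_annual_growth(14, 5)
path = project(885, 14, 5)  # petabytes a month, starting from the quoted 885

print(f"Implied growth: {rate:.0%} a year")
print(f"Traffic after five years: {path[-1] / 1000:.1f} exabytes a month")
```

Under that assumption traffic grows by roughly 70 per cent every year, ending at about 12.4 exabytes a month.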

Speaking at the Australian Cisco Live Conference, Dr. Robert Pepper, Cisco Vice President of Global Technology Policy, and Kevin Bloch, Chief Technical Officer of Cisco Australia and New Zealand, walked the local media through some of the Asia-Pacific results of the Visual Networking Index.

Dealing with this sort of data load is going to challenge telcos, who were hit badly by the introduction of the smartphone and the demands it put on their cellphone networks.

One way to deal with the data load is heterogeneous networks, or HetNets, where phones automatically switch from the telcos’ cellphone systems to local wireless networks without the caller noticing.

The challenge with that is what’s in it for the private property owners whose networks the telcos will need to access for the HetNets to work.

One of the solutions in Dr Pepper’s opinion is to give business owners access to the rich data the telcos will be gathering on the customers using the HetNets.

This Big Data idea ties into PayPal’s view of future commerce and shows just how powerful pulling together disparate strands of information is going to be for businesses in the near future.

But many landlords and wireless network owners are going to want more than just access to some of the telco data – and we can be sure that the phone companies are going to be careful about what customer data they share with their partners.

It may well be that we’ll see telcos providing free high capacity fibre connections and wireless networks into shopping malls, football stadiums, hotels and other high traffic locations so they can capture high value smartphone users.

One thing is for sure: fibre connections are necessary to carry the data load.

Anyone who thinks the future of broadband lies in wireless networks has to understand that the connections to the base stations don’t magically happen – high speed fibre is essential to carry the signals.

Getting both the fibre and the wireless base stations in place is going to be one of the challenges for telcos and their data hungry customers over the next decade.

Paul travelled to the Cisco Live event in Melbourne courtesy of Cisco Systems.

Retail and the internet of machines

PayPal and eBay are using the internet of machines to put service station cashiers out of work.

Online retail and payment giants eBay and PayPal hosted a media lunch in Sydney yesterday to publicise their Australian Business Update.

While eBay dominates the online selling market, PayPal’s position in the payments marketplace is extremely powerful, with internet monitoring company Comscore reporting in their Digital Wallet Roadmap that PayPal dominates the US market and does likewise in other markets like Australia.

PayPal's US market lead

Their update confirms the trends which have been obvious for some time, particularly in how mobile devices are now driving retail. eBay’s research indicates properly implemented multichannel strategies drive six times more sales than just having an online presence.

What was particularly notable with eBay’s presentation was how the Internet of Machines is changing the retail and logistics industries as smartphones and connected point of sales systems are cutting out jobs and middle men.

PayPal are particularly proud of their US partnership with cash register manufacturer NCR that integrates smartphone payments with the point of sales systems in restaurants, convenience stores and gas stations.

eBay illustrated this with their examples of coupon offers being tied to smartphone payment systems so people paying for gas with their smartphone get a voucher offer for various up sells.

Studies in the US have found a $10 offer can result in sales of up to $100. A pretty compelling deal for most merchants.

With these technologies, we’re seeing how connected machines are changing even the most mundane business tasks.

It may well be that the days of the service station cashier are numbered; it’s quite possible that in one generation we’ll have gone from fully staffed gas stations to totally automated facilities.

Gas station attendants and cashiers are just one example of how automation is changing many retail and sales tasks. It would be a brave person who says their job is safe.

Smelling digital garbage

Excel spreadsheets lie at the core of business computing, but what happens when they go wrong?

James Kwak writing in the Baseline Scenario blog describes how Excel spreadsheets have an important role in the banking industry and their key role in one of the industry’s most embarrassing recent scandals.

In the early days of the personal computer it was spreadsheets that led the way; company accountants and bookkeeping clerks brought the early PCs into offices in the late 1980s to help them do their jobs.

From the accounts department, desktop computers spread through the business world and the PC industry took off.

Over time, Microsoft Excel displaced competitors like Lotus 1-2-3 and the earliest spreadsheet of all, VisiCalc, and became the industry standard.

With the widespread adoption of Excel and millions of people creating spreadsheets to help do their jobs came a new set of unique business risks.

The weakness with Excel isn’t with the program itself; it’s that the formulas in many spreadsheets aren’t properly tested and data is often put into the wrong fields.

In his story Kwak cites the JP Morgan spreadsheets that miscalculated the firm’s Value-at-Risk (VaR) for synthetic derivatives. The result was the London Whale debacle where traders were allowed to take positions – some would call them bets – exposing the bank to huge potential losses.

It turns out that faulty spreadsheets had a key role, as traders cut and pasted data between various spreadsheets and the formulas that made the calculations had basic errors.

That a bank would have such slapdash procedures is surprising but not shocking; almost every organisation has a similar setup and it gets worse as a project becomes more complex and bigger numbers become involved. The construction industry is particularly bad for this.

Often, a spreadsheet will spit out a bunch of numbers which simply aren’t correct. Someone made a mistake entering some data or one of the formulas has an error.

The business risk lies in not picking up those errors; JP Morgan fell for this and probably every business has, thankfully with less disastrous results.

My own personal experience was with a major construction project in Thailand. One sheet of calculations had been missed and the entire budget for lights – not a trivial amount in a 35 storey five star hotel – hadn’t been included in the contractor’s price.

This confirmed in my mind that most competitive construction tenders are won by the contractor who made the most costly errors in calculating their price. Little has convinced me otherwise since.

In the computer industry there’s a saying that “garbage in equals garbage out” which is true. However if the computer program itself is flawed, then good data becomes garbage.

Excel’s real flaw is that it can make impressive looking garbage that appears credible if it isn’t checked and treated with suspicion. The responsibility lies with us to notice the smell when the computer spits out bad figures.
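One way to notice the smell is to check a sheet’s figures independently before trusting them. The sketch below is a hypothetical example rather than a description of any particular tool: it recomputes a total from the line items and checks that every expected category is present. The item names, amounts and the `check_budget` function are all invented for illustration.

```python
def check_budget(line_items: dict[str, float], reported_total: float,
                 required_items: set[str], tolerance: float = 0.01) -> list[str]:
    """Return a list of problems found in a budget; an empty list means it passed."""
    problems = []
    # A whole category left out, like the missing lighting budget, is flagged first.
    for name in sorted(required_items - set(line_items)):
        problems.append(f"missing line item: {name}")
    # Recompute the total rather than trusting the figure the sheet reports.
    recomputed = sum(line_items.values())
    if abs(recomputed - reported_total) > tolerance:
        problems.append(
            f"total mismatch: sheet says {reported_total:.2f}, "
            f"items sum to {recomputed:.2f}"
        )
    return problems


# Hypothetical tender with the lighting budget left out.
tender = {"structure": 500_000.0, "plumbing": 120_000.0}
problems = check_budget(
    tender,
    reported_total=650_000.0,
    required_items={"structure", "plumbing", "lighting"},
)
for problem in problems:
    print(problem)
```

A check like this wouldn’t catch a wrong formula that happens to add up, but it does catch the two failures described above: a category left out and a total that doesn’t match its parts.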

Spreadsheet image courtesy of mmagallan through sxc.hu

Democratising Big Data

Why a not for profit disrupting Google and the Big Data industry is important for business and society

Common Crawl is a not-for-profit web crawler service that makes the data collected open for all to use. A post on the MIT Technology Review blog speculates how the initiative might spawn the next Google.

One of the problems with Big Data is that it’s held mainly by large corporations and government agencies, both of which have the tendency to keep their data private on the basis that information is power and power means money.

We see this in the business models of Facebook, Google and many of Silicon Valley’s startups; the information garnered about users is as valuable as, if not more valuable than, any utility from the product.

Initiatives like Common Crawl tilt the balance somewhat back towards consumers, citizens, and smaller businesses.

How well Common Crawl and other similar initiatives fare remains to be seen – WikiLeaks was a good example of how such projects can flare out, collapse under the weight of egos or be harassed by corporatist interests.

In search, Google are open to disruption as they tweak their results to suit initiatives like Google Plus. During the company’s earnings call earlier this week Larry Page spoke of the challenges of staying focused on the opportunities that matter; it may well be the company is more distracted from its core business than it should be.

Whether Common Crawl disrupts Google is up to history; it could just as well be a couple of kids called Sergey and Larry with a smart idea.

The imperative now though is to try and keep as much public data available for everyone to use and not lock it away for the privileged few. That will let the future Googles develop while making our societies fairer and more open.

Tracking the knowledge graph

Facebook Graph search is powerful and dangerous which means we have to be careful about what we like and who we become friends with

“Married Men Who Like Prostitutes” is a juicy search term and the results can wreck marriages, careers and lives.

This is one of the Facebook Graph searches UK tech commentator Tom Scott posted on his Actual Searches on Facebook Tumblr site which lists, mercifully anonymised, the results.

What should worry anybody who uses Facebook is that this data has been in the system all along; advertisers, for instance, have been able to target their marketing based on exactly this information. Graph Search just makes it quicker and easier to access. This is why you should be careful of what you like and who you friend online.

Tom Scott has a terrific Ignite London presentation which looks at just how vulnerable an individual becomes by oversharing online. In “I know what you did five minutes ago”, Tom finds an individual and discovers his mother’s maiden name and phone number, all within two minutes.

Facebook isn’t the only service we should be careful of; it just happens to be the one we overshare data with the most. When you start stitching together social media services with government and corporate databases then a pretty comprehensive picture can be made of a person’s likes and preferences.

The best we can hope for in such a society is that the picture is accurate, fair and doesn’t cast us in too unfavourable a light.

In some cases though that data can be dangerous, if not fatal.

As potential employers, spouses and the media can easily access this information, it might be worthwhile unliking obnoxious, racist and downright stupid stuff. There’s a very good chance you’ll be asked about it.

Customer lock in as a business asset

Barnes and Noble’s problems show how high the stakes are when locking customers into an online business.

US bookseller Barnes and Noble has been struggling for years and things aren’t getting better, reports the New York Times.

An important part of the New York Times story is the quote from a Forrester industry analyst:

“The problem is not whether or not the Nook is good,” said James L. McQuivey, a media analyst for Forrester Research. “What matters is whether you are locked into a Kindle library or an iTunes library or a Nook library. In the end, who holds the content that you value?”

Locking in customers lies at the heart of the Kindle and iTunes business model. Once users have a substantial investment in their book or music collections on one platform it’s unlikely they will go elsewhere as the costs, and risks, of moving are too great.

This doesn’t always end well for the customer and it gives online businesses great power which they often misuse.

Every online business tries to lock their customers into their ecosystem – Google, Amazon, Facebook and Apple are the most successful but every single social media and cloud service tries to make it hard for users to take their business elsewhere.

In some respects this is no different to phone companies or banks, which have historically tried to lock customers into their services, but the online social media, cloud computing and e-commerce platforms make a much more ambitious grab for their users’ data and assets like music and book collections.

The New York Times article illustrates just how critical that user lock in is to the success of online businesses. The question for us as consumers is how much we want to be locked inside the web’s walled gardens.