Dealing with the biggest of data

The CERN research project generates huge amounts of data, but the human touch is still needed to analyse and manage the information

How do you deal with the biggest data sets of all? Bob Jones, a project leader for the European Organization for Nuclear Research – commonly known as CERN – described how the world’s largest particle physics laboratory manages 100 petabytes of data.

The first step is not to collect everything. “We can’t keep all the data, the key is knowing what to keep,” says Jones. This is understandable given the cameras capturing the collisions have 150 million sensors delivering data 40 million times per second.

Jones was speaking at the ADMA Global Conference’s Advancing Analytics stream, where he described how the organisation manages and analyses the vast amounts of data its experiments generate.

Adding to the task facing Jones and CERN’s boffins is that the data has to be preserved and verifiable so scientists can review the results of experiments.

Discovering the Higgs boson, for instance, required finding 400 positive results out of 600,000,000,000,000,000 events. This requires massive processing and storage power.
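To get a sense of that selectivity, a quick back-of-envelope calculation using the figures above (a sketch, not official CERN analysis code):

```python
# Rough selectivity of the Higgs search, using the figures quoted above.
candidate_events = 400
total_events = 600_000_000_000_000_000  # 6 x 10^17 events

selectivity = candidate_events / total_events
print(f"About 1 interesting event in every {total_events // candidate_events:,}")
print(f"Selectivity: {selectivity:.1e}")  # roughly 6.7e-16
```

That works out to around one useful event per 1.5 quadrillion recorded, which is why filtering at the source matters as much as raw computing power.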

Part of the solution is to have a chain of data centres across the world to carry out both the analytics and the data storage, supplemented by tape archiving, which creates issues of its own.

“Tape is a magnetic medium, which means it deteriorates over time,” Jones says. “We have to repack this data every two years.”

Another advantage of the two-year refresh is that it allows CERN to apply the latest advances in data storage to pack more data onto the medium.
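As a rough illustration of what the repacking cycle implies, assume for the sake of argument that the full 100 petabyte archive is migrated over the two years:

```python
# Back-of-envelope: sustained throughput needed to rewrite a 100 PB
# tape archive over a two-year repacking cycle (illustrative assumption).
archive_bytes = 100e15                       # 100 petabytes
two_years_seconds = 2 * 365.25 * 24 * 3600   # ~63 million seconds

throughput_gb_per_s = archive_bytes / two_years_seconds / 1e9
print(f"Sustained throughput: {throughput_gb_per_s:.1f} GB/s")  # ~1.6 GB/s
```

Even spread over two years, that is a sustained stream of more than a gigabyte every second just to keep the archive healthy.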

CERN itself is funded by its 21 member states – Pakistan recently became an associate member – which contribute its $1.5 billion annual budget. The organisation also provides data and processing power to other multinational projects, such as those of the European Space Agency, and to private sector partners.

For the private sector, CERN’s computing power offers the opportunity to do in-depth analytics on large data sets, while the project’s unique hardware and software requirements make it a proving ground for high-performance equipment.

Despite the high tech, Jones says the real smarts behind CERN and the Large Hadron Collider lie in the people. “All of the people analysing the data are trained physicists with detailed, multi-year domain knowledge.”

“The reason being is the experiment and the technology changes so quickly, it’s not written down. It’s in the heads of those people.”

In some respects this is comforting for those of us worrying about the machines taking over.


Towards the zero defect economy

The Internet of Things promises to eliminate defects, which is good news for most, but not all, industries

At 2.03am on July 11, 2012, a Norfolk Southern Railway Company freight train derailed just inside the city limits of Columbus, Ohio.

The resulting crash and fire forced the evacuation of over a hundred people, caused more than a million dollars in damages and created massive disruption throughout the US rail network.

Could accidents like this be avoided by the Internet of Things? Sham Chotai, the Chief Technical Officer of GE Software, believes applying sensor technology to locomotives can detect conditions like defective rails and save US railway operators around a billion dollars a year in costs.

“We decided to put the technology directly on the locomotive,” says Chotai in describing the problem facing railroad operators in scheduling track inspections. “We found we were mapping the entire railway network, and we were mapping anything that touched the track such as insulated joints and wayside equipment.”
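Chotai didn’t detail GE’s algorithms, but the mapping idea is easy to sketch: snap each GPS fix from a locomotive-mounted sensor to the nearest known trackside asset. The asset names and coordinates below are invented for illustration:

```python
import math

# Hypothetical trackside assets: (name, latitude, longitude).
ASSETS = [
    ("insulated joint 114", 39.9612, -82.9988),
    ("wayside signal 7", 39.9702, -83.0036),
]

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points."""
    radius = 6_371_000  # mean Earth radius in metres
    dlat = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1)
    a = (math.sin(dlat / 2) ** 2
         + math.cos(math.radians(lat1)) * math.cos(math.radians(lat2))
         * math.sin(dlon / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))

def nearest_asset(lat, lon):
    """Match a sensor fix to the closest known asset and its distance."""
    return min(
        ((name, haversine_m(lat, lon, alat, alon)) for name, alat, alon in ASSETS),
        key=lambda pair: pair[1],
    )

name, metres = nearest_asset(39.9615, -82.9990)
print(f"Nearest asset: {name} ({metres:.0f} m away)")
```

In a production system the lookup would run against a spatial index over millions of assets, but the principle is the same.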

This improvement in reliability, and its benefits to business, is something then-Salesforce Vice President Peter Coffee flagged in an interview with Decoding the New Economy in 2013.

“You can proactively reach out to a customer and say ‘you probably haven’t noticed anything but we’d like to come around and do a little calibration on your device any time in the next three days at your convenience.'”

“That’s not service, that’s customer care. That’s positive brand equity creation,” Coffee says.

Reducing defects isn’t just good for brands, it also promises to save lives as Cisco illustrated at an Australian event focused on road safety.

Transport for New South Wales engineer John Wall explained how smarter car technologies, intelligent user interfaces and roadside communications all bring the potential of dramatically reducing, if not eliminating, the road toll.

Should it turn out the IoT can radically reduce defects and accidents, it won’t be good news for all industries, as John Rice, GE’s Global Head of Operations, pointed out last year in observing how intelligent machines will eliminate the break-fix model of business.

“We grew up in companies with a break fix mentality,” Rice says. “We sold you equipment and if it broke, you paid us more money to come and fix it.”

“Your dilemma was our profit opportunity,” Rice pointed out. Now, he says, the engineering industry shares risks with its customers and the break-fix business is no longer the profit centre it was.

A zero defect economy is good news for customers, but for suppliers and service industries built on fixing problems it means a massive change to business.


Literacy in old and new terms

Is data literacy as important today as being able to read and write was a century ago?

I’m in Wellington, the capital of New Zealand, for the next few days for the Open Source, Open Society conference.

During one of the welcome events, Lillian Grace of Wiki New Zealand mentioned how we’re at the same stage with data literacy today as we were with written literacy two hundred years ago.

If anything, that’s optimistic. According to a wonderful post on Our World In Data, in 1815 the British literacy rate was 54%.

[Chart: world literacy rates – Our World In Data]

That low rate makes sense, as most occupations didn’t need literate workers; a hundred years later, industrial economies needed employees who could read and write.

Another notable point is that the Netherlands has led the world in literacy rates for nearly four hundred years. This is consistent with the needs of a mercantile economy.

Which leads us to today’s economy. In four hundred years’ time, will our descendants be commenting on the lack of data literacy at the beginning of the twenty-first century?


Big sports data – how tech is changing the playing field

The internet of things is dramatically changing the world of sports

“When you’re playing, it’s all about the winning but when you retire you realise there’s a lot more to the game,” says former cricketer Adam Gilchrist.

Gilchrist was speaking at an event organised by software giant SAP ahead of a Cricket World Cup quarter final at the Melbourne Cricket Ground yesterday.

SAP were using their sponsorship of the event to demonstrate their big data analytics capabilities and how they are applied to sports and the internet of things.

Like most industries, the sports world is being radically affected by digitalisation as new technologies change everything from coaching and player welfare through to stadium management and fans’ experience.

Enhancing the fan experience

Two days earlier, rival Melbourne venue Etihad Stadium in the city’s Docklands district showed off its new connected ground, where spectators will get high-definition video and internet services through a partnership between Telstra and Cisco.

While Etihad’s demonstration was specifically about ‘fan experience’, the use of the internet of things and pervasive wireless access in a stadium can range from paperless ticketing to managing the food and drink franchises.

In the United States, the leader in connected stadiums, venues are increasingly deploying beacon technologies that allow spectators to order deliveries to their seats and let operators push special offers during the game.

While neither of the two major Melbourne stadiums offers beacon services at present, the Cisco devices around Etihad Stadium can have Bluetooth capabilities added when ground management decides to roll them out.
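Neither venue has published how such a service would work, but the basic beacon pattern is simple: an app reads a beacon’s identifier and signal strength, estimates proximity and decides whether to surface an offer. The identifiers, thresholds and offers below are all made up for illustration:

```python
from typing import Optional

# Hypothetical mapping of beacon IDs to in-stadium offers.
OFFERS = {
    "gate-7-food-court": "Half-price hot chips for the next 10 minutes",
    "bay-23-merchandise": "20% off team scarves today",
}

def proximity(rssi_dbm: int) -> str:
    """Crudely bucket received signal strength into proximity zones."""
    if rssi_dbm > -60:
        return "near"
    if rssi_dbm > -85:
        return "far"
    return "out-of-range"

def offer_for(beacon_id: str, rssi_dbm: int) -> Optional[str]:
    """Surface an offer only when the spectator is near a known beacon."""
    if proximity(rssi_dbm) == "near":
        return OFFERS.get(beacon_id)
    return None

print(offer_for("gate-7-food-court", -55))  # offer fires
print(offer_for("gate-7-food-court", -92))  # too far away: None
```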

Looking after players

Probably the greatest impact of technology in sport is on player welfare. While coaches and clubs have been enthusiastic adopters of video and tracking technologies for two decades, the rate of change is accelerating as wearable devices change game-day tactics and how injuries are managed.

One of the companies leading this has been Melbourne business Catapult Sports, which has been placing tracking devices on Australian Rules footballers, and players in other codes, for a decade.

For coaches this data has been a boon, as it allows staff to monitor on-field performance and tightly manage players’ health and fitness.
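Catapult’s exact metrics are proprietary, but one commonly cited formulation of accumulated ‘player load’ sums the change in accelerometer readings across all three axes. A minimal sketch of that idea, with made-up readings:

```python
import math

def player_load(samples):
    """
    Accumulated load from a series of (ax, ay, az) accelerometer
    readings: the summed magnitude of change between consecutive
    samples. Vendors scale and filter this differently.
    """
    load = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(samples, samples[1:]):
        load += math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2 + (z2 - z1) ** 2)
    return load

# Illustrative readings in g from a wearable unit sampling at 100 Hz.
readings = [(0.1, 0.0, 1.0), (0.3, 0.1, 1.1), (0.2, -0.2, 0.9)]
print(f"Accumulated load: {player_load(readings):.2f}")
```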

Professional sports in general have been early adopters of new technologies as a small increase in performance can have immediate and lucrative benefits on the field. Over the last thirty years clubs have adopted the latest in video and data technology to help coaches and players.

As the technology develops, this adoption is accelerating; administrators are looking at placing tracking devices within the balls, goals and boundary lines to give even more information about what’s happening on the field.

Managing the data flow

The challenge for sports organisations, as with every other industry, is in managing all the data being generated.

In sports, managing that data has a number of unique imperatives: gamblers seeking access to sensitive data, broadcast rights holders wanting game statistics and stadium managers gathering their own data all raise challenges for administrators.

There’s also the question of who owns the data; the players themselves have a claim to their own personal performance data, and there could be conflicts when a player transfers between clubs.

As the sports industry explores the limits of what it can do with data, the world is changing for players, coaches, administrators and supporters.

Gilchrist’s observation that there’s a lot more to professional sports than just what happens on the field is going to become even more true as data science assumes an even greater role in the management of teams, clubs and stadiums.

Paul travelled to Melbourne as a guest of Cisco and SAP.


The high cost of distrust

A lack of trust in data is going to cost the world’s economy over a trillion dollars, a Cisco panel forecasts

A lack of trust in technology’s security could be costing the global economy over a trillion dollars, a panel at the Australian Cisco Live in Melbourne heard yesterday.

The panel, “How do we create trust?”, featured Cisco executives including John Stewart, the company’s Security and Trust lead, along with Mike Burgess, Telstra’s Chief Information Security Officer, and Gary Blair, CEO of the Australian Cyber Security Research Institute.

Blair sees trust in technology as split into two aspects. “Do I as an individual trust an organisation to keep my data secure – safe from harm, safe from breaches and so forth?” he asks. “The second is will they be transparent in using my data, and will I have control of my data?”

Stewart, in turn, sees security as a big data problem rather than a matter of rules, patches and security software. “Data driven security is the way forward,” he states. “We are constantly studying data to find out what our current risk profile is, what situations we are facing and what hacks we are facing.”

This was the thrust of last year’s Splunk conference, where NASDAQ CISO Mark Graff described how data analytics is now the front line of information security: threats are so diverse and systems so complex that it’s necessary to watch for abnormal activity rather than try to build fortresses.
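The simplest version of ‘watching for abnormal activity’ is a baseline model that flags any period whose event count sits far outside the historical norm. Real systems are far more sophisticated, but a toy sketch shows the shape of the approach:

```python
import statistics

def is_abnormal(history, current, threshold=3.0):
    """
    Flag the current count if it sits more than `threshold` standard
    deviations from the historical mean -- a toy baseline model.
    """
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return current != mean
    return abs(current - mean) / stdev > threshold

# Hourly failed-login counts over the past half day (made-up numbers).
failed_logins = [12, 9, 15, 11, 8, 14, 10, 13, 9, 12, 11, 10]

print(is_abnormal(failed_logins, 13))   # False: within normal range
print(is_abnormal(failed_logins, 240))  # True: worth investigating
```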

The stakes are high for both individual businesses and the economy as technology is now embedded in almost every activity.

“If you suddenly lack confidence in going to online sites, what would happen?” asks Stewart. “You start using the phone, you go into the bank branch to check your account.”

“We have to get many of these things correct, because going backwards takes us to a place where we don’t know how to get back to.”

Gary Blair described how the Boston Consulting Group forecast the digital economy would be worth between $1.5 and $2.5 trillion across the G20 economies by 2016.

“The difference between the two numbers was trust. That’s how large a problem it is in economic terms.”

As we move into the internet of things, that trust is going to extend to the integrity of the sensors telling us the state of our crops, transport and energy systems.

The stakes are only going to get higher and the issues more complex, which in turn is going to demand well-designed, robust systems to retain the trust of businesses and users.


Clawing back our data – Telstra makes metadata available to customers

Australia’s Telstra responds to government data legislation by opening metadata to users

Today Australia’s incumbent telco, Telstra, announced a scheme to give customers access to the personal metadata the company stores about them.

In a post on the Telstra Exchange blog, the company’s Chief Risk Officer, Kate Hughes, described how the service will work, with a standard enquiry through the web portal being free and more complex queries attracting a fee of $25 or more.

The program is a response to the Australian Parliament’s controversial intention to introduce a mandatory data retention regime which will force telcos and ISPs to retain a record of customers’ connection information.

“We believe that if the police can ask for information relating to you, you should be able to as well,” Hughes wrote.

At present the scheme is quite labour intensive; a request for information involves a great deal of manual processing under the company’s current systems. However, Hughes is optimistic Telstra will be able to deal with the workload.

“We haven’t yet built the system that will enable us to quickly get that data,” Hughes told this website in an interview after the announcement. “If you came to us today and asked for that dataset it wouldn’t be a simple request.”

The metadata opportunity

In some respects the metadata proposal is an opportunity for the company to comply with the Australian Privacy Principles introduced last year, under which companies are obliged to disclose to their customers any personally identifiable information they hold.

For large organisations like Telstra this presents a problem as it’s difficult to know exactly what information every arm of the business has been collecting. Putting the data into a centralised web portal makes it easier to manage the requirements of various acts.

That Telstra is struggling with this task illustrates the problems the data retention proposals present to smaller companies with far fewer resources to gather, store and manage the information.

Unclear requirements

Another problem facing Hughes, Telstra and the entire Australian communications industry is that no one is quite clear exactly what data will be required under the act. As drafted, the legislation lets the minister declare what information should be retained, while the industry believes this should be hard-coded into the act, making it harder for governments to expand their powers.

What is clear is that regardless of what’s passed into law, technology is going to stay ahead of the legislators. “I do think though this will be very much a ‘point in time’ debate,” Hughes said. “Metadata will evolve more quickly than this legislation can probably keep pace with, so I think we will find ourselves back here in two years.”

In many ways Australia’s metadata proposals illustrate the problems facing governments and businesses in managing data in an era when it’s growing exponentially; it may well turn out for telcos, consumers and government agencies that, ultimately, less is more.


Reducing big data risks by collecting less

Just because you can collect data doesn’t mean you should

“To my knowledge we have had no data breaches,” stated Tim Morris at the Tech Leaders conference in the Blue Mountains west of Sydney on Sunday.

Morris, the Australian Federal Police force’s Assistant Commissioner for High Tech Crime Operations, was explaining the controversial data retention bill currently before the nation’s Parliament, which will require telecommunications companies to keep customers’ connection details – considered to be ‘metadata’ – for two years.
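Exactly what must be kept was still being debated, but the kind of connection record typically discussed looks something like the sketch below; the fields are purely illustrative, not the legislated data set:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class ConnectionRecord:
    """
    Illustrative connection 'metadata' record: who contacted whom, when
    and for how long. Note what is absent -- the communication's content.
    """
    subscriber_id: str        # the account holder, not the message
    source_address: str       # originating phone number or IP address
    destination_address: str  # called number or destination IP address
    started_at: datetime
    duration_seconds: int
    service_type: str         # e.g. "voice", "sms", "internet"

record = ConnectionRecord(
    subscriber_id="ACC-0042",
    source_address="+61-4xx-xxx-xxx",
    destination_address="+61-2xx-xxx-xxx",
    started_at=datetime(2015, 3, 8, 21, 14),
    duration_seconds=310,
    service_type="voice",
)
print(record.service_type, record.duration_seconds)
```

Even these skeletal fields show why the scheme worries privacy advocates: two years of such records trace a person’s associations in detail without ever touching the content of a call.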

The bill is fiercely opposed by Australia’s tech community, including this writer, as it’s an expensive and unnecessary invasion of privacy that will do little to protect the community while exposing ordinary citizens to a wide range of risks.

One of those risks is that of the data stores being hacked, a threat that Morris downplayed with some qualifications.

As we’re seeing in the Snowden revelations, there are few organisations that are secure against determined criminals, and the Australian Federal Police are no exception.

For all organisations, not just government agencies, the question about data should be ‘do we need this?’

In a time of ‘Big Data’, when it’s possible to collect and store massive amounts of information, it’s tempting to become a data hoarder, which exposes managers to various risks, not least that of the data being stolen by hackers. It may well be that reducing those risks simply means collecting less data.

Certainly in Australia, the data retention act will only create more headaches and risks while doing little to help public safety agencies do their job. Just because you can collect data doesn’t mean you should.
