Big Data has been the talk of the town. The amount of data worldwide is increasing at a rate our human brains struggle to comprehend. Every day, we generate 2.5 quintillion bytes of data (2,500,000,000,000,000,000), and 90% of the world’s data was generated within the last 2 years.
The big question is whether we are sufficiently adept at using the data at hand. The truth in 2015 is that enterprises spend huge sums on structuring and storing data but on average leverage only 20% of that data for analysis and development. How’s it going at your business?
INNOVATION REQUIRES AGILITY
7N consultant and software architect Martin Sponholtz sees considerable potential for development in Danish enterprises when it comes to analyzing data and creating innovation through data science:
“Leveraging data science methodology, we assist Danish enterprises in analyzing Big Data in smarter and more innovative ways. In many sectors, we see new companies outmaneuvering large and well-established competitors by using Big Data.”
Sponholtz is not alone in his excitement over the new analytical capabilities offered by Big Data. Gartner Group declared 2015 to be the year for Big Data’s definitive breakthrough. Here’s how Sponholtz explains that Big Data isn’t just old wine in new bottles:
“During the last 10 years, organizations have collected ever larger amounts of data. That trend won’t slow in the coming years. It makes sense for organizations to focus sharply on business intelligence and get their data warehouses in order. Reducing error rates and producing accurate reports requires robust systems, but the systems used in traditional data warehousing are not agile enough for innovation and development. Enterprises need to learn how to run their IT departments on two tracks if they want to advance. Business intelligence and data warehousing help you find the answers to questions you already have. Big Data helps you find answers to questions you didn’t know to ask.”
GET IN THE SAND BOX
When Sponholtz and his colleagues arrive for a new assignment, they build a “data sand box” and start searching for a specific data analysis case that will generate value for the client. The sand box is named “Hadoop” – recognizable by its logo featuring a small yellow elephant – and is a must for the future of Big Data. Hadoop is an open-source platform for highly flexible manipulation of large amounts of data.
“We pitch all data into a sand box. That includes the organization’s existing data (including “dark data” not currently used) as well as relevant public data available for purchase. We want to get as much data of various kinds as possible into the sand box in order to take advantage of the fact that Hadoop is able to quickly handle enormous amounts of unstructured data by using a cluster of servers. Scaling is just a matter of adding more servers.”
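To make the idea of spreading work across a cluster concrete, here is a minimal sketch in plain Python. It does not use Hadoop itself; it only imitates the map-and-reduce pattern Hadoop is known for. The function names (partition, map_phase, reduce_phase, word_count) and the sample log lines are hypothetical, and each "server" is simulated by an ordinary list.

```python
from collections import Counter
from functools import reduce


def partition(records, num_workers):
    """Deal records round-robin into one bucket per simulated server."""
    buckets = [[] for _ in range(num_workers)]
    for i, record in enumerate(records):
        buckets[i % num_workers].append(record)
    return buckets


def map_phase(bucket):
    """Each simulated server counts the words in its own bucket of raw text."""
    counts = Counter()
    for line in bucket:
        counts.update(line.lower().split())
    return counts


def reduce_phase(partial_counts):
    """Merge the per-server counts into one overall result."""
    return reduce(lambda a, b: a + b, partial_counts, Counter())


def word_count(records, num_workers):
    """Run the map step on every bucket, then reduce the partial results."""
    return reduce_phase(map_phase(b) for b in partition(records, num_workers))


if __name__ == "__main__":
    # Hypothetical unstructured input: free-text log lines.
    lines = [
        "payment accepted card",
        "payment declined card",
        "login accepted",
    ]
    # Changing num_workers changes how the work is split, not the answer.
    print(word_count(lines, num_workers=1) == word_count(lines, num_workers=3))
```

The point of the design is that the answer does not depend on how many buckets the data is dealt into, which is what lets a real cluster grow by adding servers.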
With data science methodology, Sponholtz has found a way to make Big Data accessible for his clients. He sets up and tests a large number of hypotheses.
“Data science gives us a scientific approach to using Hadoop, based on testing a lot of hypotheses very quickly. The method is based on numerous tests and iterations. It’s very agile, and it moves very fast. Suddenly we see a trend, and then we subject it to repeated tests. If it’s promising, we create a specific project and then a model that runs at its own pace. That’s how we get the opportunity to test a lot of ideas we hadn’t even stumbled on yet.”
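The article does not say which tests Sponholtz runs, so the following is only an illustration of what "testing a lot of hypotheses very quickly" could look like. It uses a permutation test, a swapped-in technique chosen for simplicity, to check whether two groups of numbers differ by more than chance would explain. The function names, the hypothesis names, and the figures are all hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Estimate how often a random regrouping shows a gap in means
    at least as large as the one actually observed."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # The +1 terms keep the estimate away from an impossible p-value of zero.
    return (hits + 1) / (n_permutations + 1)


def screen_hypotheses(hypotheses, alpha=0.05):
    """Return the names of hypotheses whose p-value falls below alpha."""
    return [
        name
        for name, (group_a, group_b) in hypotheses.items()
        if permutation_p_value(group_a, group_b) < alpha
    ]


if __name__ == "__main__":
    # Hypothetical data: a metric measured for two customer segments.
    candidates = {
        "segment_spends_more": ([10, 11, 12, 13, 14, 12, 11, 13],
                                [1, 2, 3, 2, 1, 3, 2, 2]),
        "no_real_difference": ([1, 2, 3, 4], [1, 2, 3, 4]),
    }
    print(screen_hypotheses(candidates))
```

A trend that survives this kind of quick screen would still need the repeated tests described in the quote, since checking many hypotheses at once makes chance findings more likely.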
FUTURE DATA PRODUCTS
Playing with Hadoop in the sand box serves many purposes. One purpose might be adding entirely new dimensions to an organization’s internal and external information supply – a competitive parameter gaining strength at the moment. In Denmark, Vestas is one of the companies that have used Big Data successfully for 10 years to determine the optimal placement of wind turbines. Another purpose might be testing whether existing data can be used to create new data products that expand the business.
“We believe many sectors can use Big Data. One we’re keeping an eye on is the financial sector. Due to regulations, financial institutions sit on a huge amount of data. We’re seeing new initiatives in data analysis addressing credit assessments in real time. Internal enterprise data is supplemented by publicly available data and social media data. We also see applications in fraud detection, where we can spot, say, credit card fraud faster and enable a swift response.”
THE RIDE IS GETTING WILDER
The future will produce even greater volumes of data. Just think of the growing number of devices feeding data to the internet. It’s getting ever more hectic – as illustrated by Apple’s launch of the Apple Watch. The graphic below shows, out to 2020, the explosion in the number of devices hooked up to the internet: no fewer than 50.1 billion devices are expected to be connected. Sponholtz concludes:
“We must be alert to three parameters with respect to the future and Big Data: volume, variety, and velocity – the three Vs. Hadoop helps enterprises handle the challenge of the three Vs. I concentrate on simple structures, speedy results, and data product innovation. It’s no longer all about collecting data into sophisticated structures – now, it’s about analysis and creation of new value. We must emerge from the cumbersome IT department and let ideas take flight.”