Big data will completely transform the way companies, governments and individuals use information to inform decision-making and actions. There is huge hype around the term ‘big data’, and in the inaugural post of my new ‘The Big Data Guru Column’ I went back to basics, outlining what big data really is and why it will change the world. In that first post I said that we are seeing an exponential rise in the amount of data the world is generating and that we now have the ability to harness these vast amounts of data to gain new insights for business, government, science and research. In fact, every part of life and society will see an impact of big data.
So, why is big data not about big or data?
The point I want to make in my second post is that the ‘big data revolution’ is not about the amount of data or about amassing as much data as possible. We have always (or at least for a very long time) had large and complex sets of data. Business, science, research and governments have stored and analyzed very large sets of data for many decades. At the same time advanced analytics are not new: Companies like Wal-Mart, agencies like the CIA, and universities and research labs have analyzed extremely large datasets for many years.
What makes it different today is that we can now do so much more with this data. It is not the amount of data that is making a difference but the ability to analyze vast and complex data sets beyond the limitations of our traditional database technology. It allows us to turn big data into useful insights and actions. This is why I prefer to talk about ‘big data analytics’ rather than ‘big data’. But anyway, let’s look at the things that enable this new ability to analyze big data.
The enablers of big data analytics
A number of factors increasingly enable us to analyze big data. The three key ones are:
- Cheap and distributed storage of data – where data is physically stored at different locations (including cloud storage) at low costs and connected via networks.
- Increased network speed – where data can now be transferred and analyzed across networks at very high speeds.
- New techniques and software to manage and analyze large data across distributed systems – we have seen the emergence of innovations such as Hadoop, MapReduce and Bigtable that are transforming our ability to analyze large and complex data sets without the need for expensive supercomputers. Instead, a cluster of much cheaper commodity servers can be used to run the analysis.
Without going into too much technical detail, these technologies allow us to analyze large volumes of both:
- structured data (the data that we can neatly put into rows and columns of our databases such as orders, financial transactions, stock data, etc.), and
- unstructured data (the data we can’t easily store and index in traditional databases such as email content, social media posts, video content, photos, voice recordings, sounds, etc.)
What’s more, the analysis can now be performed without the need to purchase or build large proprietary systems or supercomputers. It therefore means that any business or government body, large or small, (or indeed anyone) now has the ability to use large and complex data to better inform their decision-making. Many are starting to use big data analytics to complement their traditional data analysis in order to get richer and improved insights and smarter decisions.
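For readers who do want a little technical detail, the MapReduce idea mentioned above can be sketched in a few lines of Python. This is only a toy illustration running on a single machine, not how Hadoop itself is implemented: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample text are my own assumptions. In a real cluster, each chunk would be processed on a different commodity server and the framework would handle the grouping step.

```python
from collections import defaultdict
from itertools import chain


def map_phase(chunk):
    """Map step: turn one chunk of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group all values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: combine the values for each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(chunks):
    """Run map, shuffle and reduce over a list of text chunks."""
    mapped = chain.from_iterable(map_phase(chunk) for chunk in chunks)
    return reduce_phase(shuffle(mapped))


# Hypothetical input: each string stands in for data held on a separate server.
chunks = ["big data is not about big", "data is about analytics"]
counts = word_count(chunks)
print(counts)
```

The point of the pattern is that the map step needs no knowledge of the other chunks, which is what lets the work be spread across many cheap machines.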
So, how does big data analytics help us become smarter?
We can now analyze large volumes of data from different data sources to gain insights that were never possible before. Let’s look at some practical examples of how big data analytics is helping to make our world smarter.
There are so many examples of how businesses are using big data to become smarter. Take Wal-Mart, which can now take data from past buying patterns, its internal stock information, mobile phone location data, social media and external weather information and analyze all of this in seconds, so it can send someone a voucher for a BBQ cleaner to their smartphone – but only if that person owns a barbecue, the weather is nice and he or she is currently within a 3-mile radius of a Wal-Mart store that has the BBQ cleaner in stock.
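The decision at the end of that example boils down to combining four conditions. The sketch below shows one way such a rule could look in Python. It is purely illustrative and not Wal-Mart's actual system: the Customer and Store classes, the should_send_voucher function and the coordinates are all hypothetical, and the distance is estimated with the standard haversine formula.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


@dataclass
class Customer:  # hypothetical record built from purchase and location data
    owns_barbecue: bool
    lat: float
    lon: float


@dataclass
class Store:  # hypothetical record built from stock data
    lat: float
    lon: float
    cleaner_in_stock: bool


def should_send_voucher(customer, store, weather_is_nice, max_miles=3.0):
    """Send the voucher only if all four conditions from the example hold."""
    nearby = distance_miles(customer.lat, customer.lon, store.lat, store.lon) <= max_miles
    return customer.owns_barbecue and weather_is_nice and store.cleaner_in_stock and nearby


# Made-up coordinates, roughly 0.7 miles apart.
store = Store(lat=36.0, lon=-94.0, cleaner_in_stock=True)
shopper = Customer(owns_barbecue=True, lat=36.01, lon=-94.0)
print(should_send_voucher(shopper, store, weather_is_nice=True))
```

The rule itself is simple. The hard part, and the part big data analytics makes possible, is having all four inputs available and up to date within seconds.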
A client of mine, a leading telecom company, has developed big data analytics models to predict customer satisfaction and potential customer churn. Based on phone and text patterns as well as social media analytics, the company was able to classify customers into different categories. The analytics showed that people in one specific customer category were much more likely to cancel their contract and move to a competitor. This is extremely useful information that now helps the telecom company closely monitor the satisfaction levels of these clients and prioritize preventative actions.
Here is another example from the world of sport where big data analytics is increasingly used to improve the performance of athletes. The Olympic cycling team in the UK uses bikes that are fitted with sensors on their pedals that collect data on how much acceleration every push on the pedal generates. This allows the team to analyze the performance of every cyclist in every race and every single training session. In addition, the team has started to integrate data from wearable devices (like smart watches) athletes wear on their wrist. These devices collect data on calorie intake, sleep quality, air quality, exercise levels, etc. The latest innovation now is to integrate analysis of social media posts to better understand the emotional states of athletes and how this might impact track performance.
Big data analytics is currently transforming healthcare. One example is a hospital unit that looks after premature and sick babies. It is now applying real-time analytics based on a recording of every breath and every heartbeat of all babies in the unit. It then analyzes the data to identify patterns. Based on the analysis, the system can now predict infections 24 hours before the baby shows any visible symptoms. This allows early intervention and treatment, which is so vital in fragile babies.
Love is an important element of human happiness and I guess we all want to find our soul mate. But how do we find the right one? Even here big data analytics can help. Take dating site eHarmony. Its founder studied thousands of married couples and, based on the findings, created a predictive analytics model that takes into account twenty-nine different variables relating to different personality traits, behaviors and social skills. Each person who signs up for the site has to complete a comprehensive profile questionnaire, which then provides the data for the analytics model to find a match. This way eHarmony is able to match you with someone who might not fall into your usual dating pattern but where the data suggests a good match. Other match-making sites use different analytical models. Take Perfectmatch.com as another example: its analytics model looks for ‘complementary’ personality traits. Many of the online dating sites are now looking at integrating data from social media networks into their models.
Many cities are now using big data analytics to, among many other things, analyze and predict congestion levels on their road and transport networks. For example, camera data, traffic updates, weather information, train and bus location data, as well as Twitter messages and Facebook updates are all analyzed to get a realistic real-time understanding of traffic levels. This can then be used to re-route buses and trains, increase or decrease the frequency of public transport on specific routes, re-route and optimize traffic flows, etc.
There are so many other examples I could list. Basically, any part of our lives will soon be affected by the analysis of big data.
Summary and next post
The buzz and hype around big data is not about the large volumes of data per se (even though the increasing datafication of the world is helping) but our ability to analyze this data and make it useful.
Finally, no discussion about analyzing big data, especially our private data such as social media posts, credit card records, email and phone conversations, can be complete without mentioning the increasing concerns about privacy. The privacy debate around big data analytics gathered momentum with the revelations by Edward Snowden on how the U.S. National Security Agency (NSA) collects and analyzes big data, including the phone records and social media activities of millions of Americans. In my next post I will address the privacy issues around big data. In the meantime, please follow me to make sure that you receive the future posts in my Big Data Guru column, and feel free to also connect via Twitter, Facebook and The Advanced Performance Institute.