Big Data and Analytic Trends 2015

Happy New Year! Here’s a list of what I find are the most interesting trends in analytics in 2015.

Why the italics? Because most of what will happen this year can be summarized with a single word: more. Yes, there will be more data, more mobile analytics, more cloud analytics, more data discovery, more visualization, etc., but these are the trends that I personally will be paying closer attention to over the year:

More Magic

Arthur C. Clarke famously said that “Any sufficiently advanced technology is indistinguishable from magic.” The analytics industry has recently seen big advances in technology, but it hasn’t yet turned into magic: tools and interfaces that “just work.”

Today, people are required to shepherd every step of the analytics process, determining what data is available, how it should be joined, how it should be stored, and how it should be analyzed and visualized.

But the new power of advanced analytics and machine learning is now being applied to the process of analytics itself—so that more of the process can be automated.

We should be able to point our tools at the data, and let the algorithms figure out how it should be joined and cleansed, propose complementary data, and optimize how it should be stored (e.g. between cost-effective “cold” storage and operations-optimized “hot” storage). We should be able to let our tools identify outliers, find statistically valid correlations, and propose the right types of visualization.
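To make "let our tools identify outliers" a little more concrete, here is a minimal sketch of one common approach, the modified z-score based on the median and the median absolute deviation. It is only an illustration: the function name, the threshold of 3.5, and the sample figures are assumptions of mine, not how any particular product does it.

```python
from statistics import median


def find_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold.

    The score is built from the median and the median absolute deviation
    (MAD), which are less distorted by extreme values than the mean and
    standard deviation would be.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread to measure against, so nothing is flagged
    # 0.6745 rescales the MAD so scores are comparable to ordinary z-scores
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical daily sales figures with one suspicious spike
daily_sales = [102, 98, 105, 97, 101, 99, 480, 103]
print(find_outliers(daily_sales))  # [480]
```

The point is less the statistics than the workflow: a check like this can run automatically on every column the tool is pointed at, without anyone having to ask for it.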

Today, companies like SAP offer Smart Data Access to connect data seamlessly between Hadoop/Spark and new in-memory analytics systems. And the SAP Lumira data discovery tool uses advanced statistics to automatically generate Related Visualizations based on the data sets being viewed. 2015 will see more advanced automation based on these capabilities.

Datafication

Datafication is what happens when technology reveals previously invisible processes—which can then be tracked and optimized. This isn’t a new trend, but it’s gathering speed as real-time operational analytics systems become available and the price of gathering data continues to plummet.

Connected devices are the highlight of this year’s CES conference. Beyond the dozens of fitness tracking devices already available, there are now chips that stop you slouching, and sensor-enabled soccer balls, basketballs, and tennis rackets to help you improve your game. Sensor tags can even help you find your keys.

The key insight is that even simple data can lead to big insights. For example, sensor-equipped carpets promise to help seniors stay independent longer—not because the sensors themselves are complex, but because powerful pattern-detection algorithms can learn a resident’s normal gait and sound an alert if it starts to deteriorate. And who would have thought that fitness devices could locate the epicenter of an earthquake?!
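As a rough illustration of how simple data plus a baseline can produce an alert, here is a minimal sketch. It assumes the carpet sensors have already been reduced to one walking-speed reading per day; the function name, parameters, and sample readings are all hypothetical, and a real system would use far richer pattern detection than this.

```python
from statistics import mean, stdev


def gait_alerts(speeds, baseline_days=14, window=7, k=2.0):
    """Return the day indices where recent walking speed looks degraded.

    The first `baseline_days` readings define what is normal for this
    resident. After that, a day is flagged when the average of the last
    `window` readings falls more than `k` baseline standard deviations
    below the baseline mean.
    """
    baseline = speeds[:baseline_days]
    base_mean, base_sd = mean(baseline), stdev(baseline)
    alerts = []
    # Only look at windows that lie entirely after the baseline period
    for day in range(baseline_days + window, len(speeds) + 1):
        recent = mean(speeds[day - window:day])
        if recent < base_mean - k * base_sd:
            alerts.append(day - 1)  # index of the last day in the window
    return alerts


# Hypothetical readings in metres per second: steady, then a clear drop
readings = [1.0, 1.1, 0.9, 1.0, 1.0, 1.0, 0.6, 0.6]
print(gait_alerts(readings, baseline_days=4, window=2))  # [6, 7]
```

Because the baseline is learned per resident, the same few lines work for a brisk walker and a slow one, which is why the sensors themselves can stay simple.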

And of course all this applies to commercial uses. Shoppers can be tracked with beacons, and inventory can be tracked via drones. You can spot process bottlenecks, optimize beer sales, and track real-time purchases.

Here’s an instant business opportunity for 2015: find a process that is poorly tracked. Install simple sensors along the process and feed the collected real-time data to the cloud. Then use sophisticated analytics to feed actionable insights back to business people using mobile interfaces. For bonus points, add complementary third-party data sets, offer industry benchmarking, and encourage community best-practice sharing.

Multipolar Analytics

The layer-cake best-practice model of analytics (operational systems and external data feeding data marts and a data warehouse, with BI tools as the cherry on the top) is rapidly becoming obsolete.

It’s being replaced by a new, multi-polar model where data is collected and analyzed in multiple places, according to the type of data and analysis required:

  • New HTAP systems (traditional operational data and real-time analytics)
  • Traditional data warehouses (finance, budgets, corporate KPIs, etc.)
  • Hadoop/Spark (sensor and polystructured data, long-term storage and analysis)
  • Standalone BI systems (personal and departmental analytics, including spreadsheets)

There are clear overlaps between these systems, and they will converge over time, but each is a powerful hub that is not going to be replaced by the others any time soon.

In 2015 we will see the development of more best-practice guidance for how to get the most out of this pragmatic—but complex—collection of analysis hubs. This will involve both regular data feeds between poles and federated analysis to provide a connected view across the enterprise (including, hopefully, some more “magic”—see point 1).

Questions that enterprise architects will have to answer for different uses include:

  • Where will this data arrive first?
  • Will it need to be moved to another pole as part of an analysis? When and why?
  • Where and when will the data be modeled, and by whom?
  • What are the different levels of access that will be given to different users, with what governance?

Fluid Analysis

Analytic infrastructures have been too brittle. With the right setup, they have provided powerful, flexible analytics—but implementing systems takes too long and it has been a challenge to keep up with the changing needs of the organization.

The latest analytics technologies allow for fluid analytics that adapt more gracefully to changing needs, with better support for one-off analysis and the analytics lifecycle:

  • Rather than having to define a schema/business model upfront, Hadoop allows schema-on-read queries that combine the data as and when necessary. With the right skills, business users (or more likely data scientists) can ask any question that can be answered by the available data, making unplanned or one-off analyses faster and more cost-effective.
  • In-memory HTAP systems allow powerful analysis directly on detailed operational data, and the analytics schema is defined in metadata. This means it can be updated without having to physically create new tables. For example, an in-memory finance system allows you to quickly and easily view the consequences of a new regional structure on your accounts—without having to move any data.
  • Governed data discovery systems make it easier to manage the typical lifecycle of new types of analysis, for example by allowing successful personal or departmental analytics to be identified and industrialized for enterprise-wide use.
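The schema-on-read idea in the first bullet can be sketched in a few lines. This is plain Python rather than anything Hadoop-specific, and the record layout, field names, and helper function are assumptions made up for the illustration: raw records are stored exactly as they arrive, and each question brings its own schema at query time.

```python
import json

# Hypothetical raw events, kept as received: fields vary from record to
# record and numbers may arrive as strings.
RAW_EVENTS = [
    '{"device": "pump-1", "temp_c": "71.5", "ts": "2015-01-05T10:00:00"}',
    '{"device": "pump-2", "temp_c": "64.0", "site": "Hamburg"}',
    '{"device": "pump-1", "vibration": 0.3}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema while reading instead of when the data is stored.

    `schema` maps each wanted field name to a converter function. Records
    that lack any of the wanted fields are skipped; extra fields are ignored.
    """
    for line in raw_lines:
        record = json.loads(line)
        if all(field in record for field in schema):
            yield {name: convert(record[name]) for name, convert in schema.items()}


# One question, one schema: device temperatures as floats
temperature_view = list(read_with_schema(RAW_EVENTS, {"device": str, "temp_c": float}))
print(temperature_view)
```

A different one-off question, say about vibration, would simply pass a different schema over the same stored records, with no tables to redesign first.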

Community

Analytics is no longer under the control of well-meaning central IT dictatorships. As decisions about IT spending increasingly move to business units, analytics projects have to have the consent of the governed—and this means big changes to every aspect of analytics organization.

2015 will see the further development of the community governance of analytics. Analytics leads will have to develop the skills they need to build and nurture internal social networks that will set priorities and put pressure on maverick BI projects. To do this, they have to behave more like politicians, paying closer attention to the needs of their electorate and cajoling everybody to play their part for the good of the community as a whole.

Analytic Ecosystems

In a logical extension of datafication within an organization, 2015 will see more analytics across business networks, helping optimize processes between the participants of an ecosystem. Some examples include:

  • The Smart Port Logistics platform created by the Hamburg Port Authority. It is designed to connect all of the participants of the port, including the shipping companies, trucking companies, customs officials, and even the truck car parks and retail outlets. By collecting, analyzing, and feeding back information in real time, the Port Authority helps all the participants become more efficient.
  • The cooperation of Volkswagen, Shell, and SAP on a connected car ecosystem.
  • The largest business network, Ariba, is offering sophisticated predictive analytics to give insights across connected processes including early warnings of potential supply chain disruption.

Data Privacy

Data privacy laws and processes now lag far behind the power of available technology. Serious abuses have already come to light and there are probably many others that haven’t yet been revealed.

2015 will see some welcome advances in the default use of encryption, but more sweeping changes are required to control how people combine and access personal data sets. Ultimately this is a problem that can only be fixed by society, laws, and cultural changes, and unfortunately, those changes will probably only come about after much pain and suffering.

A useful analogy might be the use of asbestos in construction. Because it had many useful qualities, including affordability, sound absorption, and resistance to fire, it was widely used around the world, despite concerns over toxicity. The asbestos industry and governments played down the dangers until the deadly consequences could no longer be denied. Government intervention came only after many people had suffered, and the US still lags behind other developed countries that have banned its use. The new controls mean that changes to existing buildings can be very expensive.

As you’re building your big data solutions, make sure you do it with proper data controls in place, and don’t abuse people’s expectations of how their data will be used, whether or not you have a legal right to do so today. Making the right choices today will help you avoid social risk and/or expensive changes in the future.

Conclusion

In their recent book, The Second Machine Age, authors Erik Brynjolfsson and Andrew McAfee argue that we are now in the “second half of the chessboard” when it comes to computer technology. The exponential trend means the increases in data processing power this year will be the equivalent of decades of progress in the past.

Nowhere is this clearer than in the area of analytics, where the biggest problem is increasingly that organizations just don’t know which of the myriad business opportunities to implement first.

2015 will be a wonderful year for analytics, just like it has been for the last quarter-century—as long as we remember that great power brings great responsibility, and that we must also strive to adapt our information culture and processes.