Not Only SQL, Not Only Big Data

At Teradata Universe 2012 in Dublin, an impressive line-up of speakers, from Tim Berners-Lee to customers doing real data warehouse implementations, got me thinking beyond the normal boundaries of our assumptions about the real role and value of data – both traditional and big.  A few observations follow, but first…

As an ex-pat Irishman, I have to say that the new Convention Centre Dublin is a wonderful venue for events with up to a couple of thousand attendees.  The main auditorium is a superb space and there’s lots of room for expo and breakouts.  And the facilities and staff are first rate.  Well done!  My only regret is that the area around the Centre, especially towards the Port, remains blighted by vacant sites and unfinished blocks – the legacy of Ireland’s boom and bust – but not much can be done about that for now.

Much of the main tent focus at this year’s event was on the future of information, with big data featuring… well… large in the presentations of speakers such as Erik Brynjolfsson, Professor and Director of the MIT Center for Digital Business, and Sir Tim Berners-Lee, inventor of the World Wide Web.  Michio Kaku, Professor of Theoretical Physics at City College of New York, also addressed the theme of the central role of data in every aspect of our future.  The tone of these presentations is best described as expansive and optimistic – given better and more data and technology, the future of business and humankind in general is rosy.  This is an outcome that I, personally, believe to be of somewhat low probability.

While I am a long-time supporter of the need for and value of good and extensive information in business, my experience of the purposes for which such information is used, and of the extent to which decision making benefits, leaves me less sanguine.  In general business, business intelligence is used almost exclusively in support of a narrowly-focused drive for bottom-line profit.  At the risk of being labeled a Communist, I remain unconvinced that this is always a good thing.

This niggling doubt is best expressed through an example – the use of data warehousing in retail, something that has been going on for over 25 years.  BI can be very effective in optimizing the supply chain from manufacturer all the way to customer, supporting the intent of the business to reduce cost.  When that focus is pursued as a sole strategy, it can have highly undesirable effects by driving local suppliers out of business, reducing a community’s disposable income and creating an unbreakable downward economic spiral.  As a BI community we can say that BI is not responsible, and on the level of cause and effect, that’s true.  But, at a deeper level, we cannot ignore the side effects of the tools and techniques we invent and promote, any more than cigarette manufacturers can avoid responsibility for the impact of passive smoking.

Getting back to big data, the problem is that as we focus on, and get excited about, a technique such as statistical analysis of social behavior to predict marketing trends for a brand, for example, we simultaneously narrow our focus and lose sight of potentially interesting or important information that is external to that data.  Big data encourages us to somewhat obsessively analyze in ever greater depth the minutiae of life.  Why?  Often to drive profit for some business.  The optimistic view I mentioned earlier imagines that we will use this data to solve medical issues, world hunger, climate change, and more.  I don’t have data to confirm this, but I guess that the ratio of profit-driven to altruistic big data analytics is greater than 10 to 1.  And which of these two categories has the higher impact on the medium- and long-term survival of humanity?  The last speaker, Deb Roy, CEO of Bluefin Labs, showed us just how much analysis can be done to link social network activity to TV shows and advertising.  All to decide where to spend millions of advertising dollars.  There are two ways of looking at this: (1) everybody needs to do this type of processing in order to compete, or (2) we need to examine our underlying model of doing business that drives such net-non-productive activity.  I would invite you to share your views on this.

At a more mundane and practical level, speakers from current Teradata customers focused on a very different area – creating consistent and integrated enterprise data warehouses for very traditional transactional business data.  Unsurprisingly, the majority of enterprises are still struggling with the old issues that have driven data warehouse development for the past 30 years.  I have no doubt that this will continue for most businesses for many more years.

But, while this continues, we need to start thinking about the more philosophical issues that the conference brought up for me.