BI like Google

I was one of the speakers at the International Data Warehousing and Business Intelligence Summit 2011, New and Emerging Technologies in Rome last week. A host of excellent speakers, including Colin White, Claudia Imhoff, Cindy Howson, James Taylor, not to mention my modest self, covered a wide range of topics that are shaping the future outlook for BI. The event has been running for many years now and this year an additi

The main theme of this blog, however, was sparked by a very interesting conversation with Neutrino Concepts, a small English startup that was exhibiting at the event. Founded in 2007, the company aims to deliver, in its own words, “next generation business intelligence software based on technology designed to enhance decision-making across all levels; allowing organisations to become more agile and efficient, to gain faster and exponential return on investment.” The bottom line is encapsulated in a single concept: “Google-like”. Wayne Eckerson sums it up: “Neutrino Concept’s interface enables users to search existing data warehouses using words and phrases – similar to Google – instead of submitting complicated queries.”

The concept is hardly new. Natural-language search and artificial intelligence have been around for decades. More recently, Google’s legendary, minimalistic interface has been held up for years to BI tool vendors and developers as the ultimate goal in usability. Just type in a few keywords or phrases and the system will auto-magically find the results you’re looking for. The reality behind the marketing hype is somewhat different. BI, its users and its context differ dramatically from those of Google. Google deals mainly with documents – soft information; BI deals with highly structured hard information. Google depends on the “wisdom of crowds” (very large crowds indeed) and on statistical analysis of their behaviors to determine what is likely to be important in a particular search; BI lacks both the crowds of users and the extensive cross-referencing (hyperlinks) needed to make similar inferences. So, how can BI become Google-like?

Until now, much of the thinking in this area has come from vendors in the content space, who seek to extend their inverted indexing approach from documents to relational databases. This approach is essentially post-cognitive – meaning and relationships are discovered after the information has been created and ingested (as content vendors say) into the system. This, of course, is the only viable approach where documents can contain any information, in any context and in any structure. That condition does not hold for traditional business intelligence information, where structure and meaning are defined in advance, essentially a pre-cognitive approach. Pat Foody, Technical Director of Neutrino Concepts, clearly comes at the problem from this latter approach, based on extensive experience in, dare I say it, traditional hard data.
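
For readers unfamiliar with inverted indexing, here is a minimal sketch in Python of what extending it from documents to relational rows might look like. It is purely illustrative: the function name, the sample rows and the tokenization (lowercase, split on whitespace) are my own assumptions and do not describe any vendor's product. Note that the index knows nothing about what a column means; it only records where each word was seen, which is the post-cognitive approach in miniature.

```python
from collections import defaultdict


def build_inverted_index(rows):
    """Map each lowercase token to the set of (row_id, column) cells containing it."""
    index = defaultdict(set)
    for row_id, row in enumerate(rows):
        for column, value in row.items():
            # Naive tokenization: lowercase the cell value and split on whitespace.
            for token in str(value).lower().split():
                index[token].add((row_id, column))
    return index


# Hypothetical sample data standing in for a relational table.
rows = [
    {"customer": "Acme Ltd", "region": "North", "revenue": 1200},
    {"customer": "Borealis", "region": "North East", "revenue": 800},
]

index = build_inverted_index(rows)

# Look up a keyword: every cell in which the word "north" appears.
hits = sorted(index["north"])
print(hits)
```

A search for "north" finds both rows, but the index cannot say that "region" is a sales territory or how it relates to revenue. That knowledge has to come from somewhere else.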

Neutrino Concepts’ product, NIRA (the poetically named Neutrino Information Release Appliance!), demonstrates an impressive ability to interpret free-form user input and return highly relevant sets of data from the demonstration hard data set. Such hard data can come from the warehouse, but also from user-defined sources such as CSV files and spreadsheets. Soft data can be included in the search too, where required. An intuitive user interface allows easy filtering, querying and joining of results to produce the required answers. The success of this approach depends entirely on the quantity, breadth and quality of the metadata describing the data sources. Where such sources are part of the enterprise data warehouse environment, it may be expected that such metadata exists. Unfortunately, such metadata is often limited. For user-defined sources, such metadata is likely to be missing. NIRA, in its present incarnation, takes the pessimistic view: assuming that such metadata is unavailable, it requires that it be entered during the setup phase. This approach is probably more than sufficient for small projects; for larger, enterprise-wide projects it is unlikely to suffice.
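
To make the dependence on metadata concrete, here is a minimal sketch in Python of a keyword search that leans on metadata defined in advance. It is emphatically not how NIRA works internally, which I have not seen; the `METADATA` dictionary, the table and column names and the `interpret` function are all hypothetical. The point is simply that a word can only be resolved to data if somebody has already described that data.

```python
# Hypothetical metadata, entered by hand at setup time:
# business term -> (table, column) it refers to.
METADATA = {
    "revenue": ("sales", "amount"),
    "sales": ("sales", "amount"),
    "region": ("sales", "region"),
    "customer": ("customers", "name"),
}


def interpret(query, metadata=METADATA):
    """Split a free-form query into known (table, column) targets and unknown words."""
    matched, unmatched = [], []
    for word in query.lower().split():
        target = metadata.get(word)
        if target is None:
            # No metadata describes this word, so the system cannot interpret it.
            unmatched.append(word)
        elif target not in matched:
            matched.append(target)
    return matched, unmatched


matched, unmatched = interpret("revenue by region")
print(matched)    # columns the query could be resolved to
print(unmatched)  # words with no metadata behind them
```

Every term missing from the dictionary ends up in the unmatched list, which is why hand-entered metadata may serve a small project well but becomes a burden as the number of sources and business terms grows.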

NIRA shows how BI can begin to approach “Google Nirvana” within the context of largely well-defined and well-documented sets of traditional business data. One needs to recognize, however, that there is more work to be done to enable automatic extraction of metadata, both business and technological, from existing metadata stores and similar repositories.