Cookies help us display personalized product recommendations and ensure you have great shopping experience.

By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
SmartData CollectiveSmartData Collective
  • Analytics
    AnalyticsShow More
    warehouse accidents
    Data Analytics and the Future of Warehouse Safety
    10 Min Read
    stock investing and data analytics
    How Data Analytics Supports Smarter Stock Trading Strategies
    4 Min Read
    predictive analytics risk management
    How Predictive Analytics Is Redefining Risk Management Across Industries
    7 Min Read
    data analytics and gold trading
    Data Analytics and the New Era of Gold Trading
    9 Min Read
    composable analytics
    How Composable Analytics Unlocks Modular Agility for Data Teams
    9 Min Read
  • Big Data
  • BI
  • Exclusive
  • IT
  • Marketing
  • Software
Search
© 2008-25 SmartData Collective. All Rights Reserved.
Reading: Tough Analytics? Watson to the Rescue
Share
Notification
Font ResizerAa
SmartData CollectiveSmartData Collective
Font ResizerAa
Search
  • About
  • Help
  • Privacy
Follow US
© 2008-23 SmartData Collective. All Rights Reserved.
SmartData Collective > Big Data > Data Quality > Tough Analytics? Watson to the Rescue
AnalyticsBusiness IntelligenceData QualityUnstructured Data

Tough Analytics? Watson to the Rescue

Barry Devlin
Barry Devlin
9 Min Read
SHARE

“Quickly Watson, get your service revolver!”  Is Watson about to put business intelligence out of its misery?  Is the good doctor about to surpass Sherlock Holmes in his ability to solve life’s enduring mysteries?  Or are we in jeopardy of falling into another artificial intelligence rabbit hole?

“Quickly Watson, get your service revolver!”  Is Watson about to put business intelligence out of its misery?  Is the good doctor about to surpass Sherlock Holmes in his ability to solve life’s enduring mysteries?  Or are we in jeopardy of falling into another artificial intelligence rabbit hole?

Yes, I know.  Although I haven’t found a reference to prove it, I’m pretty sure that IBM Watson, the computer that recently won “Jeopardy!” is named after one of the founding fathers of IBM–Thomas J. Watson Sr. or Jr.–rather than Sherlock Holmes’ sidekick.  But, the questions above remain highly relevant.

IBM Watson is, of course, an interesting beast.  The emphasis in the popular press has been on the physical technology specs–10 refrigerator-sized cabinets containing approximately 3,000 CPU cores, 15 TB of RAM and 500 GB of disk running at about 80 teraflops, and cooled by two industrial air-conditioning units.  But, in comparison to some of today’s “big data” implementations, IBM Watson is pretty insignificant.  eBay, for example, is running up to 20 petabytes of storage.  As of 2010, Facebook’s Hadoop cluster was running on 2300 servers with over 150,000 cores and 64 TB of memory between them.  The world’s current (Chinese) supercomputer champion is running at 2.5 petaflops.

More Read

Value Index Reveals Red Hot Business Intelligence Vendors for 2012
Ben Goertzel’s Report on AGI-09: The Second Conference on…
Decision Tables in JBoss Drools getting an update
A Tale Of Two Banks
IBM’s 2013 Vision Bodes Well for Finance

On the other hand, a perhaps more telling comparison is to the size and energy consumption of the human brain that Watson beat, but certainly did not outclass, in the quiz show!

However, what’s really more interesting from a business intelligence viewpoint is the information stored, the architecture employed and the effort expended in optimizing the processing and population of the information store.  

We know from the type of knowledge needed in Jeopardy! and, indeed, from the possible future applications of the technology discussed by IBM that the raw information input to the system was largely unstructured, or soft information, as I prefer to call it.  During the game, Watson was disconnected from the Internet, so its entire knowledge base was only 500 GB in size.  This suggests the use of some very effective artificial intelligence and learning techniques to condense a much larger natural language information base to a much more compact and usable structure prior to the game.  Over a period of more than four years, IBM researchers developed DeepQA, a massively parallel, probabilistic, evidence-based architecture that enables Watson to extract and structure meaning from standard textbooks, encyclopedias and other documents.  When we recall that the natural language used in such documents contains implicit meaning, is highly contextual, and often ambiguous or imprecise, we can begin to appreciate the scale of the achievement.  A wide variety of AI techniques, such as temporal reasoning, statistical paraphrasing, and geospatial reasoning, were used extensively in this process.

Dr. David Ferrucci, leader of the research project, states that no database of questions and answers was used nor was a formal model of the world created in the project.  However, he does say that structured data and knowledge bases were used as background knowledge for the required natural language processing.  It makes sense to me that such knowledge, previously gathered from human experts, would be needed to contextualize and disambiguate the much larger natural language sources as Watson pre-processed them.  Watson’s success in the game suggests to me that IBM have succeeded in using existing human expertise, probably gathered in previous AI tools, to seed a much larger automated knowledge mining process.  If so, we are on the cusp of an enormous leap in our ability to reliably extract meaning and context from soft information and to use it in ways long envisaged by proponents of artificial intelligence.

What this means for traditional business intelligence is a moot point.  Our focus and experience is directed mainly towards structured, or hard, data.  By definition, such data has already been processed to remove or minimize ambiguity in context or content by creating and maintaining a separate metadata store, as I’ve described elsewhere.  

However, there is no doubt that the major growth area for business intelligence over the coming years is soft information, which, according to IDC is growing at over 60% compound annual growth rate, about three times as fast as hard information, and which already accounts for over 95% of the information stored in enterprises.  It is in this area, I believe, that Watson will make an enormous impact as the technology, already based on the open-source Apache UIMA (Unstructured Information Management Architecture), moves from research to full-fledged production.  There already exists a significant pent-up demand to gain business advantage by mining and analyzing such information.  Progress in releasing the value tied up in soft information has been slowed by a lack of appropriate technology.  That is something that Watson and its successors will certainly change.

While I have focused so far on the knowledge/information aspects of Watson–that being probably the most relevant aspect for BI experts, there is one other key feature of the technology that should be emphasized.  That is Watson’s ability to parse and understand the sort of questions posed in everyday English with all their implicit assumptions and inherent context.  Despite appearances to the contrary in the game show, Watson was not responding to the spoken questions from the quiz master; the computer had no audio input, so the exact same questions were passed to it as text as were heard by the human contestants.  In fact, speech recognition technology has also advanced significantly to the stage where very high levels of accuracy can be achieved.  (As an aside, I use this technology myself extensively and successfully for all my writing…)  The opportunities that this affords in simplifying business users’ communication with computers are immense.

It seems likely that over the next few years this combination of technologies will empower business users to ask the sort of questions that they’ve always dreamed of, and perhaps haven’t even dreamed of yet.  They will gain access, albeit indirectly, to a store of information far in excess of what any human mind can hope to amass a lifetime.  And they will receive answers based directly on the sum total of all that information, seeded by the expertise of renowned authorities in their respective fields and analyzed by highly structured and logic-based methods.

Of course, there is the danger that if a given answer happens to be incorrect, it is difficult to see how the business user would discover that error or be able to figure out why it had been generated.

And that, as Sherlock Holmes never said is far from “Elementary, my dear Watson!”

Share This Article
Facebook Pinterest LinkedIn
Share

Follow us on Facebook

Latest News

Diverse Research Datasets
The 5 Best Platforms Offering the Most Diverse Research Datasets in 2026
Big Data Exclusive
macro intelligence and ai
How Permutable AI is Advancing Macro Intelligence for Complex Global Markets
Artificial Intelligence Exclusive
warehouse accidents
Data Analytics and the Future of Warehouse Safety
Analytics Commentary Exclusive
stock investing and data analytics
How Data Analytics Supports Smarter Stock Trading Strategies
Analytics Exclusive

Stay Connected

1.2KFollowersLike
33.7KFollowersFollow
222FollowersPin

You Might also Like

A Quick Look Back at Partners 2008

5 Min Read
insurance data
Big Data

How Insurers Evaluate Data and Incorporate it Into their Business Model

6 Min Read

The small impact of business rules on the big players

3 Min Read
cloud computing collaboration
Big DataBusiness IntelligenceCloud ComputingCollaborative DataData Management

Cloud-Based BI Dramatically Improves Collaboration

3 Min Read

SmartData Collective is one of the largest & trusted community covering technical content about Big Data, BI, Cloud, Analytics, Artificial Intelligence, IoT & more.

ai chatbot
The Art of Conversation: Enhancing Chatbots with Advanced AI Prompts
Chatbots
AI chatbots
AI Chatbots Can Help Retailers Convert Live Broadcast Viewers into Sales!
Chatbots

Quick Link

  • About
  • Contact
  • Privacy
Follow US
© 2008-25 SmartData Collective. All Rights Reserved.
Go to mobile version
Welcome Back!

Sign in to your account

Username or Email Address
Password

Lost your password?