Watson Analytics: The Data Scientist Accelerator

September 19, 2014

On Tuesday, September 16, in Times Square, New York, IBM announced the forthcoming release of its Watson Analytics platform. This freemium SaaS offering is positioned to be a *true* game-changer in the fields of big data analytics, business intelligence, and data science. As a pure SaaS solution, it requires no bulky installs on your PC. It’s touted as a “data scientist accelerator, not a data scientist replacement”.


As a data scientist myself, I am super excited that there will soon be a single tool I can turn to for help in quickly completing every step of the data science process, from data clean-up and data modeling to model evaluation and data visualization.

Watson Analytics doesn’t replace the data scientist. It frees the data scientist from boring, tedious tasks so that he or she has more time to innovate.

Here’s how Watson Analytics works… You upload your data to the platform for free and then allow the platform to suggest changes and otherwise help you clean your dataset. Once the data is clean, you can ask Watson Analytics a question about that data and it will begin working its magic to get you quick and concise answers. As part of that “magic”, Watson Analytics accesses and runs all compatible algorithms that comprise the SPSS Modeler workbench. It then scores those models, returns results, produces visualizations, and allows you to explore and interact to get a deeper understanding of your data insights. Let me explain in a little more detail exactly how all of this works…


IBM Distinguished Engineers: Robin Grosset, Dan Wolfson, Jing Shyr, and Greg Adams (left to right)
 

Watson Analytics uses machine learning to reduce your data clean-up times

Watson Analytics tracks and greatly simplifies the process of making changes to your data during data clean-up. As you continue using the tool, it learns more and more about your data and about the changes you typically make. Over time, it starts making those changes for you, so that your data clean-up tasks are increasingly automated! To safeguard this process, the tool documents and archives every change that is made to your data so that you can go in and undo any changes you don’t like.
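To make the "document every change so you can undo it" idea concrete, here is a minimal sketch in Python. It is not how Watson Analytics is implemented; the `Change` and `CleaningLog` names and the sample rows are hypothetical, and the sketch only covers the audit trail, not the learning part.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Change:
    """One recorded edit: which cell changed, and its old and new values."""
    row: int
    column: str
    old: Any
    new: Any


@dataclass
class CleaningLog:
    """Hypothetical wrapper that logs every edit made to a small dataset."""
    rows: List[Dict[str, Any]]
    history: List[Change] = field(default_factory=list)

    def set_value(self, row: int, column: str, new: Any) -> None:
        # Record the previous value before overwriting it.
        old = self.rows[row].get(column)
        self.history.append(Change(row, column, old, new))
        self.rows[row][column] = new

    def undo(self) -> Optional[Change]:
        # Revert the most recent edit, if there is one.
        if not self.history:
            return None
        change = self.history.pop()
        self.rows[change.row][change.column] = change.old
        return change


# Made-up sample data for illustration only.
data = CleaningLog(rows=[
    {"city": " new york", "sales": "1200"},
    {"city": "Boston", "sales": "n/a"},
])
data.set_value(0, "city", "New York")   # a fix we want to keep
data.set_value(1, "sales", None)        # a fix we change our mind about
data.undo()                             # restores "n/a" in row 1
```

Because each edit stores the old value alongside the new one, any change can be rolled back without keeping full copies of the dataset.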


Watson Analytics uses machine learning to quicken data clean-up times
 

Watson Analytics uses cognitive computing and natural language processing to answer your questions

Where does Watson’s cognitive computing power fit into the Watson Analytics solution? Well, the platform uses natural language processing and cognitive computing to understand the question you have about your data and to quickly find answers. To ask a question of your data, all you have to do is type that question into the application interface and wait for your results.


Watson Analytics uses NLP and cognitive computing
 

On first appearance, the interface seems almost like a search engine tool that queries the internet and returns matches, but Watson Analytics is actually quite distinct from that technology. Watson Analytics relies on natural language processing and text parsing to isolate and understand your question, and then uses its cognitive Watson engine to go into your dataset and generate models and visualizations that provide exact answers to that question.

Watson Analytics uses SPSS Modeler algorithms to auto-model your data

Like I said, Watson Analytics runs on the SPSS Modeler algorithms. Within that suite, it runs whichever algorithms are compatible with your underlying dataset and then scores each model to tell you how well it performed. Basically, it completes your pre-modeling for you and tells you which algorithms perform best. You can then use that information to help you quickly build more advanced and customized data science models.
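The "run every candidate, score each one, and rank them" pattern described above can be sketched in a few lines of plain Python. This is only an illustration of the general idea, not the SPSS Modeler algorithms or the Watson Analytics scoring method; the two toy models, the synthetic data, the choice of R² on a held-out split, and all function names are assumptions.

```python
import random
import statistics


def fit_mean(xs, ys):
    """Baseline model: always predict the training mean."""
    mean_y = statistics.fmean(ys)
    return lambda x: mean_y


def fit_linear(xs, ys):
    """Simple least-squares line through the training points."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def r_squared(model, xs, ys):
    """Score a fitted model on data it has not seen (higher is better)."""
    mean_y = statistics.fmean(ys)
    ss_res = sum((y - model(x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def rank_models(candidates, train, test):
    """Fit every candidate on train, score it on test, sort best-first."""
    scores = [(name, r_squared(fit(*train), *test))
              for name, fit in candidates.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Synthetic data for illustration: a noisy straight line.
rng = random.Random(0)
xs = [float(i) for i in range(40)]
ys = [3.0 * x + 5.0 + rng.gauss(0, 2.0) for x in xs]
train = (xs[::2], ys[::2])     # even-indexed points for fitting
test = (xs[1::2], ys[1::2])    # odd-indexed points for scoring

ranking = rank_models(
    {"mean baseline": fit_mean, "linear regression": fit_linear},
    train, test,
)
for name, score in ranking:
    print(f"{name}: R^2 = {score:.3f}")
```

The ranked list is the useful output here: it tells you which family of model is worth refining by hand, which is the "pre-modeling" step the platform is described as automating.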

Watson Analytics doesn’t just give you one answer to the question you have about your data. Rather, it provides you detailed, scored results about several available answers and then allows you to decide for yourself which you think is the most appropriate.


“Watson Analytics is not deterministic, it’s probabilistic”… It’s “a new partnership between man and machine” – Steve Gold


Watson Analytics generates visualizations to help you understand your data and tell your data-driven story

To help you understand the insights it derives and decide which model you prefer, Watson Analytics generates both static and interactive data visualizations. Chief among these is a patented, interactive bull’s eye visualization.


The Watson Analytics bull’s eye 
 

Jing Shyr is the Distinguished Engineer in charge of the statistical component of the Watson Analytics platform. She explained that the interface was designed to be light and fun to use! The bull’s eye tool places your question, as the key focus of the analysis, at the very center of the visualization. The interactive component of the tool lets you adjust variables in your dataset and see how those adjustments affect the answers to your target question. The bull’s eye design also helps you narrow your question and stay focused on exactly what you’re asking of your data. As we all know, when working with big datasets, staying narrowly focused and mindful of your goal can be a challenge, and IBM designed the Watson Analytics interface to help with that.

Not only does Watson Analytics offer you interactive and static visualizations to help you make sense of the data insights the platform generates from your data, it also provides key offerings to help you communicate those insights to your audience. The platform will automatically generate infographics and other, more quantitative visualization types in a visual story-telling format so that you can easily and quickly convey results to your target audience.

“A smart visualization is worth a million data points, or more.” – Greg Adams, Distinguished Data Visualization Engineer

Lastly, Watson Analytics will offer a content store that you can use to access the pre-ingested external data you need to help put your internal data into perspective. This vastly simplifies the process of creating comparative data visualizations and data mashups for use in presentations or even in data journalism publications.

Watson Analytics can handle big data’s volume, velocity, and variety

Watson Analytics is a cloud analytics solution that was designed to ingest and process both structured and unstructured data from a variety of sources. The platform will be accessible to data scientists and developers through Big Blue. Watson Analytics will be released sometime in late 2014. To keep up with its release date, keep an eye on the IBM Big Data and Analytics Hub for the latest news.

 Kevin Winterfield, Lillian Pierson, Aaron Waltz, Matt Carter (left to right)

 

Lastly, I’d like to give a HUGE THANK YOU to Kevin Winterfield and Matt Carter for inviting me out to learn about IBM’s Watson Analytics solution.