SmartData Collective

Why Nobody Is Actually Analyzing Unstructured Data

Bill Franks

Unstructured data has been a very popular topic lately, since so many big data sources are unstructured. However, an important nuance is often missed: virtually no analytics directly analyze unstructured data.

Unstructured data may be an input to an analytic process, but when it comes time to do any actual analysis, the unstructured data itself isn’t utilized. “How can that be?” you ask. Let me explain…

Let’s start with the example of fingerprint matching. If you watch shows like CSI, you see them match up fingerprints all the time. A fingerprint image is totally unstructured and can also be fairly large if the image is of high quality. So, when police on TV or in real life go to match fingerprints, do they compare the actual images? No. They first identify a set of important points on each print. Then, a map or polygon is created from those points. It is the map or polygon created from the prints that is actually matched.

More important is the fact that the map or polygon is fully structured and small in size, even though the original prints were not. While unstructured prints are an input to the process, the actual analysis to match them up doesn’t use the unstructured images, but rather structured information extracted from them.
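
To make the idea concrete, here is a minimal Python sketch. It is not how real fingerprint systems work; it assumes the important points have already been identified and are given as (x, y) coordinates, and it uses a hypothetical "signature" (the sorted distances between every pair of points) as a stand-in for the map or polygon. The function names, sample points, and tolerance are all illustrative assumptions.

```python
from itertools import combinations
from math import dist


def signature(points, ndigits=1):
    """Reduce a set of points to a small structured summary:
    the sorted, rounded distances between every pair of points."""
    return sorted(round(dist(p, q), ndigits) for p, q in combinations(points, 2))


def same_shape(points_a, points_b, tolerance=0.5):
    """Compare two signatures value by value (hypothetical matching rule)."""
    sig_a, sig_b = signature(points_a), signature(points_b)
    return len(sig_a) == len(sig_b) and all(
        abs(a - b) <= tolerance for a, b in zip(sig_a, sig_b)
    )


# Made-up points standing in for what would be extracted from three images.
print_a = [(0, 0), (4, 0), (4, 3), (1, 5)]
print_b = [(x + 10, y + 20) for x, y in print_a]  # same shape, shifted on the page
print_c = [(0, 0), (6, 0), (6, 1), (2, 7)]        # a different shape

match_ab = same_shape(print_a, print_b)
match_ac = same_shape(print_a, print_c)
```

Whatever the size of the original images, the comparison here only ever touches six numbers per print.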

An example everyone will appreciate is the analysis of text. Let’s consider the now-popular approach of social media sentiment analysis. Are tweets, Facebook posts, and other social comments directly analyzed to determine their sentiment? Not really. The text is parsed into words or phrases. Then, those words and phrases are flagged as good or bad.

In a simple example, perhaps a “good” word gets a “1”, a “bad” word gets a “-1”, and a “neutral” word gets a “0”. The sentiment of the posting is determined by the sum of the individual word or phrase scores. Therefore, the sentiment score itself is created from fully structured numeric data that was derived from the initially unstructured source text. Any further analysis on trends or patterns in sentiment is based fully on the structured, numeric summaries of the text, not the text itself.
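
The simple scoring scheme just described can be sketched in a few lines of Python. The word lists and sample postings below are made up for illustration; a real sentiment lexicon would be far larger and would likely handle phrases and negation as well.

```python
import re

# Hypothetical word lists, for illustration only.
GOOD_WORDS = {"love", "great", "happy"}
BAD_WORDS = {"hate", "terrible", "slow"}


def word_scores(text):
    """Parse the text into lowercase words and score each one 1, -1, or 0."""
    words = re.findall(r"[a-z']+", text.lower())
    return [
        (word, 1 if word in GOOD_WORDS else -1 if word in BAD_WORDS else 0)
        for word in words
    ]


def sentiment(text):
    """Sum the individual word scores to get the posting's sentiment."""
    return sum(score for _, score in word_scores(text))


posts = [
    "I love this phone, the camera is great",
    "Terrible battery and slow updates",
    "It arrived on Tuesday",
]
scores = [sentiment(post) for post in posts]
print(scores)  # [2, -2, 0]
```

Any trend or pattern analysis from this point on works with the list of numbers in `scores`, not with the postings.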

This same logic applies across the board. If you’re going to build a propensity model to predict customer behavior, you’re going to have to transform your unstructured data into structured, numeric extracts. That’s what the vast majority of analytic algorithms require. An argument can be made that extracting structured information from an unstructured source is a form of analysis itself. However, my point is simply that the final analysis, which is what started the process of acquiring the unstructured data to begin with, does not use the unstructured data. It uses the structured information that has been extracted from it. This is an important nuance.
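
As a rough sketch of what such a transformation might look like, the Python below turns free-text customer notes into one numeric row per customer, the kind of input a propensity model could consume. The customer IDs, notes, keyword groups, and feature names are all hypothetical; a real extraction process would be tailored to the business problem.

```python
import re

# Hypothetical keyword groups; each becomes one numeric column.
KEYWORDS = {
    "mentions_price": {"price", "expensive", "cost"},
    "mentions_cancel": {"cancel", "refund"},
}


def extract_features(text):
    """Turn one piece of free text into a fixed set of numeric fields."""
    words = re.findall(r"[a-z']+", text.lower())
    row = {"word_count": len(words)}
    for name, terms in KEYWORDS.items():
        row[name] = sum(1 for word in words if word in terms)
    return row


# Made-up customer notes standing in for an unstructured source.
notes = {
    "cust_001": "Too expensive for what it does. I want to cancel and get a refund.",
    "cust_002": "Works fine, no complaints about the price.",
}

# One structured row per customer, ready to join to other customer data.
feature_table = {cust_id: extract_features(text) for cust_id, text in notes.items()}
```

Each row has the same three numeric fields no matter how long the original note was, which is what makes it easy to feed into standard modeling tools.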

One reason it is important is that it gets to the heart of how to handle unstructured big data sources in the long run. Clearly, some new tools can be useful to aid in the initial processing of unstructured data. However, once the information extraction step is complete, you’re left with a set of data that is fully structured and, typically, much smaller than what you had when you started. This makes the information much easier to incorporate into analytic processes and standard tools than most people think.

Through an appropriate information extraction process, a big data source can shrink to a much more manageable size and format. At that point, you can proceed with your analytics as usual. For this reason, the thought of using unstructured data really shouldn’t intimidate people as much as it often does.

Originally published by the International Institute for Analytics

Bill Franks is Chief Analytics Officer for The International Institute For Analytics (IIA). Franks is also the author of Taming The Big Data Tidal Wave and The Analytics Revolution. His work has spanned clients in a variety of industries for companies ranging in size from Fortune 100 companies to small non-profit organizations. You can learn more at http://www.bill-franks.com.