The Technology behind Social Media Analytics – An interview with Greg Greenstreet, CTO, SVP Engineering of Collective Intellect

Jennifer Roberts
Last updated: 2011/02/02 at 5:51 PM
Recently, I had the great opportunity to sit down with Greg Greenstreet, our CTO and SVP of Engineering here at Collective Intellect. So many of our recent blog posts have been about the uses of social media analytics or trends and insight in Social CRM that we thought it might be a good time to talk about the technology behind Collective Intellect. Greg had a lot of patience with me as he described the differences between semantic analysis, Boolean search and natural language processing. We talked about why data accuracy is more important than data integrity and the trends he sees in the future. The interview is divided into two parts: the first covers how our technology works, and the second is devoted to how the data is organized and configured to surface trends, themes and audience traits and profiles.

What’s the biggest change you have seen in social media analytics and what are the different technologies being used to analyze social media conversations?

There are a couple of significant changes I’ve seen developing over the past 12 to 18 months. Today, in order to remain relevant and competitive, sophisticated and analytically savvy organizations must move beyond the awareness metrics provided by early monitoring and analytic technologies and pursue in-depth, contextually relevant information.

About a year ago, many companies that were just getting started used monitoring platforms that relied on basic keywords or Boolean term expressions, which were easy to use and implement. But they quickly learned that these types of tools have short-lived value, especially if the analysis involves ambiguous language. These types of solutions presume you know all the terms that might be used to refer to a specific topic. If you look at a term like “Crocs”, which can refer to the popular shoe or the reptile in a conversation, you’d have to continuously include or exclude content on the basis of keyword matching, because keyword matching alone fails to disambiguate the meaning of terms.
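To make the “Crocs” problem concrete, here is a minimal sketch of a Boolean include/exclude filter. The function names, the sample posts and the exclusion terms are all made up for illustration; this is not how any particular monitoring platform is implemented.

```python
import re

def tokenize(text):
    """Lowercase a post and return its set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))

def boolean_match(text, include, exclude=()):
    """True if the post contains any include term and no exclude term."""
    tokens = tokenize(text)
    return bool(tokens & set(include)) and not (tokens & set(exclude))

# Hypothetical sample posts: one about shoes, two about reptiles.
posts = [
    "When is the Crocs craze going to end? People wear Crocs with socks.",
    "These baby crocs may look like cute pets, but beware.",
    "Saw huge crocs at the zoo today.",
]

# A bare keyword matches every post, shoe and reptile alike.
keyword_only = [boolean_match(p, include={"crocs"}) for p in posts]

# Each exclusion term patches one false positive that was already seen.
with_exclusions = [
    boolean_match(p, include={"crocs"}, exclude={"baby", "pets", "zoo"})
    for p in posts
]

print(keyword_only)
print(with_exclusions)
```

The exclusion list only covers the reptile posts seen so far, and it cuts both ways: excluding “baby” would also drop a legitimate post about baby-sized Crocs shoes.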

Some monitoring tools use Linguistics Rules-Based NLP techniques in a further attempt to disambiguate content. This technique can be costly, both in the time needed to develop the complex models involved and in the time it takes to process each textual item. It also requires additional linguistics rule sets any time the context of conversation shifts, making it difficult to apply to unstructured textual data sets like social media. Collective Intellect’s solution addresses the inaccuracy and bluntness of keyword search and the speed and cost disadvantages of NLP techniques through the use of advanced statistical language modeling.

Let’s talk about the technology CI uses in CI:Insight, our social media analytics tool. What makes it different than keyword search or NLP techniques?

CI’s semantic engine is based on LSA (latent semantic analysis), an advanced form of statistical language modeling. LSA is a method for exposing latent contextual meaning within a large body of text. It does this by looking at word usage (specifically, word co-occurrence) within a set of documents. Words that appear in similar contexts are assumed to have similar meaning and/or relational significance.

LSA constructs a large matrix of term-document association data. Each cell in the matrix contains a weighted value, which is proportional to the number of times each term appears within each document in the set. The weights are structured such that rarer terms have greater weights. This allows more relevant terms to carry more weight, producing more accurate vectors of how consumers are talking about a category, brand or product. This technique deciphers the relationships and correlations between words and plots where they dimensionally reside in proximity to a specific topic of interest.

LSA extracts specialized language features from a large data set and selects conversations based on their meaning. By isolating the contextual meaning of a topic, semantic filtering minimizes miscategorizations (false positives) and inappropriate rejections (false negatives) that can otherwise occur when using other techniques and technologies. The resulting categorization is more relevant and pertinent to a research query. LSA learns in much the way the human brain does, by recognizing the context of language from all of the previous times it has seen a term within that context. This produces a technology that can accurately disambiguate a term that is used in multiple contexts.
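The weighted term-document matrix described above can be sketched in a few lines. This is only an illustration under stated assumptions: TF-IDF is assumed as the weighting scheme (the interview says only that rarer terms get greater weights), the sample documents are invented, and the dimensionality-reduction step that LSA applies to the matrix (a truncated singular value decomposition) is left out entirely. CI's actual engine is proprietary and is not shown here.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase a document and split it into word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def weighted_term_document_matrix(docs):
    """Return {term: [weight per document]} using TF-IDF weights (an assumption)."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    n_docs = len(docs)
    matrix = {}
    for term in vocab:
        doc_freq = sum(1 for c in counts if term in c)
        # Rarer terms get a larger weight; a term in every document gets 0.
        idf = math.log(n_docs / doc_freq)
        matrix[term] = [c[term] * idf for c in counts]
    return matrix

def document_vector(matrix, index):
    """Read one column of the matrix: the weighted vector for one document."""
    return [row[index] for row in matrix.values()]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical mini-corpus: two shoe posts, two reptile posts.
docs = [
    "crocs shoes are comfy sandals",
    "comfy shoes and sandals for summer",
    "crocs in the river are dangerous reptiles",
    "reptiles in the river",
]

matrix = weighted_term_document_matrix(docs)
shoe_vs_shoe = cosine(document_vector(matrix, 0), document_vector(matrix, 1))
shoe_vs_reptile = cosine(document_vector(matrix, 0), document_vector(matrix, 2))

print(f"shoe post vs. shoe post:    {shoe_vs_shoe:.3f}")
print(f"shoe post vs. reptile post: {shoe_vs_reptile:.3f}")
```

Even without the decomposition step, the first shoe post scores closer to the other shoe post than to the reptile post that shares the word “crocs”, because the surrounding words differ. In full LSA, the decomposition is what lets documents that share no literal terms still land near each other.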

Take a look at the image. It illustrates the volume of invalid data received when relying on keyword or Boolean search, as compared to semantic filtering, using the common term “Goldfish” as it relates to the brand of crackers:

Now imagine trying to write a Boolean keyword expression to capture conversations about a topic categorization like ‘Crocs’, the shoes; the expression quickly becomes unmanageable as negative terms are added in an attempt to exclude references to ‘croc’odiles. By using a semantic filter, CI’s Social CRM Insight solution isolates content in the shoes and sandals category and employs a simple keyword search – “Crocs” – to categorize content without having to worry about false positives occurring from crocodiles.

Semantic Search

“…Speaking of comfy, when is the Crocs craze going to end? It’s winter, and although it isn’t snowing everywhere, it’s snowing in Ohio. Why are people wearing Crocs with several pairs of socks in order to keep dry? I could understand if they were Louboutin’s, but honestly people.”

Keyword Search

“THESE baby crocs may look like cute pets – but beware. Measuring 30cm they will eventually reach three meters in length and could live to 80…”

Once we have an accurate and robust sample, what happens next? How does the technology optimize the data for analysis?

CI’s technology is used in a compounding fashion, starting with topic categorization, moving to theme extraction, and then to trait extraction. CI’s semantic search and analytics technology is unique in its proprietary approach to how data is handled, categorized and measured for relevancy. The proprietary technologies isolate important attributes from groups of authors and reveal unique considerations and preferences, in addition to providing the ability to identify unknown associations occurring through natural online conversation.

We’ve talked a little about topic categorization but what do you mean by theme or trait extraction?

Let’s take theme extraction first.  Semantic analysis can be used to generate more meaningful themes associated with a topic. By coupling state-of-the-art clustering algorithms with semantic proximity measures, themes are derived by grouping semantically similar posts. This gives CI the ability to parse out various conversations occurring within a topic.
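As a rough illustration of grouping similar posts into themes, here is a minimal sketch. The interview does not name the clustering algorithm or the proximity measure CI uses, so this swaps in a naive greedy threshold clustering over raw word-count cosine similarity, which is a much cruder stand-in for semantic proximity. The stopword list, threshold and sample posts are all assumptions.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "is", "my", "on", "and", "to", "of", "in", "it"}

def vectorize(text):
    """Turn a post into a bag-of-words count vector without stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0

def cluster_posts(posts, threshold=0.4):
    """Put each post in the first cluster whose seed post is similar enough."""
    vectors = [vectorize(p) for p in posts]
    clusters = []  # lists of post indices; the first index is the seed
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])  # no cluster was close enough: start a new one
    return clusters

# Hypothetical posts within the topic "iPhone".
posts = [
    "iPhone battery drains fast",
    "battery life on the iPhone is poor",
    "iPhone camera takes great photos",
    "great photos from the new camera",
]

themes = cluster_posts(posts)
print(themes)
```

Here the two battery posts and the two camera posts fall into separate groups, which is the kind of split between conversations within a topic that theme extraction is meant to surface.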

CI’s semantic filtering technology produces more meaningful themes than the simple keyword term occurrence techniques seen in typical topic Tag Clouds, which produce meaningless lists of top terms by simple counts only. These techniques do not employ contextual relevancy and are therefore saddled with an inherent limitation: the inability to understand the conversations underlying a particular topic. For example, if the topic is iPhone, it could be expected that iPhone would emerge as the biggest “tag” (theme), which in and of itself renders the term meaningless, because a topic should not be a theme unto itself.
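The tag cloud limitation is easy to reproduce. The sketch below counts raw term occurrences across a handful of invented posts on the topic iPhone; the function name and the posts are hypothetical.

```python
import re
from collections import Counter

def tag_cloud(posts, top_n=3):
    """Rank terms by raw occurrence count across all posts."""
    counts = Counter()
    for post in posts:
        counts.update(re.findall(r"[a-z]+", post.lower()))
    return counts.most_common(top_n)

# Hypothetical posts collected for the topic "iPhone".
posts = [
    "iPhone battery drains fast",
    "battery life on the iPhone is poor",
    "iPhone camera takes great photos",
    "great photos from the new camera",
]

top_tag = tag_cloud(posts, top_n=1)
print(top_tag)
```

The top “tag” is the topic itself, and filler words such as “the” tie with the terms that actually describe what people are saying.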

Using CI’s semantic themes, you can see true clusters of conversation based on meaning and then use those themes to create filters. As you create filters by accepting or rejecting themes, you can immediately test the filters to see if they are targeting the exact content you want. You can continue to iterate, adding themes as accept or reject filters, until only the content you want is passing through. Once you have refined a set of filters that produce accurate data, you can then apply them to larger datasets or use them to categorize content as a ‘topic’ in CI’s continual stream of comprehensive social media data.
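The accept/reject iteration might look something like the sketch below. It assumes each post has already been tagged with a set of theme labels; the data layout, the theme names and the function are hypothetical and are not taken from CI's product.

```python
def apply_theme_filters(posts, accept, reject):
    """Keep posts with at least one accepted theme and no rejected theme."""
    return [
        p for p in posts
        if (p["themes"] & accept) and not (p["themes"] & reject)
    ]

# Hypothetical posts, each already labeled with themes.
posts = [
    {"text": "Crocs with socks in the snow", "themes": {"footwear", "winter"}},
    {"text": "Baby crocs at the reptile park", "themes": {"reptiles"}},
    {"text": "Croc-print leather boots", "themes": {"footwear", "reptiles"}},
]

# Round 1: accept one theme, inspect what passes.
round_one = apply_theme_filters(posts, accept={"footwear"}, reject=set())

# Round 2: the results still include an unwanted post, so add a reject theme.
round_two = apply_theme_filters(posts, accept={"footwear"}, reject={"reptiles"})

print([p["text"] for p in round_one])
print([p["text"] for p in round_two])
```

Each round narrows the result set, and the refined accept and reject sets could then be reused on a larger dataset.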

Next week, we’ll dig deeper into trait extraction, customer preferences, profiles and insights and the future of integrated social media analytics.
