Cookies help us display personalized product recommendations and ensure you have great shopping experience.

By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
SmartData CollectiveSmartData Collective
  • Analytics
    AnalyticsShow More
    big data analytics in transporation
    Turning Data Into Decisions: How Analytics Improves Transportation Strategy
    3 Min Read
    sales and data analytics
    How Data Analytics Improves Lead Management and Sales Results
    9 Min Read
    data analytics and truck accident claims
    How Data Analytics Reduces Truck Accidents and Speeds Up Claims
    7 Min Read
    predictive analytics for interior designers
    Interior Designers Boost Profits with Predictive Analytics
    8 Min Read
    image fx (67)
    Improving LinkedIn Ad Strategies with Data Analytics
    9 Min Read
  • Big Data
  • BI
  • Exclusive
  • IT
  • Marketing
  • Software
Search
© 2008-25 SmartData Collective. All Rights Reserved.
Reading: Social Media Analytics: Performance Measurement Done Right
Share
Notification
Font ResizerAa
SmartData CollectiveSmartData Collective
Font ResizerAa
Search
  • About
  • Help
  • Privacy
Follow US
© 2008-23 SmartData Collective. All Rights Reserved.
SmartData Collective > Big Data > Data Quality > Social Media Analytics: Performance Measurement Done Right
AnalyticsBusiness IntelligenceData QualitySentiment AnalyticsSocial DataText AnalyticsUnstructured Data

Social Media Analytics: Performance Measurement Done Right

psresnik
psresnik
8 Min Read
SHARE

By Philip Resnik, Ph.D. Lead Scientist at Converseon & ConveyAPI.

By Philip Resnik, Ph.D. Lead Scientist at Converseon & ConveyAPI.

For a buyer of social media analytics, comparing the performance of various technologies is nothing short of baffling. This is especially true with respect to sentiment analysis — indeed text analytics in general — where scientific jargon, marketing puffery, and a laundry list of features can often obscure what really matters: using a technology meant to measure human expression, are we obtaining the value of a human analysis?

More Read

Sentiment – A Potential Mirage on the Data Horizon
An API Bridge Between Big Data And Digital Marketing
BI 2010 – Optimizing revenue collection
Using Web 2.0 for Analytics 2.0
Predicting BI trends and saying you’re sorry

This notion of human performance as the ultimate goal is based on an important observation: when people analyze social media, we get valuable results.

When we built our social text analytics solutions, we recognized that, if only we could somehow take a few thousand people, shrink them and put them into a little box, and then get them to work thousands of times faster (to deal with seriously big data), we would have an incredible solution to our clients’ problems. Yes, people do make mistakes, and they disagree with each other about things. (Consider: “At this price point, I guess the smartphone meets the minimum requirements”. Three different people might fairly call this either positive or negative or neutral.) But even though human performance is imperfect, we know from our long-tested experience that human analysis provides all kinds of value that clients need.

So, when building and benchmarking our social media analysis technology, we set our sights on how close our system could get to human performance. One doesn’t need the technology to be 100% perfect, because people aren’t perfect, and we know people can get the job done just fine. (See the second paragraph again.) The right goal is for the technology to be as good as people.1

With that in mind, here’s how we’re approaching the measurement challenge. The first step is to figure out how well people can do at the analysis we care about, so we know what we’re aiming for. How can you do that? Well, take someone’s analysis and have a second person judge it. Hmm. Wait a second. How do we judge whether the second person is a good judge? Add a third person to judge the second person. How do you now judge whether the third person is a good — Uh oh. You see the problem.

The problem is that there’s no ultimate, ideal judge at the end of the line. Nobody’s perfect. (But that’s ok, because we know that when people do the job, it delivers great value despite those imperfections. See that second paragraph yet again.) As it turns out, there’s a different solution: let your three people take turns judging each other. Here’s how it works. Treat Person 1’s analysis as “truth”, and see how Persons 2 and 3 do. Then treat Person 2’s analysis as truth, and see how Persons 1 and 3 do. Then treat Person 3’s analysis as truth, and see how Persons 1 and 2 do. It turns out that if we take turns allowing each person to define the “true” analysis for the others, and then average out the results, we’ll get a statistically reliable number for human performance — without ever having to pick any one of them as the person who holds the ultimate “truth”. This will give us a number that we can call the average human performance. 2

If we want to know if our system is good, we’ll compare how it does to average human performance. It’s the same turn-taking idea all over again, this time comparing system to humans rather than comparing humans to humans. That is: Treat Person 1’s analysis as “truth” and see how the system does. Do it again with Person 2 as “truth”. And Person 3. Average those three numbers, and we’ve got raw system performance.

The final step: what we really want to know is, how close is the raw system performance to average human performance? To get this you divide the former by the latter to get percentage of human performance. For example, let’s suppose that the average human performance is 74%. That is, on average, humans agree with each other 74% of the time. (If that number seems low, yes, you guessed it; second paragraph.) Suppose Systems A and B turn in raw system performances of 69% and 59%, respectively. Is one system really better than the other? How can you tell? System A is achieving 69/74 = 93% of human performance. System B achieves 59/74 = 80% of human performance. Out of all this numbers soup comes something that you can translate into understandable terms: System A is within spitting distance of human performance, but System B isn’t even within shouting distance. System A is better. 3

What we’ve just described is a rigorous and transparent method for evaluating the performance of social analytics methods. When you’re evaluating technologies on your short list, we suggest you use this approach, too.

If you don’t have the resources for such a rigorous comparison, let us know, and we’ll lend you a hand.


1 In a seminal  paper about evaluation of language technology, Gale, Church, and Yarowsky established the idea of benchmarking systems against an upper bound defined by “the ability for human judges to agree with one another.” That’s been the standard in the field ever since. (William Gale, Kenneth Ward Church, and David Yarowsky. 1992. Estimating upper and lower bounds on the performance of word-sense disambiguation programs. In Proceedings of the 30th annual meeting on Association for Computational Linguistics (ACL ’92). Association for Computational Linguistics, Stroudsburg, PA, USA, 249-256. DOI=10.3115/981967.981999 http://dx.doi.org/10.3115/981967.981999).

2 This is an instance of a general statistical technique called cross validation.
3 You’re about to ask how we decide that 93% is “spitting distance” and 80% isn’t, aren’t you?  Fair enough.   But we never said that the buyer’s judgment wasn’t going to be important.   Our point is that you should be asking 93% of what and 80% of what, and the what should be defined in terms of the goal that matters to you.  If what you’re after is human-quality analysis, then percentage of  human performance is the right measure.  Subjectively we’ve found that if a system isn’t comfortably over 90% on this measure, it might be faster and more scalable, but it’s not providing the kind of quality that yields genuine insights for buyers.

TAGGED:APImeasurementsentiment analysistext analytics
Share This Article
Facebook Pinterest LinkedIn
Share

Follow us on Facebook

Latest News

AI role in medical industry
The Role Of AI In Transforming Medical Manufacturing
Artificial Intelligence Exclusive
b2b sales
Unseen Barriers: Identifying Bottlenecks In B2B Sales
Business Rules Exclusive Infographic
data intelligence in healthcare
How Data Is Powering Real-Time Intelligence in Health Systems
Big Data Exclusive
intersection of data
The Intersection of Data and Empathy in Modern Support Careers
Big Data Exclusive

Stay Connected

1.2kFollowersLike
33.7kFollowersFollow
222FollowersPin

You Might also Like

Image
Data Mining

Mining for consumer behavior in Tweets

5 Min Read
Image
ExclusivePredictive Analytics

Inside a Consumer’s Mind with Text Analytics

5 Min Read

Social sentiment matters!

4 Min Read

Enabling Exploration Through Text Analytics

4 Min Read

SmartData Collective is one of the largest & trusted community covering technical content about Big Data, BI, Cloud, Analytics, Artificial Intelligence, IoT & more.

ai chatbot
The Art of Conversation: Enhancing Chatbots with Advanced AI Prompts
Chatbots
ai is improving the safety of cars
From Bolts to Bots: How AI Is Fortifying the Automotive Industry
Artificial Intelligence

Quick Link

  • About
  • Contact
  • Privacy
Follow US
© 2008-25 SmartData Collective. All Rights Reserved.
Go to mobile version
Welcome Back!

Sign in to your account

Username or Email Address
Password

Lost your password?