Cookies help us display personalized product recommendations and ensure you have great shopping experience.

By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
SmartData CollectiveSmartData Collective
  • Analytics
    AnalyticsShow More
    data analytics for pharmacy trends
    How Data Analytics Is Tracking Trends in the Pharmacy Industry
    5 Min Read
    car expense data analytics
    Data Analytics for Smarter Vehicle Expense Management
    10 Min Read
    image fx (60)
    Data Analytics Driving the Modern E-commerce Warehouse
    13 Min Read
    big data analytics in transporation
    Turning Data Into Decisions: How Analytics Improves Transportation Strategy
    3 Min Read
    sales and data analytics
    How Data Analytics Improves Lead Management and Sales Results
    9 Min Read
  • Big Data
  • BI
  • Exclusive
  • IT
  • Marketing
  • Software
Search
© 2008-25 SmartData Collective. All Rights Reserved.
Reading: Social Media Analytics: Performance Measurement Done Right
Share
Notification
Font ResizerAa
SmartData CollectiveSmartData Collective
Font ResizerAa
Search
  • About
  • Help
  • Privacy
Follow US
© 2008-23 SmartData Collective. All Rights Reserved.
SmartData Collective > Big Data > Data Quality > Social Media Analytics: Performance Measurement Done Right
AnalyticsBusiness IntelligenceData QualitySentiment AnalyticsSocial DataText AnalyticsUnstructured Data

Social Media Analytics: Performance Measurement Done Right

psresnik
psresnik
8 Min Read
SHARE

By Philip Resnik, Ph.D. Lead Scientist at Converseon & ConveyAPI.

By Philip Resnik, Ph.D. Lead Scientist at Converseon & ConveyAPI.

For a buyer of social media analytics, comparing the performance of various technologies is nothing short of baffling. This is especially true with respect to sentiment analysis — indeed text analytics in general — where scientific jargon, marketing puffery, and a laundry list of features can often obscure what really matters: using a technology meant to measure human expression, are we obtaining the value of a human analysis?

More Read

The Disconnect Between How Vendors and Strategists Approach Social CRM
Increased Demands for BI within Healthcare IT
Deep Learning Would Be Crucial Under Sanders’s Medicare for All System
Using Data Analysis to Improve and Verify the Customer Experience and Bad Reviews
HR Analytics is the Basis of New Workforce Management Software

This notion of human performance as the ultimate goal is based on an important observation: when people analyze social media, we get valuable results.

When we built our social text analytics solutions, we recognized that, if only we could somehow take a few thousand people, shrink them and put them into a little box, and then get them to work thousands of times faster (to deal with seriously big data), we would have an incredible solution to our clients’ problems. Yes, people do make mistakes, and they disagree with each other about things. (Consider: “At this price point, I guess the smartphone meets the minimum requirements”. Three different people might fairly call this either positive or negative or neutral.) But even though human performance is imperfect, we know from our long-tested experience that human analysis provides all kinds of value that clients need.

So, when building and benchmarking our social media analysis technology, we set our sights on how close our system could get to human performance. One doesn’t need the technology to be 100% perfect, because people aren’t perfect, and we know people can get the job done just fine. (See the second paragraph again.) The right goal is for the technology to be as good as people.1

With that in mind, here’s how we’re approaching the measurement challenge. The first step is to figure out how well people can do at the analysis we care about, so we know what we’re aiming for. How can you do that? Well, take someone’s analysis and have a second person judge it. Hmm. Wait a second. How do we judge whether the second person is a good judge? Add a third person to judge the second person. How do you now judge whether the third person is a good — Uh oh. You see the problem.

The problem is that there’s no ultimate, ideal judge at the end of the line. Nobody’s perfect. (But that’s ok, because we know that when people do the job, it delivers great value despite those imperfections. See that second paragraph yet again.) As it turns out, there’s a different solution: let your three people take turns judging each other. Here’s how it works. Treat Person 1’s analysis as “truth”, and see how Persons 2 and 3 do. Then treat Person 2’s analysis as truth, and see how Persons 1 and 3 do. Then treat Person 3’s analysis as truth, and see how Persons 1 and 2 do. It turns out that if we take turns allowing each person to define the “true” analysis for the others, and then average out the results, we’ll get a statistically reliable number for human performance — without ever having to pick any one of them as the person who holds the ultimate “truth”. This will give us a number that we can call the average human performance. 2

If we want to know if our system is good, we’ll compare how it does to average human performance. It’s the same turn-taking idea all over again, this time comparing system to humans rather than comparing humans to humans. That is: Treat Person 1’s analysis as “truth” and see how the system does. Do it again with Person 2 as “truth”. And Person 3. Average those three numbers, and we’ve got raw system performance.

The final step: what we really want to know is, how close is the raw system performance to average human performance? To get this you divide the former by the latter to get percentage of human performance. For example, let’s suppose that the average human performance is 74%. That is, on average, humans agree with each other 74% of the time. (If that number seems low, yes, you guessed it; second paragraph.) Suppose Systems A and B turn in raw system performances of 69% and 59%, respectively. Is one system really better than the other? How can you tell? System A is achieving 69/74 = 93% of human performance. System B achieves 59/74 = 80% of human performance. Out of all this numbers soup comes something that you can translate into understandable terms: System A is within spitting distance of human performance, but System B isn’t even within shouting distance. System A is better. 3

What we’ve just described is a rigorous and transparent method for evaluating the performance of social analytics methods. When you’re evaluating technologies on your short list, we suggest you use this approach, too.

If you don’t have the resources for such a rigorous comparison, let us know, and we’ll lend you a hand.


1 In a seminal  paper about evaluation of language technology, Gale, Church, and Yarowsky established the idea of benchmarking systems against an upper bound defined by “the ability for human judges to agree with one another.” That’s been the standard in the field ever since. (William Gale, Kenneth Ward Church, and David Yarowsky. 1992. Estimating upper and lower bounds on the performance of word-sense disambiguation programs. In Proceedings of the 30th annual meeting on Association for Computational Linguistics (ACL ’92). Association for Computational Linguistics, Stroudsburg, PA, USA, 249-256. DOI=10.3115/981967.981999 http://dx.doi.org/10.3115/981967.981999).

2 This is an instance of a general statistical technique called cross validation.
3 You’re about to ask how we decide that 93% is “spitting distance” and 80% isn’t, aren’t you?  Fair enough.   But we never said that the buyer’s judgment wasn’t going to be important.   Our point is that you should be asking 93% of what and 80% of what, and the what should be defined in terms of the goal that matters to you.  If what you’re after is human-quality analysis, then percentage of  human performance is the right measure.  Subjectively we’ve found that if a system isn’t comfortably over 90% on this measure, it might be faster and more scalable, but it’s not providing the kind of quality that yields genuine insights for buyers.

TAGGED:APImeasurementsentiment analysistext analytics
Share This Article
Facebook Pinterest LinkedIn
Share

Follow us on Facebook

Latest News

intersection of data and patient care
How Healthcare Careers Are Expanding at the Intersection of Data and Patient Care
Big Data Exclusive
dedicated servers for ai businesses
5 Reasons AI-Driven Business Need Dedicated Servers
Artificial Intelligence Exclusive News
data analytics for pharmacy trends
How Data Analytics Is Tracking Trends in the Pharmacy Industry
Analytics Big Data Exclusive
ai call centers
Using Generative AI Call Center Solutions to Improve Agent Productivity
Artificial Intelligence Exclusive

Stay Connected

1.2kFollowersLike
33.7kFollowersFollow
222FollowersPin

You Might also Like

hands on text analytics tutorial
AnalyticsExclusiveText Analytics

An Introduction To Hands-On Text Analytics In Python

7 Min Read
big data use digital marketing
Big DataBusiness IntelligenceMarketing

An API Bridge Between Big Data And Digital Marketing

10 Min Read

Big Data, Unstructured Information Analysis is More Than Sentiment.

4 Min Read

Twitter and Text Analysis to Help You Surf Through Traffic

4 Min Read

SmartData Collective is one of the largest & trusted community covering technical content about Big Data, BI, Cloud, Analytics, Artificial Intelligence, IoT & more.

data-driven web design
5 Great Tips for Using Data Analytics for Website UX
Big Data
AI chatbots
AI Chatbots Can Help Retailers Convert Live Broadcast Viewers into Sales!
Chatbots

Quick Link

  • About
  • Contact
  • Privacy
Follow US
© 2008-25 SmartData Collective. All Rights Reserved.
Go to mobile version
Welcome Back!

Sign in to your account

Username or Email Address
Password

Lost your password?