Text Analytics Is Hard (That’s What She Said)

metabrown

You think math is hard? Hah. At least in math, there is a clearly defined right answer, unless you’re talking statistics, and even there we have well-defined methods and accepted ways of doing things. If you like an analytics challenge, try taking on one of the more complex text analytics tasks – sentiment analysis, perhaps, or propensity modeling.

The subtleties of human language make automated text analysis a mighty tall order. Humans routinely disagree on the interpretation of human language, so what can we expect from computers?

The other day I was hit with a new one – for me, at least. The question was: how would you write a classifier to identify sentences appropriate for the retort, “that’s what she said”? It turns out that identification of “that’s what she said” jokes in the making is rather popular among linguists. Go figure.

For those who are too polite to know this type of humor, let me explain. When speaking in a non-sexual context, we sometimes say things that are not funny, but which would be funny if the same words were uttered in a sexual context. A listener may detect the double meaning, and respond to your words with, “that’s what she said”, thus putting the remark into a sexual context, and creating a joke. Here’s an example:

Man 1: (looking at deli sandwiches) Wow, they’re much bigger than I expected.

Man 2: That’s what she said!

Now, you may not think this has much practical application, but don’t judge too quickly. This task has two important elements – identification of relevant terminology, and determining whether the statement is appropriate for a particular use. In this case, the terms of interest are words or phrases that can have both non-sexual and sexual implications, and the use that interests us is humor. That’s a lot like tasks with obvious commercial applications, where the relevant terminology might refer to a product and the use that interests us is intent to purchase.

Considering the “that’s what she said” joke, we understand that our friends will not tolerate many of these quips in a day. A particularly good one may yield laughs, a social positive, but a bad one does nothing for the joker’s reputation, and frequent bad ones quickly become a social negative. The human risk reduction strategy is to use this form of humor infrequently, and only when the opening seems most likely to result in laughter.

Could a machine identify these joke opportunities and prioritize them similarly?

In their paper, “That’s What She Said: Double Entendre Identification”, Chloe Kiddon and Yuriy Brun of the University of Washington demonstrate that automated identification of jokes in the making is possible. They explain:

A “that’s what she said” (TWSS) joke is a type of double entendre. A double entendre, or adianoeta, is an expression that can be understood in two different ways: an innocuous, straightforward way, given the context, and a risque way that indirectly alludes to a different, indecent context.

Until recently, the literature in natural language processing had not taken on the identification of the double entendre. Kiddon and Brun approach this in a practical way, explaining the social costs of a failed joke, and weighing these against the rewards of success. They observe that statements are more likely to be funny “that’s what she said” jokes when they include “nouns that are euphemisms for sexually explicit nouns” and “share common structure with sentences in the erotic domain”.

Nouns like “banana” are likely to be funny, and so are structures like “[subject] could eat [object] all day”. Therefore, this is likely to be funny:

Man 1: I could eat bananas all day.

Man 2: That’s what she said!
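
To make those two cues concrete, here is a toy sketch in Python. It is not the method from the paper: the word list, the patterns, and the function name `twss_score` are all invented for illustration, and a real system would learn such cues from data rather than hard-code them.

```python
import re

# Hypothetical word list and patterns, invented for illustration only.
# They are NOT the features or data used by Kiddon and Brun.
EUPHEMISM_NOUNS = {"banana", "bananas", "sandwich", "sandwiches"}
SUGGESTIVE_PATTERNS = [
    re.compile(r"\bcould eat \w+ all day\b"),
    re.compile(r"\bbigger than i expected\b"),
]


def twss_score(sentence: str) -> int:
    """Count how many of the two kinds of cue appear in the sentence (0 to 2)."""
    text = sentence.lower()
    words = set(re.findall(r"[a-z']+", text))
    # Cue 1: the sentence contains a noun from the (hypothetical) euphemism list.
    has_noun = bool(words & EUPHEMISM_NOUNS)
    # Cue 2: the sentence matches a (hypothetical) suggestive structure.
    has_structure = any(p.search(text) for p in SUGGESTIVE_PATTERNS)
    return int(has_noun) + int(has_structure)


if __name__ == "__main__":
    for s in ["I could eat bananas all day.", "The meeting starts at noon."]:
        print(s, "->", twss_score(s))
```

A sentence that shows both cues scores higher than one that shows a single cue or none, which is the intuition behind treating the banana sentence as a strong candidate.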

They go on to develop a technique that identifies good “that’s what she said” candidates. Their paper explains the alternatives evaluated while developing the method they call “Double Entendre via Noun Transfer (DEviaNT).”
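
The weighing of social cost against reward can also be sketched as a simple expected-value rule. This is a hedged illustration, not a calculation from the paper: the function names, the reward of 1.0, and the cost of 3.0 are assumptions chosen only to show why a costly failure pushes you to joke only when you are quite sure.

```python
def should_joke(p_funny: float, reward: float = 1.0, cost: float = 3.0) -> bool:
    """Return True when the expected social payoff of joking is positive.

    p_funny is an estimated probability that the joke lands. The default
    reward and cost are hypothetical values, not figures from the paper.
    """
    expected = p_funny * reward - (1.0 - p_funny) * cost
    return expected > 0


def min_probability(reward: float = 1.0, cost: float = 3.0) -> float:
    """Break-even probability: below this, the expected payoff is negative."""
    return cost / (reward + cost)


if __name__ == "__main__":
    print(min_probability())   # break-even point for the default values
    print(should_joke(0.9))    # confident candidate
    print(should_joke(0.5))    # coin-flip candidate
```

With these assumed values, a failed joke costs three times what a good one earns, so the joke is worth making only when the chance of a laugh exceeds 75 percent. The same arithmetic applies to a commercial case: swap in the value of a correct purchase-intent flag and the cost of a wrong one.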

Why does this matter to you? It is a model for taking on a difficult analytics challenge and developing an effective solution. The researchers maximize their potential for success by beginning with a valuable process that most business users of text analytics ignore. They…

1) define the application (how the results will be used),

2) assess the value and risks of using the information, and

3) narrow the task to something that is reasonably attainable.

Do you do all that stuff?

Most of the text analytics industry is pushing product as a means to obtaining insight. Just what is the value of insight? Many, perhaps most, businesses that invest in text analytics do so without an explicit plan for using the information or obtaining measurable returns. And many, many organizations reject the option of using text analytics altogether, on the grounds that it isn’t perfect. Is it any wonder that we hear of few case studies showing clear ROI for text analytics?

Before your next foray into text analytics, determine how you will use the results, assess the value and risks of using those results, and define a reasonable scope for your project. It’s a roadmap to maximizing your odds of improving the bottom line with text analytics, and being able to prove it.
