Text Analytics Is Hard (That’s What She Said)

metabrown

You think math is hard? Hah. At least in math there is a clearly defined right answer. Statistics may be the exception, and even there we have well-defined methods and accepted ways of doing things. If you like an analytics challenge, try taking on one of the more complex text analytics tasks: sentiment analysis, perhaps, or propensity modeling.

The subtleties of human language make automated text analysis a mighty tall order. Humans routinely disagree on the interpretation of human language, so what can we expect from computers?

The other day I was hit with a new one – for me, at least. The question was: how would you write a classifier to identify sentences appropriate for the retort “that’s what she said”? It turns out that identification of “that’s what she said” jokes in the making is rather popular among linguists. Go figure.

For those who are too polite to know this type of humor, let me explain. When speaking in a non-sexual context, we sometimes say things that are not funny, but which would be funny if the same words were uttered in a sexual context. A listener may detect the double meaning, and respond to your words with, “that’s what she said”, thus putting the remark into a sexual context, and creating a joke. Here’s an example:

Man 1: (looking at deli sandwiches) Wow, they’re much bigger than I expected.

Man 2: That’s what she said!

Now, you may not think this has much practical application, but don’t judge too quickly. This task has two important elements – identification of relevant terminology, and determining whether the statement is appropriate for a particular use. In this case, the terms of interest are words or phrases that can have both non-sexual and sexual implications, and the use that interests us is humor. That’s a lot like tasks with obvious commercial applications, where the relevant terminology might refer to a product and the use that interests us is intent to purchase.

Considering the “that’s what she said” joke, we understand that our friends will not tolerate many of these quips in a day. A particularly good one may yield laughs, which is a social positive. A bad one does nothing for the joker’s reputation, and frequent bad ones quickly become a social negative. The human risk reduction strategy is to use this form of humor infrequently, and only when the opening seems most likely to result in laughter.

Could a machine identify these joke opportunities and prioritize them similarly?
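One way to make the “only when the opening seems most likely to result in laughter” rule concrete is an expected-payoff threshold. The sketch below is illustrative and is not taken from any paper: the function names and the reward and cost values are assumptions chosen for the example. It tells the joke only when the estimated probability of a laugh is high enough that the expected reward outweighs the expected cost of a flop.

```python
def joke_threshold(reward: float, cost: float) -> float:
    """Return the break-even probability of a laugh.

    Telling the joke pays off in expectation when
    p * reward - (1 - p) * cost > 0, i.e. p > cost / (reward + cost).
    """
    if reward <= 0 or cost < 0:
        raise ValueError("reward must be positive and cost non-negative")
    return cost / (reward + cost)


def should_joke(p_funny: float, reward: float = 1.0, cost: float = 3.0) -> bool:
    """Decide whether to tell the joke given an estimated chance of a laugh.

    The defaults (a flop costs three times what a laugh earns) are
    hypothetical values, not measured ones.
    """
    expected_payoff = p_funny * reward - (1.0 - p_funny) * cost
    return expected_payoff > 0


if __name__ == "__main__":
    print(joke_threshold(reward=1.0, cost=3.0))
    print(should_joke(0.8), should_joke(0.7))
```

The higher the cost of a bad joke relative to the reward of a good one, the higher the threshold climbs, which matches the human habit of joking rarely and only on the best openings.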

In their paper, “That’s What She Said: Double Entendre Identification”, Chloe Kiddon and Yuriy Brun of the University of Washington demonstrate that automated identification of jokes in the making is possible. They explain:

A “that’s what she said” (TWSS) joke is a type of double entendre. A double entendre, or adianoeta, is an expression that can be understood in two different ways: an innocuous, straightforward way, given the context, and a risque way that indirectly alludes to a different, indecent context.

Until recently, the literature in natural language processing had not taken on the identification of the double entendre. Kiddon and Brun approach this in a practical way, explaining the social costs of a failed joke, and weighing these against the rewards of success. They observe that statements are more likely to be funny “that’s what she said” jokes when they include “nouns that are euphemisms for sexually explicit nouns” and “share common structure with sentences in the erotic domain”.

Nouns like “banana” are likely to be funny, and so are structures like “[subject] could eat [object] all day”. Therefore, this is likely to be funny:

Man 1: I could eat bananas all day.

Man 2: That’s what she said!

They go on to develop a technique that identifies good “that’s what she said” candidates. Their paper explains the alternatives evaluated while developing the method they call “Double Entendre via Noun Transfer (DEviaNT).”
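To give a feel for the two signals described above, here is a toy sketch. It is not DEviaNT and does not reproduce the authors' method: the word list, the patterns, and the function names are all hypothetical, hand-picked from the examples in this article, whereas a real system would learn such features from data. The sketch checks a sentence for a euphemism-style noun and for a suggestive sentence structure, and by default flags it only when both are present, favoring few, confident jokes over many weak ones.

```python
import re

# Hypothetical, hand-picked examples for illustration only.
EUPHEMISM_NOUNS = {"banana", "bananas"}

# Hypothetical structure patterns based on the examples in this article.
STRUCTURE_PATTERNS = [
    re.compile(r"\bcould eat\b.+\ball day\b"),
    re.compile(r"\bbigger than i expected\b"),
]


def twss_features(sentence: str) -> dict:
    """Extract two boolean features from a sentence."""
    text = sentence.lower()
    tokens = re.findall(r"[a-z']+", text)
    return {
        "euphemism_noun": any(tok in EUPHEMISM_NOUNS for tok in tokens),
        "suggestive_structure": any(p.search(text) for p in STRUCTURE_PATTERNS),
    }


def is_twss_candidate(sentence: str, require_both: bool = True) -> bool:
    """Flag a sentence as a joke opening.

    With require_both=True the rule is strict (fewer, safer jokes);
    with require_both=False either signal alone is enough.
    """
    features = twss_features(sentence)
    if require_both:
        return all(features.values())
    return any(features.values())


if __name__ == "__main__":
    for s in [
        "I could eat bananas all day.",
        "Wow, they're much bigger than I expected.",
        "The meeting starts at noon.",
    ]:
        print(s, "->", is_twss_candidate(s), is_twss_candidate(s, require_both=False))
```

Under the strict setting only the banana sentence is flagged. The deli sandwich remark has a suggestive structure but no listed noun, so it is caught only when the rule is relaxed, which shows how the choice of strictness trades missed jokes against bad ones.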

Why does this matter to you? It is a model for taking on a difficult analytics challenge and developing an effective solution. The researchers maximize their potential for success by beginning with a valuable process that most business users of text analytics ignore. They…

1) define the application (how the results will be used),

2) assess the value and risks of using the information, and

3) narrow the task to something that is reasonably attainable.

Do you do all that stuff?

Most of the text analytics industry is pushing product as a means to obtaining insight. Just what is the value of insight? Many, perhaps most, businesses that invest in text analytics do so without an explicit plan for using the information or obtaining measurable returns. And many, many organizations reject the option of using text analytics altogether, on the grounds that it isn’t perfect. Is it any wonder that we hear of few case studies showing clear ROI for text analytics?

Before your next foray into text analytics, determine how you will use the results, assess the value and risks of using those results, and define a reasonable scope for your project. It’s a roadmap to maximizing your odds of improving the bottom line with text analytics, and being able to prove it.
