Text Analytics Is Hard (That’s What She Said)

metabrown

You think math is hard? Hah. At least in math, there is a clearly defined right answer, unless you’re talking statistics, and even there we have well-defined methods and accepted ways of doing things.  If you like an analytics challenge, try taking on one of the more complex text analytics tasks – sentiment analysis, perhaps, or propensity modeling.

The subtleties of human language make automated text analysis a mighty tall order. Humans routinely disagree on the interpretation of human language, so what can we expect from computers?

The other day I was hit with a new one – for me, at least. The question was: how would you write a classifier to identify sentences appropriate for the retort, “that’s what she said”? It turns out that identifying “that’s what she said” jokes in the making is rather popular among linguists. Go figure.

For those who are too polite to know this type of humor, let me explain. When speaking in a non-sexual context, we sometimes say things that are not funny, but which would be funny if the same words were uttered in a sexual context. A listener may detect the double meaning, and respond to your words with, “that’s what she said”, thus putting the remark into a sexual context, and creating a joke. Here’s an example:

Man 1: (looking at deli sandwiches) Wow, they’re much bigger than I expected.

Man 2: That’s what she said!

Now, you may not think this has much practical application, but don’t judge too quickly. This task has two important elements – identification of relevant terminology, and determining whether the statement is appropriate for a particular use. In this case, the terms of interest are words or phrases that can have both non-sexual and sexual implications, and the use that interests us is humor. That’s a lot like tasks with obvious commercial applications, where the relevant terminology might refer to a product and the use that interests us is intent to purchase.

Considering the “that’s what she said” joke, we understand that our friends will not tolerate many of these quips in a day. A particularly good one may yield laughs, a social positive, but a bad one does nothing for the joker’s reputation, and frequent bad ones quickly become a social negative. The human risk reduction strategy is to use this form of humor infrequently, and only when the opening seems most likely to result in laughter.
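
To make that risk reduction strategy concrete, here is a minimal sketch that treats the decision as an expected-value calculation. The reward and cost figures are made-up assumptions for illustration (a flop is assumed to hurt three times as much as a laugh helps), and the function names are hypothetical, not anything taken from the research discussed below.

```python
def should_joke(p_funny: float, reward: float = 1.0, cost: float = 3.0) -> bool:
    """Return True when the expected social payoff of the quip is positive.

    p_funny: estimated probability that the joke lands (0.0 to 1.0).
    reward:  assumed social gain from a joke that lands (hypothetical units).
    cost:    assumed social loss from a joke that flops (hypothetical units).
    """
    expected_payoff = p_funny * reward - (1.0 - p_funny) * cost
    return expected_payoff > 0


def min_confidence(reward: float = 1.0, cost: float = 3.0) -> float:
    """Break-even probability: the joke pays off only above this confidence."""
    return cost / (reward + cost)


if __name__ == "__main__":
    print(min_confidence())   # break-even confidence for the assumed reward and cost
    print(should_joke(0.9))   # a strong opening
    print(should_joke(0.5))   # a coin-flip opening
```

The costlier a failed joke is relative to a successful one, the higher the confidence needed before speaking up, which is why a sensible joker (human or machine) stays quiet most of the time.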

Could a machine identify these joke opportunities and prioritize them similarly?

In their paper, “That’s What She Said: Double Entendre Identification”, Chloe Kiddon and Yuriy Brun of the University of Washington demonstrate that automated identification of jokes in the making is possible. They explain:

A “that’s what she said” (TWSS) joke is a type of double entendre. A double entendre, or adianoeta, is an expression that can be understood in two different ways: an innocuous, straightforward way, given the context, and a risque way that indirectly alludes to a different, indecent context.

Until recently, the literature in natural language processing had not taken on the identification of the double entendre. Kiddon and Brun approach this in a practical way, explaining the social costs of a failed joke, and weighing these against the rewards of success. They observe that statements are more likely to be funny “that’s what she said” jokes when they include “nouns that are euphemisms for sexually explicit nouns” and “share common structure with sentences in the erotic domain”.

Nouns like “banana” are likely to be funny, and so are structures like “[subject] could eat [object] all day”. Therefore, this is likely to be funny:

Man 1: I could eat bananas all day.

Man 2: That’s what she said!

They go on to develop a technique that identifies good “that’s what she said” candidates. Their paper explains the alternatives evaluated while developing the method they call “Double Entendre via Noun Transfer (DEviaNT).”
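
The two signals quoted above can be illustrated with a toy scorer. This is a rough sketch, not the DEviaNT method: the word list and the sentence patterns are hypothetical stand-ins that I made up for illustration, and a real system would derive such features from corpora rather than hard-code them.

```python
import re

# Hypothetical, hand-picked lexicon standing in for "nouns that are
# euphemisms for sexually explicit nouns". Not taken from the paper.
EUPHEMISM_NOUNS = {"banana", "bananas"}

# Hypothetical patterns standing in for "common structure with sentences
# in the erotic domain". Also not taken from the paper.
STRUCTURE_PATTERNS = [
    re.compile(r"\bcould \w+ .+ all day\b"),
    re.compile(r"\b(bigger|harder|longer) than i expected\b"),
]


def twss_score(sentence: str) -> int:
    """Count how many of the two signals (noun, structure) the sentence shows."""
    text = sentence.lower()
    tokens = re.findall(r"[a-z']+", text)
    noun_hit = any(token in EUPHEMISM_NOUNS for token in tokens)
    structure_hit = any(pattern.search(text) for pattern in STRUCTURE_PATTERNS)
    return int(noun_hit) + int(structure_hit)


def is_candidate(sentence: str, threshold: int = 2) -> bool:
    """Flag a sentence only when it meets the (deliberately strict) threshold."""
    return twss_score(sentence) >= threshold


if __name__ == "__main__":
    for line in [
        "I could eat bananas all day.",
        "Wow, they're much bigger than I expected.",
        "The meeting starts at noon.",
    ]:
        print(twss_score(line), is_candidate(line), line)
```

Requiring both signals by default is a deliberate choice in this sketch: it will miss some good openings, such as the sandwich remark, but it keeps the number of bad jokes low.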

Why does this matter to you? It is a model for taking on a difficult analytics challenge and developing an effective solution. The researchers maximize their potential for success by beginning with a valuable process that most business users of text analytics ignore. They…

1) define the application (how the results will be used),

2) assess the value and risks of using the information, and

3) narrow the task to something that is reasonably attainable.

Do you do all that stuff?

Most of the text analytics industry is pushing product as a means of obtaining insight. Just what is the value of insight? Many, perhaps most, businesses that invest in text analytics do so without an explicit plan for using the information or obtaining measurable returns. And many, many organizations reject the option of using text analytics altogether, on the grounds that it isn’t perfect. Is it any wonder that we hear of few case studies showing clear ROI for text analytics?

Before your next foray into text analytics, determine how you will use the results, assess the value and risks of using those results, and define a reasonable scope for your project. It’s a roadmap to maximizing your odds of improving the bottom line with text analytics, and being able to prove it.
