SmartData Collective
© 2008-25 SmartData Collective. All Rights Reserved.

Text Mining and Pronouns

mekkin

If there’s one piece of advice I can offer you, both for better text mining and better writing, it is this: please, please, please with a cherry on top, be clear with your pronouns.

Nothing makes me sadder than a lost, lonely pronoun separated from its antecedent, the noun to which the pronoun refers. The process of determining pronoun ownership, and thereby determining who or what is being spoken about in a particular phrase, is called anaphora resolution, and it’s something we’ve been working on for a long time. If we were to say “Jenny wanted to try something new, so she went to yoga class”, anaphora resolution would identify the pronoun “she” as referring to Jenny.

Text mining engines have varying degrees of success depending on the pronouns involved. Most of these engines use a general model that looks at two qualities to determine pronoun ownership:

1. How far apart are the pronoun and the noun it refers to?

2. Do the pronoun and the noun it refers to look alike?

Distance is a good way to track down the antecedent. Good writers, like good pet owners, keep their pronouns on a short leash. If the pronoun is “she”, for example, the person being referred to will in all likelihood be the last woman introduced by name, and she will appear in the previous sentence, or at least in the same paragraph. For example: “Jenny met a guy in yoga class. She is going on a date tonight.” This is obviously easier to figure out if the woman in question has a traditionally female name like Jenny, which brings us to our second strategy for anaphora resolution.

If a pronoun and a noun “look alike”, they share a quality that allows us to rule out other antecedents. As in the example above, the pronoun “she” is most likely attached to a woman’s name. For example: “Jenny and Dave went on a date, but she faked food poisoning to get out early.” Here, “she” is probably not referring to Dave, because Dave is a traditionally male name.
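As a minimal sketch of how the distance and look-alike checks might work together, the Python below links each gendered pronoun to the nearest preceding name whose gender agrees with it. The tiny name lexicon, the function name `resolve_pronouns`, and the choice to measure distance in tokens are all assumptions made for illustration; this is not how any particular engine is implemented.

```python
import re

# Hypothetical toy lexicons; a real engine would rely on far larger resources.
NAME_GENDER = {"Jenny": "f", "Dave": "m"}
PRONOUN_GENDER = {"she": "f", "her": "f", "he": "m", "him": "m", "his": "m"}

def resolve_pronouns(text):
    """Link each gendered pronoun to the closest earlier name that agrees with it.

    Returns a list of (pronoun, antecedent, distance_in_tokens) tuples.
    """
    tokens = re.findall(r"[A-Za-z]+", text)
    seen = []   # (token index, name) for every known name encountered so far
    links = []
    for i, tok in enumerate(tokens):
        if tok in NAME_GENDER:
            seen.append((i, tok))
            continue
        gender = PRONOUN_GENDER.get(tok.lower())
        if gender is None:
            continue  # not a gendered pronoun (this includes "it"), so skip it
        # Walk backwards through the names so the nearest agreeing one wins.
        match = next(
            ((j, name) for j, name in reversed(seen) if NAME_GENDER[name] == gender),
            None,
        )
        if match is None:
            links.append((tok, None, None))
        else:
            links.append((tok, match[1], i - match[0]))
    return links

print(resolve_pronouns(
    "Jenny and Dave went on a date, but she faked food poisoning to get out early."
))
```

In the date example, Dave is the closer name, but the gender check rules him out, so “she” falls back to Jenny.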

The same principle holds true for nominal pronouns. Nominal pronouns are the kind of reference we employ when we write “the company” in an article about, say, Google. We know which company is being referred to, because it is the topic of discussion, but we aren’t using the proper noun. In this case the look-alike strategy works very well. For example, Lexalytics’ text mining engine Salience knows Google is a company, and so can easily attach it to the nominal pronoun “the company”.
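The look-alike idea for nominal pronouns can be sketched the same way: if we know what type of thing each named entity is, a phrase like “the company” can be attached to the most recent entity of that type. The lookup tables, the function `resolve_nominals`, and the sample sentence below are hypothetical stand-ins, not Salience’s actual data or API.

```python
import re

# Hypothetical lookup tables standing in for an engine's entity knowledge.
ENTITY_TYPES = {"Google": "company", "Jenny": "person"}
NOMINAL_TYPES = {"the company": "company", "the woman": "person"}

def resolve_nominals(text):
    """Attach each nominal phrase to the latest earlier entity of the same type.

    Returns (phrase, antecedent) pairs in the order the phrases appear.
    """
    # Collect every known entity mention along with its character offset.
    entities = sorted(
        (m.start(), name, etype)
        for name, etype in ENTITY_TYPES.items()
        for m in re.finditer(r"\b" + re.escape(name) + r"\b", text)
    )
    links = []
    for phrase, etype in NOMINAL_TYPES.items():
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for m in re.finditer(pattern, text, re.IGNORECASE):
            # Keep only entities that appear earlier and share the nominal's type.
            earlier = [n for pos, n, t in entities if pos < m.start() and t == etype]
            links.append((m.start(), m.group(0), earlier[-1] if earlier else None))
    links.sort(key=lambda link: link[0])
    return [(phrase, antecedent) for _, phrase, antecedent in links]

print(resolve_nominals("Google opened a new office. Jenny heard that the company is hiring."))
```

Jenny is the nearer name in the sample sentence, but she is a person rather than a company, so “the company” is attached to Google.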

Salience also looks for other qualities to ensure the best anaphora resolution possible. For instance, it looks for quotation marks in conjunction with the pronoun “I”, so that it doesn’t confuse the author with someone who is being quoted. For example, “‘I’ll have to find a new yoga class,’ says Jenny”. The “I” is inside quotation marks, so it belongs to Jenny, the speaker. However, when I use “I” outside quotation marks, the pronoun belongs to me, Mekkin, the author.
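Here is a rough sketch of the quotation check, under some simplifying assumptions: quotations use double quotes (straight or curly), and the speaker is named right after “says” or “said”. The function `resolve_first_person` and both patterns are illustrative only and are not taken from Salience.

```python
import re

# Assumed pattern: a double-quoted span followed by "says"/"said" and a capitalized name.
QUOTE = re.compile(r'["\u201c]([^"\u201d]*)["\u201d]\s*,?\s*(?:says|said)\s+([A-Z][a-z]+)')
FIRST_PERSON = re.compile(r"\bI\b")

def resolve_first_person(text, author):
    """Return the owner of each "I" in the text, in order of appearance."""
    # Record where each quotation sits and who is credited with saying it.
    quoted = [(m.start(1), m.end(1), m.group(2)) for m in QUOTE.finditer(text)]
    owners = []
    for m in FIRST_PERSON.finditer(text):
        owner = author  # outside any quotation, "I" belongs to the author
        for start, end, speaker in quoted:
            if start <= m.start() < end:
                owner = speaker  # inside a quotation, "I" belongs to the speaker
                break
        owners.append(owner)
    return owners

text = '"I\'ll have to find a new yoga class," says Jenny. I hope she finds one.'
print(resolve_first_person(text, author="Mekkin"))
```

The first “I” sits inside Jenny’s quotation and the second does not, so the sketch credits them to Jenny and to the author respectively.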

But it’s not all fun and games. Not when it comes to the unfathomable “it”. With its lack of defining characteristics, it can cause text mining mayhem. For that reason, most text mining engines choose to ignore it altogether.

That’s all for today. Please remember to be a responsible writer: spay and neuter your expletives, and keep your pronouns close to home.
