Be a Text Analytics Heretic
Promises, promises! Text analytics literature is full of them. Gain valuable insights! Know your customer! Harness the power of Big Data! And so on, and so on. And hey, who doesn’t want valuable insights? The problem is that knowing something about your customer isn’t the same as having the ability to turn that information into cold, hard cash.
If cold, hard cash is what you’re after, stop messing around looking for the charm and beauty in your text. Don’t look for perfection. Look for actionable information that you can use to address a specific business problem tied to a measurable revenue or cost-savings opportunity. Period.
You want to make money? Be a text analytics heretic. Follow these three principles of text analytics heresy:
- Beware of insights
- Think small
- Don’t get sentimental
Beware of insights
The promise of “insight” is so alluring. You read those brochures, and you start thinking, “I will look deep into my data, and she will reveal her innermost secrets to me, and me alone.” Oh, the sex appeal of it.
It’s tempting to believe you can approach data without a plan and extract pearls of wisdom, but that’s unrealistic. What’s the alternative? Before beginning a text analytics project, make the effort to select and quantify a specific business issue to address, and determine what information you require to address it.
Think small
It seems that every analytics tool is a Big Data solution these days. Much criticism focuses on the limitations of products in handling truly massive datasets. That criticism may be well founded, but it misses the point. The mere fact that massive quantities of data are at hand doesn’t imply that there is something to be gained from using all of it to address any particular question, let alone that it is cost-effective to do so.
Vendors push Big Data solutions for several reasons. It’s a hot concept right now, so it’s true that many prospects are talking Big. No vendor wants to disappoint you. And no vendor wants to appear less capable than the competition. Not to mention that Big Data calls for big resources, and justifies a big price tag.
Those who are put off by high-priced tools often respond by creating or using equivalents that are lower-priced or free, and feel very superior about it. You read about them and their Big Data war stories in the tech press all the time. Of course, those stories leave out some little details. Like the fact that those “low-cost” alternatives depended on the availability of a lot of highly trained volunteer (or otherwise underpaid) talent. Will those people work for free for you?
Use only as much data as is justified to address a particular business need. If you’re buying a Big Data solution to make a pie chart, you’re a fool, but you’re not alone. Stop being silly! You want to build a predictive model? Great. In most cases, you don’t need to inhale every single bit of your data into an analytics tool to do that. Use a sample, for heaven’s sake. If you know nothing about sampling, pick up a book, or invest in a class, and learn. You can save a fortune in resources, and get results faster, that way.
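For readers who want to see how little it takes, here is a minimal sketch of drawing a simple random sample from a large pile of text records without loading them all at once. It uses reservoir sampling, which is one common technique among several and not a recommendation from any particular vendor. The function name, the sample size of 500, and the made-up comment records are all assumptions for illustration.

```python
import random
from typing import Iterable, List, Optional


def reservoir_sample(records: Iterable[str], k: int,
                     seed: Optional[int] = None) -> List[str]:
    """Return up to k records chosen uniformly at random from a stream.

    Hypothetical helper: only k records are held in memory at a time,
    so the full dataset never has to be loaded.
    """
    rng = random.Random(seed)
    sample: List[str] = []
    for i, record in enumerate(records):
        if i < k:
            # Fill the reservoir with the first k records.
            sample.append(record)
        else:
            # Keep the new record with probability k / (i + 1).
            j = rng.randint(0, i)  # both endpoints inclusive
            if j < k:
                sample[j] = record
    return sample


# Made-up stand-in for a large stream of customer comments.
comments = (f"customer comment {n}" for n in range(100_000))
sample = reservoir_sample(comments, k=500, seed=42)
print(len(sample))
```

Whether 500 records is enough depends on the question being asked and the precision required; that is the part a sampling book or class will teach you.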
Don’t get sentimental
Everybody wants sentiment analysis. At a certain level, that’s smart. Knowing how many people are mentioning your product (or any other topic) doesn’t mean much if you don’t also know something about what they are saying.
But assessing sentiment in text is a tricky business. Humans don’t agree with one another consistently when assessing the sentiment of text. In fact, even a single person asked to assess the sentiment expressed in a particular bit of text on several occasions will often give different answers. It’s hard even to make a presentation on the topic, because the audience invariably gets caught up in picking over the individual cases and debating whether the assessments are acceptable. Where’s the actionable insight in that?
Instead of sentiment categories, look for something better defined and more actionable in your data. Take the example of PayPal’s Han-Sheong Lai, who uses text analytics to identify customers with intent to close their accounts. Does he look for broad categories of positive and negative sentiment? No. He looks for people saying things like, “I’m going to close my account.” You can bet that makes it a lot easier to accurately assess risk, and quantify results.
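To make the contrast with sentiment scoring concrete, here is a rough sketch of flagging messages that state an intent to close an account. This is not PayPal's actual method, which the example above does not describe; the pattern, the function name, and the sample messages are all hypothetical, and a real project would tune the phrases against its own data.

```python
import re

# Hypothetical pattern: a closing or cancelling verb followed by a
# determiner and "account". Extend it with phrases found in your own text.
CLOSE_INTENT = re.compile(
    r"\b(?:close|closing|cancel|cancelling|canceling)\s+"
    r"(?:my|our|this|the)\s+account\b",
    re.IGNORECASE,
)


def mentions_closing_account(text: str) -> bool:
    """Return True if the text contains a close-account phrase."""
    return CLOSE_INTENT.search(text) is not None


# Made-up messages for illustration only.
messages = [
    "I'm going to close my account.",
    "Love the new app, thanks!",
    "If this isn't fixed I am cancelling my account",
]
flagged = [m for m in messages if mentions_closing_account(m)]
print(len(flagged), "of", len(messages), "messages flagged")
```

A count of flagged messages is easy to check by hand and easy to tie to a dollar figure, which is the point: no debate about whether a comment is "somewhat negative."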
You want to be cool? Fine. Do whatever they talk about in the tech press. But if you want to make money with text analytics, be a heretic.
Meta Brown is author of "Data Mining for Dummies" (forthcoming from John Wiley and Sons). She has introduced and expanded the use of analytics in offices and factories across the US and beyond. Got a question about promoting analytics? Or on using analytics? Just want to say hello? Email Meta at [email protected], tweet her @metabrown312 or visit http://www.metabrown.com