There’s always a food angle, even in text analytics

November 11, 2008

Text analytics was one of those things I heard about every so often. Like so many terms in this business, the term comes out of a speaker’s mouth or PR person’s press release only to blow away. There’s no story, no context, nothing to chew on.

Then came a press release at BI This Week with a rare combination: surprise and concreteness. It said text analytics would help with food safety. I’m all for food, but I had no idea what text analytics had to do with it.

I emailed UK-based Linguamatics, publisher of the nifty tool they call I2E. What’s this I hear about food? Product manager Phil Hastings, ready to call it a day in Croatia, called to explain the features while I was barely post-breakfast and not fully verbal. I2E was indeed a powerful little thing, but I still didn’t get the food angle.

It wasn’t until I got William Hayes on the phone that things started making sense. He’s director of library and literature informatics at pharmaceutical research company Biogen Idec. They don’t do food, but close enough.

If you think the Sunday New York Times is enough for one day, consider what the research community has to bear. Hayes says, “If you’ve got 20 million articles to read, where do you start?”

“The research industry works under a tougher knowledge model than terrorist intelligence gathering,” says Hayes. “Our ability to tap that ocean of literature is like dropping a line into the ocean for fish.”

In general, a scientist can read 150 to 200 full-text journal articles a year, he explains. A curator can review about 100 abstracts a day “for a few days before you start going nuts.” Text mining is the only way to keep up with the ocean of literature produced each year.

The food industry fries potatoes, but it also has to keep a lookout on research.

TNO information analyst Fred van de Brug told me the acrylamide story: Most people in the food industry missed the first warning. Scientists had published a discovery in 2000 about a carcinogen known as acrylamide, which can develop in starch-rich foods like potatoes as they are fried. By the time the danger finally hit the public media in 2002, millions of people had been exposed unnecessarily. Text mining would have helped.

I2E is more agile than standard text mining. You can learn to use it in a few hours. Hayes told me, “If you can remember bits of grammar and have some concept of what you’re researching, it’s a piece of cake.”

It’s a story in progress for BI This Week.
