The future and trends of Text Analytics

May 19, 2010

I recently attended a GATE seminar at the University of Sheffield. Having used GATE for quite some time now, I was happy to see that the GATE team remains committed to developing the GATE Text Analysis Workbench and keeps adding functionality.
Although many of the participants were PhD students, I was also happy to see people from companies that now wish to leverage the hidden knowledge in unstructured text. Whether the task was analyzing patent text, intelligent search over photo captions for a large news agency, or understanding what a customer wants, Text Analytics is becoming an important tool for making better decisions.
I also had the opportunity to speak with several people about the future of Text Analytics. What are we likely to see in the coming years in Information Extraction and Text Analytics?
First we have to understand how Text Analytics delivers results. For a computer to ‘understand’ unstructured text, it has to be ‘taught’ that ‘Dollar’ is the currency of a country called ‘US’, and also that US, United States, USA and U.S.A are the same concept. This means that hundreds of thousands of concepts and synonyms have to be specified so that a computer can identify them in unstructured text. This process is called Text Annotation.
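To make the idea concrete, here is a minimal sketch of dictionary-based annotation in Python. It is not GATE's API; the `GAZETTEER` contents and the `annotate` function are hypothetical names invented for illustration, and a real system would hold far more entries and handle case, punctuation and ambiguity.

```python
import re

# Hypothetical mini-dictionary: surface forms mapped to one canonical concept.
GAZETTEER = {
    "US": "United States",
    "USA": "United States",
    "U.S.A": "United States",
    "United States": "United States",
    "Dollar": "US Dollar",
}

def annotate(text, gazetteer=GAZETTEER):
    """Return (start, end, surface form, concept) for every dictionary hit."""
    # Try longer surface forms first so "USA" is preferred over "US".
    forms = sorted(gazetteer, key=len, reverse=True)
    alternatives = "|".join(re.escape(form) for form in forms)
    # The lookarounds stop matches inside longer words (no "US" in "USB").
    pattern = re.compile(r"(?<!\w)(" + alternatives + r")(?!\w)")
    return [
        (m.start(), m.end(), m.group(1), gazetteer[m.group(1)])
        for m in pattern.finditer(text)
    ]

if __name__ == "__main__":
    sample = "The Dollar is the currency of the USA, also written U.S.A or United States."
    for start, end, surface, concept in annotate(sample):
        print(f"{start:>3}-{end:<3} {surface!r} -> {concept}")
```

Three different spellings end up tagged with the same concept, which is exactly the synonym problem described above. The matching here is case-sensitive by choice, to keep the sketch short.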
The gold standard of Text Annotation is annotation done by humans: a computer sifts through the text of a web page and annotates it with concepts, and these annotations are then checked against annotations made by humans on the same text, to assess how accurately the computer ‘understands’ the text and the concepts and entities in it.
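The comparison against human annotations can be sketched as follows. This assumes each annotation is a `(start, end, label)` tuple and that only exact matches count as correct; both are simplifying assumptions, and the `score` function and the sample data are made up for illustration.

```python
def score(predicted, gold):
    """Strict-match precision, recall and F1 over (start, end, label) tuples."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Invented example: three human annotations, four machine annotations.
gold = {(0, 5, "Company"), (10, 20, "Person"), (30, 32, "Country")}
predicted = {
    (0, 5, "Company"),    # correct
    (10, 20, "Company"),  # right span, wrong label
    (30, 32, "Country"),  # correct
    (40, 45, "Person"),   # not in the gold standard at all
}

p, r, f = score(predicted, gold)

if __name__ == "__main__":
    print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Precision answers "how much of what the computer annotated was right", recall answers "how much of what the humans annotated did the computer find". Real evaluations often also give partial credit for overlapping spans, which this sketch leaves out.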
So what does the future hold? First of all, as more unstructured text becomes available, there will be a greater need for ‘annotation farms’: groups of people who manually annotate free text, identifying an ever-growing number of companies, managers, politicians' names, or anything else that has to be ‘taught’ to a computer. Annotation farms already exist, but the need for this service will grow.
The second trend in Text Analytics could be something equivalent to what we have seen with the Netflix Prize. Suppose that you own a company that produces Brand ‘X’ and you wish to track the reputation of your product online. You would submit a sample of your product’s mentions to various text analysis companies and have them compete against each other on, for example, Precision and Recall. The one that consistently produces the best metrics (whether Precision and Recall, the Kappa statistic or F-Measure) gets the job.
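As a rough sketch of how such a competition could be scored, the snippet below ranks two vendors by Cohen's kappa, one common form of the Kappa statistic, against human sentiment labels on the same mentions. The vendor names, the labels and the `cohen_kappa` helper are all invented for illustration.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two label sequences, corrected for chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability that both pick the same label independently.
    expected = sum(
        counts_a[label] * counts_b[label] for label in counts_a.keys() | counts_b.keys()
    ) / (n * n)
    if expected == 1:
        # Both sides used a single identical label; treated as perfect agreement
        # here by convention, since kappa is undefined in this case.
        return 1.0
    return (observed - expected) / (1 - expected)

# Invented data: human labels for six mentions of Brand X, and two vendors' output.
human = ["pos", "neg", "neg", "pos", "neu", "neg"]
submissions = {
    "vendor_a": ["pos", "neg", "neg", "pos", "neu", "pos"],
    "vendor_b": ["pos", "pos", "neg", "neg", "neu", "pos"],
}

scores = {name: cohen_kappa(human, labels) for name, labels in submissions.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

if __name__ == "__main__":
    for name in ranking:
        print(f"{name}: kappa={scores[name]:.3f}")
```

Kappa is useful here because raw agreement can look good by chance when one label dominates. In practice the sample would need to be much larger, and scored repeatedly, before "consistently best" means anything.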
A third trend could be the development of text analytics for specific concepts: Sentiment Analysis and Named Entity Recognition are hard work if one wants to produce sound and accurate results. So Text Analytics experts may well choose a specific concept, for example the reputation of banks, and focus on analyzing that one narrow concept in order to achieve better metrics.
