SmartData Collective

Practical Sentiment Analysis and Lies

TomAnderson

Q&A with Prof. Bing Liu ahead of the Sentiment Analysis Symposium and Pre Symposium Tutorial

Tom: Bing, how did you get into text analytics, and sentiment analysis?

Bing: My earlier research interests were in data mining and machine learning. Around 2000, I became interested in Web mining and in machine learning using text data. These two topics led me to text on the Web. Reviews naturally came to mind because they are focused and well organized, which is great for data mining. I also quickly realized that sentiment analysis was a perfect research problem in its own right (I called it opinion mining then due to my data mining background). It had so many applications, as every individual and organization needs opinions for decision making. There was also a whole range of challenging research problems that had not been addressed by the natural language processing or linguistics communities. We started to work on it in 2003 and published our first paper at KDD-2004 (ACM SIGKDD International Conference on Knowledge Discovery and Data Mining). The paper basically defined the framework of feature-based (or aspect-based) sentiment analysis and opinion summarization, which is now widely used in industry and in research.

Tom: False website reviews are an interesting application, and one that I’ve been keeping my eye on. I noticed the New York Times recently covered some of your work in this area. This type of text analytics research seems to be much more difficult than most people think. Can you tell us a bit about this problem from the text analytics perspective, and how it is different from simpler use cases like identifying spam email for instance?

Bing: Indeed, this is a very difficult problem. My group began to work on it around 2006 or 2007, as we realized this was an important problem that would only become more important. When we started, we realized it was really hard. The main difficulty is that it is very hard, if not impossible, to recognize fake reviews manually, because it is fairly easy to craft a fake review and pass it off as a genuine one. Email spam detection is a much easier problem because you immediately recognize a spam email when you see one. This means that spam and non-spam emails have clear differences, and that it is easy to produce labeled training data for machine learning algorithms, both to build predictive models and to evaluate them.

However, if fake reviews are written very carefully, it is hard to recognize them just by reading the review text. In the extreme case, this is logically impossible. For example, one can write a genuine review for a good restaurant and then post the same text as a fake review for a bad restaurant in order to promote the bad restaurant. There is no way to detect this fake review without considering information beyond the review text itself, simply because the same text cannot be judged both truthful and fake at the same time.

Tom: What do you see as some of the applications of this type of research?

Bing: Review hosting sites or any general social media sites all want their reviews and user comments to be trustworthy. They are thus interested in fake review detection algorithms. All text analytics systems that use reviews or any opinion data need to worry about this problem too. Social media is here to stay. Its content is also being used more and more in applications.

Something has to be done to ensure the integrity of this valuable source of information before it becomes full of fake opinions, lies and deceptive information. After all, there are strong motivations for businesses and individuals to post fake reviews for profit and fame. It is also easy and cheap to do so. Writing fake reviews has already become a very cheap way of marketing and product promotion.

Tom: Have you found there are certain approaches that work better than others?

Bing: It is still too early to tell. Researchers currently use both linguistic features and atypical behaviors of reviewers to detect fakes. I feel that algorithms that mine atypical behaviors of reviewers and reviews tend to produce more interpretable and trustworthy results. For example, if all 5-star reviews for a hotel were posted only by people from the surrounding area of the hotel, these reviews are clearly suspicious. This is a simple example. More sophisticated fake reviews need more involved modeling and algorithms to detect them.
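The hotel example can be made concrete with a small sketch. This is only an illustration of a behavior-based check, not the method used by Bing's group: the `Review` record, its `reviewer_city` field, the hotel names, and the `min_reviews` and `threshold` parameters are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Review:
    hotel: str
    rating: int         # 1 to 5 stars
    reviewer_city: str  # hypothetical field: where the reviewer is based


def local_five_star_share(reviews, hotel, hotel_city):
    """Return (share, count): the share of the hotel's 5-star reviews
    written by reviewers from the hotel's own city, and how many
    5-star reviews there are in total."""
    five_star = [r for r in reviews if r.hotel == hotel and r.rating == 5]
    if not five_star:
        return 0.0, 0
    local = sum(1 for r in five_star if r.reviewer_city == hotel_city)
    return local / len(five_star), len(five_star)


def is_suspicious(reviews, hotel, hotel_city, min_reviews=3, threshold=1.0):
    """Flag a hotel when it has enough 5-star reviews and (by default)
    all of them come from local reviewers. Both cutoffs are assumptions."""
    share, count = local_five_star_share(reviews, hotel, hotel_city)
    return count >= min_reviews and share >= threshold


# Made-up data for illustration only.
reviews = [
    Review("Seaside Inn", 5, "Springfield"),
    Review("Seaside Inn", 5, "Springfield"),
    Review("Seaside Inn", 5, "Springfield"),
    Review("Seaside Inn", 2, "Boston"),
    Review("Harbor Hotel", 5, "Chicago"),
    Review("Harbor Hotel", 5, "Springfield"),
    Review("Harbor Hotel", 5, "Denver"),
]

seaside_flag = is_suspicious(reviews, "Seaside Inn", "Springfield")
harbor_flag = is_suspicious(reviews, "Harbor Hotel", "Springfield")
```

Here the made-up "Seaside Inn" is flagged because every one of its 5-star reviews is local, while "Harbor Hotel" is not. A flag like this is easy to explain to a human moderator, which is the interpretability advantage mentioned above, but it is only a signal for further review and not proof of fraud.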

Tom: It’s been my observation and experience that we as an industry are moving away from a linguistic approach to text. Sure, some of the basics are useful, but machine learning and statistical approaches seem more powerful. What are your thoughts on this?

Bing: For most tasks, machine learning and statistical approaches are indeed more effective than purely linguistics-based approaches. Linguistic approaches are mostly based on heuristic rules and patterns (including grammar information). For tasks that can be performed based on words alone, it is very hard for a linguistics-based approach to beat a statistical machine learning algorithm, simply because the signals used by a machine learning algorithm are far more numerous than the rules or patterns that a person can design. Plus, machine learning algorithms optimize performance. That being said, in many tasks, linguistics-based signals and clues are used as features by machine learning algorithms.

Statistical approaches are not without their limits. Going forward, I believe that both linguistic knowledge and statistical modeling are important. We are working on integrating more linguistic knowledge into statistical modeling.

Tom: It seems to me a lot of folks get a little too caught up in differences between languages. My firm for instance has found it rather easy to add other European languages to our approach, and of course machine translation is always a possibility. What are your thoughts on this?

Bing: Yes, I agree. Although every language is different, languages are still similar in that they all consist of words and grammar. European languages have even more similarities due to their common roots. A learning algorithm can capture many types of grammatical regularities from any language if there is a sufficient amount of training data. For tasks that need only word or lexical information, the same algorithm can be used for any language with almost no modification, because the algorithm treats words as symbols. In that sense, it does not matter what language it is.
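The point about words being treated as symbols can be illustrated with a minimal sketch. The code below is a toy multinomial Naive Bayes classifier with add-one smoothing, chosen here only as a simple example of a word-based statistical learner; it is not a system described in the interview, and the tiny English and Spanish training sets are invented for illustration. The same `train` and `predict` functions are used unchanged for both languages.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """examples: list of (text, label) pairs. Tokens are whitespace-separated
    strings; the code never looks inside them, so they are just symbols."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the label with the highest log-probability under
    multinomial Naive Bayes with add-one (Laplace) smoothing."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore tokens never seen in training
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy data; a real system would need far more training examples.
english = [
    ("great food and friendly staff", "pos"),
    ("terrible service and cold food", "neg"),
    ("friendly staff great room", "pos"),
    ("cold room terrible staff", "neg"),
]
spanish = [
    ("comida excelente y personal amable", "pos"),
    ("servicio terrible y comida mala", "neg"),
    ("hotel excelente personal amable", "pos"),
    ("hotel malo personal terrible", "neg"),
]

english_label = predict(train(english), "great friendly room")
spanish_label = predict(train(spanish), "servicio malo")
```

Nothing in the code is specific to English or Spanish; only the training data changes. Languages that do not separate words with spaces would need a different tokenizer, which is one example of the small modifications that can still be required.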

Tom: What will you be covering during the tutorial at the sentiment symposium?

Bing: Sentiment analysis has been studied extensively for the past decade. A huge number of research papers have been published on it (probably more than 1000). It is impossible to cover them all. Therefore, I will try to cover the main threads of research that also contain aspects which can be of immediate use in practice.

In the tutorial, I will start with a short motivation and then go on to define the problem. This will provide an abstraction or statement of the problem, which will naturally introduce the key sub-problems. I will then discuss the current state-of-the-art approaches to solving these problems. Since this is a practical sentiment analysis tutorial, I will also describe how to build a practical sentiment analysis system based on my previous experience in building one. In the final part of the tutorial, I will introduce the problem of fake review detection.

 
