SmartData Collective
© 2008-25 SmartData Collective. All Rights Reserved.
Clustering the thoughts of Twitter Users

ThemosKalafatis

In the last two posts I presented the reasons for, and some of the problems with, analyzing the thoughts of users on the web, particularly on Twitter. (For more, see Part 1 and Part 2.)

As an example, we are going to look at a specific kind of thought that Twitter users express: what they don’t want. So let us start. Using the Twitter API, I extracted all tweets containing the phrase “i don’t want to”. The following text file shows the results:

The next step is to remove all phrases that do not give us any information about what users do not want:


Finally, we remove the phrase “i don’t want to” itself. However, consider the following example:

“I must go to Chicago. I don’t want to do that”

The steps discussed above will discard the first sentence, which is actually what the user does not want to do, and leave only the phrase “i don’t want to do that”, which is not particularly informative. At this point we must quantify the problem (let’s assume it affects 8.5% of our records) and recall what the Pareto principle is all about: whether handling this minority of cases is worth the extra effort.
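
The phrase-removal step and the Chicago problem can be illustrated with a minimal Python sketch. It is not the code used for this analysis: the regular expression, the `VAGUE` set and the helper names are assumptions made for illustration only.

```python
import re

# The trigger phrase from the post; straight and curly apostrophes both match.
PHRASE = re.compile(r"\bi don['’]t want to\b\s*", re.IGNORECASE)

# Hypothetical list of remainders that carry no information on their own.
VAGUE = {"", "do that", "do it", "do this"}

def extract_unwanted(tweet):
    """Return the text that follows the trigger phrase, or None if absent."""
    match = PHRASE.search(tweet)
    if match is None:
        return None
    rest = tweet[match.end():]
    # Keep only the rest of the current sentence, then normalise it.
    rest = re.split(r"[.!?]", rest, maxsplit=1)[0]
    return rest.strip().lower()

def is_vague(remainder):
    """True when the remainder says nothing about what is unwanted."""
    return remainder in VAGUE

tweets = [
    "I don't want to go to work today",
    "I must go to Chicago. I don't want to do that",
]
remainders = [extract_unwanted(t) for t in tweets]

# Share of records affected by the problem (the post assumes 8.5% on real data).
vague_share = sum(is_vague(r) for r in remainders) / len(remainders)
```

On real data, a share computed this way is the number to weigh against the cost of recovering the preceding sentence.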

After some additional pre-processing steps, which are not discussed here, I feed the data to K-Means to see which clusters the algorithm comes up with. For a better presentation of the results, here is a screen capture from IBM’s UI Modeler:


We immediately see, in descending order, what Twitter users do not want:

1) They don’t want to go to work
2) They don’t want to go to school
3) They don’t want to hear about various issues
4) They don’t want to stay home
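
For readers who want to see what the clustering step does, here is a small pure-Python sketch of K-Means on bag-of-words vectors. It is a stand-in, not the tool or the settings used for the results above: the word-count representation, the fixed starting centroids and the four sample records are all assumptions.

```python
from collections import Counter

def vectorize(texts):
    """Turn each text into a word-count vector over a shared vocabulary."""
    vocab = sorted({w for t in texts for w in t.split()})
    vectors = []
    for t in texts:
        counts = Counter(t.split())
        vectors.append([float(counts[w]) for w in vocab])
    return vectors

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, initial_centroids, max_iter=20):
    """Plain Lloyd's algorithm; returns one cluster label per vector."""
    centroids = [list(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each record goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda j: sq_dist(v, centroids[j]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical remainders left after removing "i don't want to".
texts = [
    "go to work",
    "go to work tomorrow",
    "go to school",
    "go to school today",
]
vectors = vectorize(texts)
# Deterministic seeding for the sketch: start from the first and third records.
labels = kmeans(vectors, [vectors[0], vectors[2]])
```

Here the two "work" records end up in one cluster and the two "school" records in the other. Real tweets are far noisier, which is why the pre-processing matters so much.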

Notice also the top two categories, named Miscellaneous and None. These categories contain thoughts whose frequency is too low to form a cluster. Together they make up 69.56% of our records, and at this point we should think again about the Pareto principle.
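
The share of records that fail to form a cluster can be tracked with a few lines of Python. This is only a sketch: the `min_size` threshold and the sample labels are hypothetical, and the modeling tool may decide what counts as Miscellaneous or None differently.

```python
from collections import Counter

def unclustered_share(labels, min_size):
    """Fraction of records whose cluster has fewer than min_size members."""
    sizes = Counter(labels)
    small = sum(n for n in sizes.values() if n < min_size)
    return small / len(labels)

# Hypothetical labels: two sizeable clusters and three one-off thoughts.
labels = ["work"] * 4 + ["school"] * 3 + ["a", "b", "c"]
share = unclustered_share(labels, min_size=3)
```

With these sample labels, 3 of the 10 records fall into clusters that are too small, so `share` is 0.3. The figure from the analysis above that corresponds to it is 69.56%.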

Please note that not all of the necessary work is discussed here; I had to omit several steps that have to take place. In trying to understand what people actually think, I am using an approach that combines Ontologies, Information Extraction, Clustering and Classification analysis, with the ultimate goal of minimizing the percentage of thoughts that cannot form a cluster (69.56% in this example) and increasing the accuracy of the analysis.

It is also worth noting that we could move further down the sentence branch (see this post) for even better insight. Here I presented a clustering analysis of what users do not want. As an example, we could apply the same clustering to user thoughts that begin with “I don’t want to feel”.
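
Moving down the sentence branch can be sketched as filtering on a longer prefix and looking at what follows it. The snippet below only counts the next word instead of clustering, and both the function name and the sample thoughts are hypothetical.

```python
from collections import Counter

def next_words(thoughts, prefix):
    """Count the word that immediately follows `prefix` in each thought."""
    prefix_words = prefix.lower().split()
    n = len(prefix_words)
    counts = Counter()
    for thought in thoughts:
        words = thought.lower().split()
        # Only thoughts that start with the prefix and continue past it count.
        if words[:n] == prefix_words and len(words) > n:
            counts[words[n]] += 1
    return counts

thoughts = [
    "I don't want to feel sad",
    "I don't want to feel alone",
    "I don't want to feel sad anymore",
    "I don't want to go to work",
]
branch = next_words(thoughts, "I don't want to feel")
```

The thoughts that remain after this filter could then be fed to the same clustering step as before.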
