Clustering the thoughts of Twitter Users

ThemosKalafatis

In the last two posts I presented the reasons for analyzing the thoughts of users on the web, particularly on Twitter, and some of the problems involved. (For more, see Part1 and Part2.)

As an example, we are going to look at a specific kind of thought that Twitter users express: what they don’t want. So let us start. Using the Twitter API, I extracted all tweets containing the phrase “i don’t want to”. The following text file shows the results:

The next step is to remove all phrases that give us no information about what users do not want:


Finally, we remove the phrase “i don’t want to” itself. However, consider the following example:

“I must go to Chicago. I don’t want to do that”

The steps discussed above will discard the first sentence, which is actually what the user does not want to do, and leave only the phrase “i don’t want to do that”, which is not particularly informative. At this point we must quantify the problem (let’s assume it affects 8.5% of our records) and recall the Pareto principle: if a problem touches only a small share of the records, we can accept the loss and concentrate on the rest.
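The pre-processing steps above can be sketched roughly as follows. This is only an illustration and not the code used for the post: the function names, the sample tweets and the `UNINFORMATIVE` set are hypothetical, and the rule of cutting at the first sentence-ending punctuation mark is an assumption.

```python
import re

PHRASE = "i don't want to"

# Hypothetical set of remainders that carry no information; the post does not list its own.
UNINFORMATIVE = {"", "do that", "do it", "do this"}


def normalize(tweet: str) -> str:
    """Lower-case and replace curly apostrophes so the phrase matches reliably."""
    return tweet.lower().replace("\u2019", "'")


def extract_object(tweet: str):
    """Return the text that follows the phrase, up to the end of the sentence, or None."""
    text = normalize(tweet)
    start = text.find(PHRASE)
    if start == -1:
        return None  # the tweet does not contain the phrase at all
    rest = text[start + len(PHRASE):]
    # Assumption: the thought ends at the first sentence-ending punctuation mark.
    rest = re.split(r"[.!?]", rest, maxsplit=1)[0]
    return rest.strip(" ,;:\"'")


def uninformative_share(tweets):
    """Fraction of matching tweets whose remainder says nothing about what is unwanted."""
    objects = [o for o in map(extract_object, tweets) if o is not None]
    if not objects:
        return 0.0
    vague = sum(1 for o in objects if o in UNINFORMATIVE)
    return vague / len(objects)


# Made-up sample tweets, including the Chicago example from the text.
tweets = [
    "I must go to Chicago. I don\u2019t want to do that",
    "I don't want to go to work today!",
    "i don't want to go to school",
    "Lovely weather this morning",
]
print([extract_object(t) for t in tweets])
print(f"{uninformative_share(tweets):.1%}")
```

Running `uninformative_share` on the real data set is one way to arrive at a figure like the 8.5% assumed above, and so to decide whether the lost first sentences are worth recovering.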

After some additional pre-processing steps, which are not discussed here, I feed the data to K-Means to see which clusters the algorithm comes up with. For a better presentation of the results, here is a screen capture from IBM’s UI Modeler:


We immediately see, in descending order, what Twitter users do not want:

1) They don’t want to go to work
2) They don’t want to go to school
3) They don’t want to hear about various issues
4) They don’t want to stay home

Notice also the top two categories, named Miscellaneous and None. These categories contain thoughts whose frequency is too low to form a cluster. Together they make up 69.56% of our records, and at this point we should think again about the Pareto principle.
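To make the clustering step and the "too small to form a cluster" idea concrete, here is a minimal pure-Python sketch of K-Means on bag-of-words vectors. It is not what the modeling tool does internally: the features, the deterministic initialization, the sample phrases and the `min_size` threshold are all assumptions chosen for illustration.

```python
from collections import Counter


def vectorize(texts):
    """Turn each text into a bag-of-words count vector over a shared vocabulary."""
    vocab = sorted({w for t in texts for w in t.split()})
    return [[t.split().count(w) for w in vocab] for t in texts]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain Lloyd's algorithm; centroids start at the first k distinct vectors."""
    centroids = []
    for v in vectors:
        if v not in centroids:
            centroids.append(list(v))
        if len(centroids) == k:
            break
    labels = None
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def unclustered_share(labels, min_size):
    """Share of records in clusters smaller than min_size (a hypothetical threshold)."""
    sizes = Counter(labels)
    small = sum(n for n in sizes.values() if n < min_size)
    return small / len(labels)


# Made-up phrases, as they might look after removing "i don't want to".
texts = [
    "go to work", "go to school", "hear about politics", "go to work today",
    "go to school again", "go to work", "hear about it",
]
labels = kmeans(vectorize(texts), k=3)
print(Counter(labels).most_common())  # cluster sizes in descending order
print(f"{unclustered_share(labels, min_size=3):.2%}")
```

Lowering the share returned by `unclustered_share` plays the same role here as shrinking the Miscellaneous and None categories in the screen capture.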

Please note that not all of the necessary work is discussed here; I had to omit several steps. In trying to understand what people actually think, I am using an approach that combines ontologies, information extraction, clustering and classification analysis, with the ultimate goal of minimizing the percentage of thoughts that cannot form a cluster (69.56% in this example) and increasing the accuracy of the analysis.

It is also worth noting that we could move further down the sentence branch (see this post) for even better insight. Here I presented a clustering analysis of what users do not want. As a next step, we could apply clustering to user thoughts that begin with “I don’t want to feel”.
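Moving down the sentence branch can be pictured as extending the phrase by one word at a time. The sketch below is a hypothetical illustration with made-up tweets: it counts the first word after “i don’t want to” and then collects what follows the longer phrase “i don’t want to feel”, which could in turn be clustered.

```python
from collections import Counter

PHRASE = "i don't want to"


def branch(tweets, phrase=PHRASE):
    """Return the text that follows `phrase` in each tweet that contains it."""
    out = []
    for t in tweets:
        text = t.lower().replace("\u2019", "'")  # normalize curly apostrophes
        pos = text.find(phrase)
        if pos != -1:
            out.append(text[pos + len(phrase):].strip())
    return out


def next_word_counts(tweets, phrase=PHRASE):
    """Count the first word after `phrase`, i.e. the next level of the branch."""
    return Counter(rest.split()[0] for rest in branch(tweets, phrase) if rest)


# Made-up sample tweets.
tweets = [
    "I don't want to feel like this anymore",
    "I don\u2019t want to feel tired",
    "I don't want to go to work",
    "i don't want to feel alone",
]
print(next_word_counts(tweets).most_common())
print(branch(tweets, PHRASE + " feel"))
```

Plain substring matching is a simplification; a real pipeline would match on word boundaries so that longer words starting with the same letters are not picked up.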
