By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
SmartData Collective
  • Analytics
    AnalyticsShow More
    predictive analytics in dropshipping
    Predictive Analytics Helps New Dropshipping Businesses Thrive
    12 Min Read
    data-driven approach in healthcare
    The Importance of Data-Driven Approaches to Improving Healthcare in Rural Areas
    6 Min Read
    analytics for tax compliance
    Analytics Changes the Calculus of Business Tax Compliance
    8 Min Read
    big data analytics in gaming
    The Role of Big Data Analytics in Gaming
    10 Min Read
    analyst,women,looking,at,kpi,data,on,computer,screen
    Promising Benefits of Predictive Analytics in Asset Management
    11 Min Read
  • Big Data
  • BI
  • Exclusive
  • IT
  • Marketing
  • Software
Search
© 2008-23 SmartData Collective. All Rights Reserved.
Reading: Clustering the thoughts of Twitter Users
Share
Notification Show More
Latest News
ai in marketing with 3D rendering
Marketers Use AI to Take Advantage of 3D Rendering
Artificial Intelligence
How Big Data Is Transforming the Maritime Industry
How Big Data Is Transforming the Maritime Industry
Big Data
ai digital marketing tools
Top Five AI-Driven Digital Marketing Tools in 2023
Artificial Intelligence
ai-generated content
Is AI-Generated Content a Net Positive for Businesses?
Artificial Intelligence
predictive analytics in dropshipping
Predictive Analytics Helps New Dropshipping Businesses Thrive
Predictive Analytics
Aa
SmartData Collective
Aa
Search
  • About
  • Help
  • Privacy
Follow US
© 2008-23 SmartData Collective. All Rights Reserved.
SmartData Collective > Uncategorized > Clustering the thoughts of Twitter Users
Uncategorized

Clustering the thoughts of Twitter Users

ThemosKalafatis
Last updated: 2009/01/25 at 10:39 PM
ThemosKalafatis
5 Min Read
SHARE

During the last two posts i presented the reasons and some problems on analyzing the thoughts of users on the web and particularly Twitter. (For more see Part1 and Part2 ).

As an example, we are going to be looking at a specific kind of thought that Twitter users make : What they don’t want. So let us start : By using the Twitter API i managed to extract all tweets having the phrase “i don’t want to”. The following text file shows the results :

The next step is to remove all phrases that do not give us any information about what users do not want :

More Read

big data improves

3 Ways Big Data Improves Leadership Within Companies

IT Is Not Analytics. Here’s Why.
Romney Invokes Analytics in Rebuke of Trump
WEF Davos 2016: Top 100 CEO bloggers
In Memoriam: Robin Fray Carey


Finally we remove the phrase “i don’t want to”. However, consider the following example:

“I must go to Chicago. I don’t want to do that”

The steps discussed above will discard the first sentence which is actually what the user does not want to do and leave only the phrase “i don’t want to do that” which is not particularly informative. At this point we must quantify the problem -let’s assume it involves the 8.5% of our records- and recall what the pareto principle is all about.

After some additional pre-processing steps which are not discussed here, i feed the data to K-Means to see the clusters the algorithm comes up with. For a better pres…

During the last two posts i presented the reasons and some problems on analyzing the thoughts of users on the web and particularly Twitter. (For more see Part1 and Part2 ).

As an example, we are going to be looking at a specific kind of thought that Twitter users make : What they don’t want. So let us start : By using the Twitter API i managed to extract all tweets having the phrase “i don’t want to”. The following text file shows the results :

The next step is to remove all phrases that do not give us any information about what users do not want :


Finally we remove the phrase “i don’t want to”. However, consider the following example:

“I must go to Chicago. I don’t want to do that”

The steps discussed above will discard the first sentence which is actually what the user does not want to do and leave only the phrase “i don’t want to do that” which is not particularly informative. At this point we must quantify the problem -let’s assume it involves the 8.5% of our records- and recall what the pareto principle is all about.

After some additional pre-processing steps which are not discussed here, i feed the data to K-Means to see the clusters the algorithm comes up with. For a better presentation of the results, here is a screen capture from IBM’s UI Modeler :


We immediately see -in descending order- what Tweeter users do not want :

1) They don’t want to go to work
2) They don’t want to go to school
3) They don’t want to hear about various issues
4) They don’t want to stay home

Notice also the top two categories named Miscellaneous and None. These categories contain thoughts that have a very small frequency to form a cluster. These two categories consist the 69.56% of our records and at this point we should think again about the pareto principle.

Please note that not all necessary work is discussed here and i had to omit several actions that have to take place. In trying to understand what people actually think i am using an approach which uses Ontologies, Information Extraction, Clustering and Classification analysis with the ultimate goal to minimize the percentage of thoughts (69.56% in this example) that cannot form a cluster and to increase the accuracy of the analysis.

It is also an interesting fact that we could move further down the sentence branch (see this post) for even better insight. Here i presented a clustering analysis about what users do not want. As an example we could apply clustering on user thoughts for “I don’t want to feel”.

Link to original post

ThemosKalafatis January 25, 2009
Share this Article
Facebook Twitter Pinterest LinkedIn
Share

Follow us on Facebook

Latest News

ai in marketing with 3D rendering
Marketers Use AI to Take Advantage of 3D Rendering
Artificial Intelligence
How Big Data Is Transforming the Maritime Industry
How Big Data Is Transforming the Maritime Industry
Big Data
ai digital marketing tools
Top Five AI-Driven Digital Marketing Tools in 2023
Artificial Intelligence
ai-generated content
Is AI-Generated Content a Net Positive for Businesses?
Artificial Intelligence

Stay Connected

1.2k Followers Like
33.7k Followers Follow
222 Followers Pin

Sign Up for Our Newsletter

Subscribe to our newsletter to get our newest articles instantly!

[mc4wp_form id=”1616″]

You Might also Like

big data improves
Big DataJobsKnowledge ManagementUncategorized

3 Ways Big Data Improves Leadership Within Companies

6 Min Read
Image
Uncategorized

IT Is Not Analytics. Here’s Why.

7 Min Read

Romney Invokes Analytics in Rebuke of Trump

4 Min Read

WEF Davos 2016: Top 100 CEO bloggers

14 Min Read

SmartData Collective is one of the largest & trusted community covering technical content about Big Data, BI, Cloud, Analytics, Artificial Intelligence, IoT & more.

AI and chatbots
Chatbots and SEO: How Can Chatbots Improve Your SEO Ranking?
Artificial Intelligence Chatbots Exclusive
ai in ecommerce
Artificial Intelligence for eCommerce: A Closer Look
Artificial Intelligence

Quick Link

  • About
  • Contact
  • Privacy
Follow US

© 2008-23 SmartData Collective. All Rights Reserved.

Removed from reading list

Undo
Go to mobile version
Welcome Back!

Sign in to your account

Lost your password?