SmartData Collective
Spam Detection in Social Data: A New Business?

ThemosKalafatis
Last updated: 2010/12/08 at 8:58 PM

All of us who use Twitter know the problem of spam tweets. Spamming on Twitter can happen in several ways. For example, spammers can attach a trending topic to tweets that have nothing to do with that topic, just to make them visible. Other tweets do not contain misleading hashtags but still carry uninteresting information.
In a previous example, tweets were used to analyze the sentiment of Twitter users about the U.S. economy. The study used several thousand tweets to extract insights. However, among the tweets that genuinely discussed the economy there were several spam tweets such as “make money online even if the economy is bad”.
It is well known that the most time-consuming part of a data or text mining project is pre-processing. Therefore, when one wants to analyze tweets and extract knowledge from them, one obvious step is to remove spam and uninteresting tweets to minimize the chances of GIGO (garbage in, garbage out).
Spam detection in tweets, and in unstructured social media data in general, is a difficult task. It requires “concept-aware” analysis of text. One of the interesting facets of analytics is the ability to solve the same problem in several ways or, perhaps even better, to combine all available tools to reach a better solution.
There is an ever-growing number of companies that analyze social media data, and erroneous data may be seriously altering their insights, even if millions of records are available. Perhaps in the very near future, providing cleaned social media data to analytics companies or other information consumers could be a business in its own right.
It is possible to perform spam detection in many ways. Machine learning is one: training a classifier on, say, hundreds of thousands of tweets that are marked as “spam” or “no-spam”. We could use a more elaborate methodology and manually build and define rules that characterize spam tweets. We could even consider other information, such as who tweeted, how many followers this user has, or how often ‘@’ is used to address other users. Once again, the problem representation and the choice of algorithms should be made carefully.
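To make the machine learning option concrete, here is a minimal sketch of a word-count Naive Bayes classifier in plain Python. Naive Bayes is only one possible choice and is an assumption of this sketch, not a method the post prescribes. The six training tweets are invented for illustration (the first reuses the spam example quoted earlier), and the function names `tokenize`, `train`, and `classify` are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase a tweet and split it into word tokens, keeping a leading # or @."""
    return re.findall(r"[#@]?\w+", text.lower())

def train(labeled_tweets):
    """Count tokens per class from (text, label) pairs."""
    word_counts = {"spam": Counter(), "no-spam": Counter()}
    class_counts = Counter()
    for text, label in labeled_tweets:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["no-spam"])
    return word_counts, class_counts, vocab

def classify(text, model):
    """Return the label with the higher log-probability (Laplace smoothing)."""
    word_counts, class_counts, vocab = model
    total_tweets = sum(class_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        # Start from the class prior, then add one log-likelihood per known token.
        score = math.log(class_counts[label] / total_tweets)
        for token in tokenize(text):
            if token not in vocab:
                continue  # words never seen in training carry no evidence
            score += math.log((counts[token] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Toy training data, invented for illustration only.
TRAINING = [
    ("make money online even if the economy is bad", "spam"),
    ("earn money online fast, click here", "spam"),
    ("click here to make easy money from home", "spam"),
    ("the economy added fewer jobs than expected this month", "no-spam"),
    ("worried about the economy and unemployment numbers", "no-spam"),
    ("new report says the economy is slowly recovering", "no-spam"),
]

model = train(TRAINING)
print(classify("make easy money online", model))
print(classify("unemployment report on the economy", model))
```

A real project would need far more labeled tweets and would likely add the account-level signals mentioned above (follower counts, how often ‘@’ is used) as extra features alongside the words.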
Spam detection in social media data is one of the problems that will become more important as more analytics companies are created. Detecting interesting information is another area to watch. People want real insights.
In the previous post, tweets were used to identify what people want, feel, or don’t like when they visit a shopping mall. While analyzing this information, it was found that the word “Omaha” was associated with the word “Mall”. On closer inspection I realized that “Omaha Mall” is a song by Justin Bieber. Of course I am not suggesting that these tweets about Justin’s song were spam, but they had nothing to do with the purpose of the analysis. Could an automated technique identify this inconsistency and suggest filtering out this information? Being able to automatically select the right information will probably become more important as the volume of text information increases and fast, correct, and actionable intelligence becomes a necessity.
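One possible starting point, sketched below, is to count which words appear immediately before the keyword and surface the frequent ones for an analyst to review. This does not decide on its own that a phrase is off-topic. It only makes collocations such as “Omaha Mall” visible early. The sample tweets, the stopword list, the threshold of two occurrences, and the function name `words_before` are all assumptions made for illustration.

```python
import re
from collections import Counter

def words_before(keyword, tweets):
    """Count the word that immediately precedes `keyword` across all tweets."""
    counts = Counter()
    for text in tweets:
        tokens = re.findall(r"\w+", text.lower())
        for prev, cur in zip(tokens, tokens[1:]):
            if cur == keyword:
                counts[prev] += 1
    return counts

# Toy tweets, invented for illustration only.
TWEETS = [
    "Listening to Omaha Mall on repeat",
    "Omaha Mall is stuck in my head",
    "Have you heard Omaha Mall yet?",
    "The mall was so crowded today",
    "Parking at the mall is terrible",
    "Heading to the shopping mall later",
]

# Ignore very common function words; the list is an assumption of this sketch.
STOPWORDS = {"the", "a", "to", "at"}

# Keep preceding words that occur at least twice and are not stopwords.
candidates = [
    (word, n)
    for word, n in words_before("mall", TWEETS).most_common()
    if word not in STOPWORDS and n >= 2
]
print(candidates)
```

Whether a surfaced phrase is relevant still requires judgment or an external resource (for example, a list of song or product names), so this only narrows down what a person has to inspect.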
