SmartData Collective
© 2008-23 SmartData Collective. All Rights Reserved.

Dr Gates was right, or how I learned to stop worrying and love the spam

DavidMSmith
Last updated: 2009/04/16 at 3:41 PM

In 2004 Microsoft founder (and honorary doctorate recipient) Bill Gates confidently stated that "Spam will soon be a thing of the past." Five years later (Gates suggested the problem would be solved in two), spam accounts for 95% of all emails sent. Nonetheless, I think Gates was mostly right in principle, even if the timeline was optimistic.

A decade ago, when email spam was a real problem, I took care not to let my email address be displayed in public. Spammers had a habit of scraping email addresses from websites, with automated robots crawling the web looking for any text containing the @-symbol. Despite my efforts, I had to abandon a couple of email addresses after they were added to the mailing lists traded between spammers and the noise overwhelmed the signal in my inbox.

That was before the advent of good spam filters, though, which have greatly improved in the last couple of years. I now use Google Mail for all my mail, which has excellent spam-filtering technology. Even my non-Google addresses are forwarded to a Gmail account, which I can rely on to filter the crap so that I can see the emails I actually care about.

I started my current job about nine months ago, and I made a conscious decision to stop worrying about spam and let my email address (david@revolution-computing.com) be free. It's linked directly on every page of this blog and on the REvolution Computing website, and I don't hesitate to include it in other public venues. It's been out there long enough to be picked up by robots and web searches, so it's probably time to evaluate the results. I'd say it's a success, and I'm very glad I took the plunge. I get maybe two spam emails a week in my Google Mail account (faithfully tucked away in my Spam folder), and better yet, I don't think I've lost any legitimate mail to the spam filter. (So if you've emailed me and I haven't replied, I have only myself to blame. My apologies: I do get a lot of legitimate email.) I don't use any other email services, so I can't speak to the performance of their spam filters, but I'm happy with my results.

So what changed between 2004 and now? My guess is that it's mainly been the transition to web-based email services. Statisticians have attempted to solve the spam problem before with predictive models, but the results were never that great at the time. The difficulty was likely twofold. First, the problem is highly asymmetrical: a false positive (legitimate mail marked as spam) is far more costly than a false negative (spam that reaches the inbox), yet too many false negatives make the filter useless in practice. Second, I think the corpus was simply too small: a few hundred thousand emails, or even all the emails for all the employees of a largish company with a central email server, simply isn't going to result in a filter that gives a clean inbox while not trashing any legitimate mail sent to a broad community of users.
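
To make the asymmetry concrete, here is a minimal sketch of how unequal error costs move a filter's decision threshold. Everything in it is an illustrative assumption rather than how any real filter is tuned: the function name `choose_threshold`, the toy scores and labels, and the 50-to-1 cost ratio are all made up for the example.

```python
def choose_threshold(scores, labels, fp_cost=50.0, fn_cost=1.0):
    """Pick the spam-score cutoff with the lowest total cost.

    A message is flagged as spam when its score >= threshold.
    fp_cost: assumed cost of flagging a legitimate email (false positive).
    fn_cost: assumed cost of letting a spam email through (false negative).
    """
    # Try every observed score as a cutoff, plus "never flag anything".
    candidates = sorted(set(scores)) + [float("inf")]
    best_threshold, best_cost = None, float("inf")
    for t in candidates:
        fp = sum(1 for s, is_spam in zip(scores, labels) if s >= t and not is_spam)
        fn = sum(1 for s, is_spam in zip(scores, labels) if s < t and is_spam)
        cost = fp_cost * fp + fn_cost * fn
        if cost < best_cost:  # strict comparison keeps the lowest cutoff on ties
            best_threshold, best_cost = t, cost
    return best_threshold, best_cost


# Hypothetical model scores (higher = more spam-like) and true labels.
scores = [0.05, 0.20, 0.60, 0.70, 0.80, 0.95, 0.99]
labels = [False, False, True, True, False, True, True]  # True = spam

symmetric = choose_threshold(scores, labels, fp_cost=1.0, fn_cost=1.0)
asymmetric = choose_threshold(scores, labels, fp_cost=50.0, fn_cost=1.0)
print(symmetric)   # cutoff when both errors cost the same
print(asymmetric)  # cutoff when losing real mail is far more expensive
```

With equal costs the cutoff sits low and one legitimate message is flagged; with the skewed costs the cutoff rises until no legitimate mail is lost, and two spam messages reach the inbox instead. That trade is the tension described above: protecting real mail pushes more spam through unless the underlying scores are very good.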

Web-based email certainly solves the corpus-size problem, but there's one additional detail that I expect makes it work. The defining feature of spam is that a spam email is sent to lots and lots of people, and a web-based email service like Google Mail can easily see when a duplicate email is sent to lots and lots of users at the same time. Spammers have attempted various tricks to make that detection more difficult (converting text to images, or adding random text to each mail so that no two copies are identical), but Google seems to have largely overcome these hurdles.
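
As a rough illustration of the duplicate-detection idea, the sketch below groups near-identical messages and flags a group once it has reached enough distinct recipients. It is not Google's method, which I have no knowledge of; word shingles with Jaccard similarity are simply one standard way to tolerate a few random words added to each copy. The function names, the thresholds and the sample messages are all assumptions made for the example.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word sequences in the text, ignoring case and punctuation."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Share of shingles two messages have in common (1.0 = identical sets)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def flag_bulk_messages(messages, similarity=0.6, min_recipients=3):
    """messages: list of (recipient, body) pairs.

    Returns the indices of messages that belong to a near-duplicate group
    seen by at least min_recipients distinct recipients.
    """
    clusters = []  # each cluster: representative shingles, member indices, recipients
    for idx, (recipient, body) in enumerate(messages):
        sig = shingles(body)
        for cluster in clusters:
            if jaccard(sig, cluster["rep"]) >= similarity:
                cluster["members"].append(idx)
                cluster["recipients"].add(recipient)
                break
        else:
            clusters.append({"rep": sig, "members": [idx], "recipients": {recipient}})
    flagged = set()
    for cluster in clusters:
        if len(cluster["recipients"]) >= min_recipients:
            flagged.update(cluster["members"])
    return flagged


# Made-up messages: one template with a random token appended, plus one real email.
template = "Buy cheap watches now, limited offer just for you, click here today"
messages = [
    ("alice", template + " qzx1"),
    ("bob", template + " blorp7"),
    ("carol", template + " mmm9"),
    ("dave", "Lunch on Thursday? Let me know what time works"),
]
print(flag_bulk_messages(messages))
```

A per-user filter sees only one copy of the template and has little to go on; a service that sees every mailbox can notice the same text arriving for Alice, Bob and Carol at once. The image-based trick mentioned above would defeat this text-only sketch entirely.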

So then, is the spam problem solved? At a technical level, clearly not — spam still consumes a tremendous amount of bandwidth and costs billions of dollars to contain — but at the personal level it's hardly more than a minor irritant these days. (And if it's not for you, consider a new email service.) For individuals, the real spam problem these days lies in other venues: social networking spam, blog spam, link farms, and so on. Mr Gates, when can we expect solutions to those problems? 

DavidMSmith April 16, 2009