SmartData Collective
Support Vector Clustering: An Approach to Overcome the Limits of K-means

Cristian Mesiano

Some time ago, I posted a simple case to show the limits of K-means clustering. A follower then gave us a grid of different clustering techniques (calling Mathematica's built-in routines) to solve the case discussed.

As you know, I like to write the algorithms myself and to show alternative paths, so I’ve decided to explain a powerful clustering algorithm based on the SVM.

To understand the theory behind SVC (support vector clustering), I strongly recommend that you have a look at: http://jmlr.csail.mit.edu/papers/volume2/horn01a/rev1/horn01ar1.pdf . In this paper you will find all of the technical details explained with extreme clarity.

As usual, I leave the theory to the books and jump into the pragmatism of the real world.

Consider a set of points with an ellipsoid distribution. We have seen in the past that K-means doesn’t work in this scenario: even after trying different tweaks, such as changing the positions and the number of the centroids, the final result is always unacceptable.

SVC is a clustering algorithm that takes just two input parameters, C and q, both of them real numbers. C controls how outliers are handled and q controls the granularity of the clusters. Be aware that q does not directly set the number of clusters! By tuning q you can manage the “cluster granularity”, but you cannot decide a priori how many clusters the algorithm will return.
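
For reference, this is where the two parameters enter the optimization problem, written in the notation of the paper linked above (the symbols β_j and x_j are the paper's, not this post's). q is the width parameter of the Gaussian kernel, and C caps each multiplier β_j in the dual problem:

```latex
K(x_i, x_j) = e^{-q \lVert x_i - x_j \rVert^2}

\max_{\beta} \; W = \sum_{j} \beta_j K(x_j, x_j) - \sum_{i,j} \beta_i \beta_j K(x_i, x_j)
\quad \text{subject to} \quad 0 \le \beta_j \le C, \qquad \sum_{j} \beta_j = 1
```

Points whose multiplier reaches the cap C are the ones treated as outliers; in the paper, C = 1 corresponds to allowing none. Increasing q tightens the contours around the data, which typically splits them into more clusters.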


How to implement SVC
There are many implementations of SVC, but I would like to show different tools (I love broadening horizons…), so the ingredients of today's recipe are AMPL and SNOPT.

Both are commercial tools, but to play with a small set of points (no more than 300) you can use the student license for free!

AMPL is a comprehensive and powerful algebraic modeling language for linear and nonlinear optimization problems, in discrete or continuous variables, and SNOPT is a software package for solving large-scale optimization problems (linear and nonlinear programs).

AMPL allows the user to write the convex problem associated with SVC in an easy way:

Figure: The AMPL file for SVC

SNOPT is one of the many solvers able to work with AMPL.
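
For readers without AMPL or SNOPT at hand, the same kind of problem can be sketched in plain Python. This is not the AMPL model used in this post. It assumes the dual from the paper (minimize βᵀKβ subject to Σβ_j = 1 and 0 ≤ β_j ≤ C, which is equivalent to the paper's dual because K(x, x) = 1 for the Gaussian kernel) and swaps SNOPT for a simple Frank-Wolfe iteration. The names `solve_svc_dual` and `radius_sq` are illustrative, not taken from any library.

```python
import math


def gaussian_kernel(x, y, q):
    """K(x, y) = exp(-q * ||x - y||^2)."""
    return math.exp(-q * sum((a - b) ** 2 for a, b in zip(x, y)))


def solve_svc_dual(points, q, C, iterations=2000):
    """Approximately minimize beta^T K beta with sum(beta) = 1, 0 <= beta_j <= C."""
    n = len(points)
    if C * n < 1.0:
        raise ValueError("infeasible: C must be at least 1/n")
    K = [[gaussian_kernel(p, r, q) for r in points] for p in points]
    beta = [1.0 / n] * n  # uniform weights are always feasible
    for _ in range(iterations):
        grad = [2.0 * sum(K[i][j] * beta[j] for j in range(n)) for i in range(n)]
        # Frank-Wolfe vertex: give the points with the smallest gradient
        # as much weight as the cap C allows, until the weights sum to 1.
        vertex = [0.0] * n
        remaining = 1.0
        for i in sorted(range(n), key=lambda idx: grad[idx]):
            vertex[i] = min(C, remaining)
            remaining -= vertex[i]
            if remaining <= 0.0:
                break
        d = [vertex[i] - beta[i] for i in range(n)]
        slope = sum(grad[i] * d[i] for i in range(n))
        curvature = 2.0 * sum(d[i] * K[i][j] * d[j] for i in range(n) for j in range(n))
        if slope >= -1e-12 or curvature <= 0.0:
            break  # no descent direction left (or degenerate step)
        step = min(1.0, -slope / curvature)  # exact line search, clipped to [0, 1]
        beta = [beta[i] + step * d[i] for i in range(n)]
    return beta, K


def radius_sq(x, points, beta, K, q):
    """Squared feature-space distance of x from the sphere centre."""
    n = len(points)
    centre_norm = sum(beta[i] * beta[j] * K[i][j] for i in range(n) for j in range(n))
    cross = sum(beta[j] * gaussian_kernel(points[j], x, q) for j in range(n))
    return 1.0 - 2.0 * cross + centre_norm


# Four corners of a unit square: by symmetry every point gets the same weight.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
beta, K = solve_svc_dual(points, q=1.0, C=1.0)
radii = [radius_sq(p, points, beta, K, q=1.0) for p in points]
```

Pure-Python loops like these are only practical for small data sets. For anything close to the 300 points used here, a real QP or NLP solver remains the better choice.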

In the AMPL file above, the statement “param x: 1  2   3   :=” is followed by the list of 3D points belonging to our data set.
One of the characteristics of SVC is its vector notation: it lets you work in higher dimensions without any change to the algorithm.
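
A small illustration of that point, assuming the Gaussian kernel from the paper: the kernel only ever sees the squared distance between two points, so the same function serves 2D and 3D data alike. The helper name `gaussian_kernel` is illustrative.

```python
import math


def gaussian_kernel(x, y, q):
    """Gaussian kernel on coordinate tuples of any (matching) length."""
    if len(x) != len(y):
        raise ValueError("points must have the same dimension")
    return math.exp(-q * sum((a - b) ** 2 for a, b in zip(x, y)))


k_2d = gaussian_kernel((0.0, 0.0), (1.0, 0.0), q=1.0)
k_3d = gaussian_kernel((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), q=1.0)
print(k_2d == k_3d)  # True: the extra zero coordinate adds nothing to the distance
```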
2D problem
Let’s apply SVC to our ellipsoid data set.
Figure: 300 points having an ellipsoid distribution. The first contour of SVC is depicted in black.
The image above shows the clusters (depicted as the connected components of a graph; see the paper mentioned above for further details) returned by SVC and plotted by Mathematica.
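
The connected-components step can be sketched as follows. In the paper, two points belong to the same cluster when every point on the segment joining them stays inside the enclosing sphere in feature space. Here the hypothetical predicate `inside` stands in for that test, and a toy predicate made of two discs replaces a trained SVC model, so this only illustrates the labeling logic, not the full algorithm.

```python
def label_clusters(points, inside, samples=10):
    """Label points by connected component.

    Two points are adjacent when every sampled point on the segment joining
    them satisfies inside(y). `inside` stands in for the SVC test R(y) <= R.
    """
    n = len(points)

    def connected(p, r):
        for k in range(samples + 1):
            t = k / samples
            y = tuple(a + t * (b - a) for a, b in zip(p, r))
            if not inside(y):
                return False
        return True

    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        # Depth-first search over the implicit adjacency graph.
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and connected(points[i], points[j]):
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Toy predicate (an assumption for illustration): two unit discs
# centred at (0, 0) and (5, 0).
def inside_two_discs(y):
    return min(y[0] ** 2 + y[1] ** 2, (y[0] - 5) ** 2 + y[1] ** 2) <= 1.0


points = [(0.0, 0.0), (0.5, 0.0), (5.0, 0.0), (5.0, 0.5)]
labels = label_clusters(points, inside_two_discs)
print(labels)  # [0, 0, 1, 1]
```

Sampling a fixed number of points per segment is an approximation: too few samples can connect points across a narrow gap, so `samples` is worth raising when clusters lie close together.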

3D problem
Here is the same algorithm working in 3D on the same problem:

Figure: 3D points having an ellipsoid distribution.
And here are the SVC results plotted by Mathematica:
Figure: SVC applied to the former data set.
As you can see, in both scenarios SVC is able to solve the easy problem that K-means cannot manage.
PS
We will continue with text categorization in the next post… From time to time I allow myself a digression.

