Member-only story
DBSCAN: When K-Means Can’t Cut Clustering Alone
Is there a reasonable alternative for K-means clustering?
Clustering in machine learning is the process of grouping similar data points together based on their features, without using labeled data. It’s a type of unsupervised learning where the goal is to find natural groupings or patterns in the data.
If you have customer data, clustering could help you find groups of customers with similar buying habits. Then you can use your marketing skills to focus on the hotspots of customer attention. Customers that are alike will have similar interests. For example, fashion conscious individuals are probably spending more time on the Vogue website than TechCrunch. Apple uses clustering algorithms daily on your iPhone. The software finds frequently occurring photos with similar themes and groups them together in albums and shows.
Here is a simple analogy:
Imagine you have a pile of colored balls mixed together. Clustering is like sorting them into groups based on their color, even though no one told you what the colors are called. All you care about is that “like goes with like” and whether something is labeled as magenta, hot pink or mauve is beside the point.
