Clustering
Computer-Nerd
2023. 4. 1. 12:38
Information
- Clustering is an unsupervised learning technique that groups similar objects in a dataset without using labels.
- The goal of clustering is to minimize the intra-cluster distance while maximizing the inter-cluster distance.
- Common clustering algorithms include K-Means, hierarchical clustering, DBSCAN, and Gaussian Mixture Models.
- K-Means clustering divides the dataset into k distinct groups by assigning each object to the cluster whose centroid is nearest, where each centroid is the mean of the points assigned to it.
- Hierarchical clustering organizes objects into a tree-like structure (a dendrogram), in which each node represents a cluster of objects.
- DBSCAN is a density-based clustering algorithm that groups closely packed objects and treats objects in low-density regions as noise.
- Gaussian Mixture Models represent the data as a mixture of several Gaussian distributions and identify groups by estimating which component each point most likely belongs to.
- Clustering has a wide range of applications, including market segmentation, image segmentation, data compression, and anomaly detection.
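
As a rough illustration of the K-Means description above, here is a minimal pure-Python sketch of the usual assign-then-update loop (often called Lloyd's algorithm). The function name `kmeans`, its parameters, the random initialization from data points, and the toy data are all assumptions made for this example and are not part of the original note.

```python
import math
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Cluster `points` (a list of equal-length tuples) into k groups.

    Returns (centroids, labels). Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # Assumption: start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[j])
        # Stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# Toy data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
print(centroids)
print(labels)
```

This sketch favors readability over speed and uses Euclidean distance only. For real datasets, a library implementation such as scikit-learn's `KMeans` is usually the more practical choice.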