Cluster
Clustering algorithm implementations using PyTorch.
This module provides clustering algorithms implemented in PyTorch for anomaly detection tasks.
- Classes:
GaussianMixture: Gaussian Mixture Model for density estimation and clustering.
KMeans: K-Means clustering algorithm.
Example
>>> import torch
>>> from anomalib.models.components.cluster import GaussianMixture, KMeans
>>> # Create and fit a GMM
>>> gmm = GaussianMixture(n_components=3)
>>> features = torch.randn(100, 10) # Example features
>>> gmm.fit(features)
>>> # Create and fit KMeans
>>> kmeans = KMeans(n_clusters=5)
>>> kmeans.fit(features)
- class anomalib.models.components.cluster.GaussianMixture(n_components, n_iter=100, tol=0.001)
Bases:
DynamicBufferMixin
Gaussian Mixture Model for clustering data into Gaussian distributions.
- Parameters:
- means
Means of the Gaussian components. Shape: (n_components, n_features).
- Type: torch.Tensor
- covariances
Covariance matrices of the components. Shape: (n_components, n_features, n_features).
- Type: torch.Tensor
- weights
Mixing weights of the components. Shape: (n_components,).
- Type: torch.Tensor
Example
>>> import torch
>>> from anomalib.models.components.cluster import GaussianMixture
>>> # Create synthetic data with two clusters
>>> data = torch.tensor([
...     [2, 1], [2, 2], [2, 3],  # Cluster 1
...     [7, 5], [8, 5], [9, 5],  # Cluster 2
... ]).float()
>>> # Initialize and fit GMM
>>> model = GaussianMixture(n_components=2)
>>> model.fit(data)
>>> # Get cluster means
>>> model.means
tensor([[8., 5.],
        [2., 2.]])
>>> # Predict cluster assignments
>>> model.predict(data)
tensor([1, 1, 1, 0, 0, 0])
>>> # Get log-likelihood scores
>>> model.score_samples(data)
tensor([3.8295, 4.5795, 3.8295, 3.8295, 4.5795, 3.8295])
- fit(data)
Fit the GMM to the input data using the EM algorithm.
- Parameters:
data (torch.Tensor) – Input data to fit the model to. Shape: (n_samples, n_features).
- Return type: None
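The EM procedure named above alternates between computing per-sample responsibilities (E-step) and re-estimating means, covariances, and weights from them (M-step). The following is a minimal sketch of that loop in plain PyTorch, not anomalib's implementation: the helper `em_step`, the initialisation from two data points, the ridge term `eps`, and the use of the mean log-likelihood change as the stopping rule are all assumptions made for illustration.

```python
import torch


def em_step(data, means, covariances, weights, eps=1e-6):
    """One EM iteration for a full-covariance Gaussian mixture (hypothetical helper)."""
    n_samples, n_features = data.shape

    # E-step: log(weight_k * N(x_i | mean_k, cov_k)) for every sample/component pair.
    components = torch.distributions.MultivariateNormal(means, covariance_matrix=covariances)
    log_joint = components.log_prob(data.unsqueeze(1)) + weights.log()  # (n_samples, n_components)
    log_likelihood = torch.logsumexp(log_joint, dim=1)  # (n_samples,)
    resp = torch.softmax(log_joint, dim=1)  # responsibilities, each row sums to 1

    # M-step: re-estimate parameters from responsibility-weighted statistics.
    counts = resp.sum(dim=0)  # (n_components,)
    new_weights = counts / n_samples
    new_means = (resp.T @ data) / counts.unsqueeze(1)
    diff = data.unsqueeze(1) - new_means.unsqueeze(0)  # (n_samples, n_components, n_features)
    new_covs = torch.einsum("nk,nkd,nke->kde", resp, diff, diff) / counts.view(-1, 1, 1)
    # Symmetrise and add a small ridge so the matrices stay positive definite.
    new_covs = 0.5 * (new_covs + new_covs.transpose(-1, -2)) + eps * torch.eye(n_features)
    return new_means, new_covs, new_weights, log_likelihood.mean().item()


# Two well-separated synthetic blobs.
torch.manual_seed(0)
blob_a = torch.randn(200, 2) + torch.tensor([4.0, 4.0])
blob_b = torch.randn(200, 2) - torch.tensor([4.0, 4.0])
data = torch.cat([blob_a, blob_b])

# Assumed initialisation: one data point per blob, identity covariances, equal weights.
means = data[[0, 200]].clone()
covariances = torch.eye(2).repeat(2, 1, 1)
weights = torch.full((2,), 0.5)

previous = float("-inf")
for _ in range(100):  # plays the role of n_iter=100
    means, covariances, weights, current = em_step(data, means, covariances, weights)
    if abs(current - previous) < 1e-3:  # plays the role of tol=0.001 (assumed stopping rule)
        break
    previous = current
```

After the loop, `means` should sit close to the two blob centres and `weights` close to 0.5 each. Working in log space with `logsumexp` and `softmax` avoids underflow when a sample is far from every component.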
- predict(data)
Predict cluster assignments for the input data.
- Parameters:
data (torch.Tensor) – Input samples. Shape: (n_samples, n_features).
- Returns:
Predicted cluster labels. Shape: (n_samples,).
- Return type: torch.Tensor
- score_samples(data)
Compute per-sample log-likelihood scores.
- Parameters:
data (torch.Tensor) – Input samples to score. Shape: (n_samples, n_features).
- Returns:
Log-likelihood scores. Shape: (n_samples,).
- Return type: torch.Tensor
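Because the scores are log-likelihoods, lower values mean a sample is less typical of the fitted mixture. One common way to use them for anomaly detection is to threshold against a low quantile of the scores seen on normal data. The sketch below illustrates that idea on hand-written scores; the helper `flag_low_likelihood` and the 5% quantile are hypothetical choices, not part of this module.

```python
import torch


def flag_low_likelihood(train_scores, test_scores, quantile=0.05):
    """Flag test samples whose log-likelihood falls below a quantile of the training scores."""
    threshold = torch.quantile(train_scores, quantile)
    return test_scores < threshold, threshold


# Illustrative log-likelihood values, standing in for the output of score_samples().
train_scores = torch.tensor([-2.1, -1.9, -2.0, -2.3, -1.8])
test_scores = torch.tensor([-2.0, -9.5])

is_anomalous, threshold = flag_low_likelihood(train_scores, test_scores)
print(is_anomalous)  # tensor([False,  True])
```

The second test score lies far below anything seen in training, so only that sample is flagged.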
- class anomalib.models.components.cluster.KMeans(n_clusters, max_iter=10)
Bases:
object
K-means clustering algorithm implementation.
- Parameters:
- cluster_centers_
Coordinates of cluster centers after fitting. Shape: (n_clusters, n_features).
- Type: torch.Tensor
- labels_
Cluster labels for the training data after fitting. Shape: (n_samples,).
- Type: torch.Tensor
Example
>>> import torch
>>> from anomalib.models.components.cluster import KMeans
>>> kmeans = KMeans(n_clusters=3)
>>> data = torch.randn(100, 5)  # 100 samples, 5 features
>>> labels, centers = kmeans.fit(data)
>>> print(f"Cluster assignments shape: {labels.shape}")
>>> print(f"Cluster centers shape: {centers.shape}")
- fit(inputs)
Fit the K-means algorithm to the input data.
- Parameters:
inputs (torch.Tensor) – Input data to cluster. Shape: (n_samples, n_features).
- Returns:
Tuple containing:
labels: Cluster assignments for each input point. Shape: (n_samples,).
cluster_centers: Coordinates of the cluster centers. Shape: (n_clusters, n_features).
- Return type: tuple[torch.Tensor, torch.Tensor]
- Raises:
ValueError – If n_clusters is less than or equal to 0.
Example
>>> import torch
>>> from anomalib.models.components.cluster import KMeans
>>> kmeans = KMeans(n_clusters=2)
>>> data = torch.tensor([[1.0, 2.0], [4.0, 5.0], [1.2, 2.1]])
>>> labels, centers = kmeans.fit(data)
>>> counts = [(labels == i).sum().item() for i in range(2)]
>>> print(f"Number of points in each cluster: {counts}")
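For readers unfamiliar with what fitting involves, the classic K-means procedure (Lloyd's algorithm) alternates between assigning each point to its nearest center and moving each center to the mean of its assigned points. The sketch below shows that loop in plain PyTorch under stated assumptions: the function `lloyd_kmeans`, the random-point initialisation, the `seed` argument, and the handling of empty clusters are illustrative choices and may differ from what `KMeans.fit` does internally.

```python
import torch


def lloyd_kmeans(inputs, n_clusters, max_iter=10, seed=0):
    """Cluster rows of `inputs` with Lloyd's algorithm (hypothetical helper)."""
    if n_clusters <= 0:
        raise ValueError("n_clusters must be a positive integer.")

    # Assumed initialisation: pick distinct input points at random as starting centers.
    generator = torch.Generator().manual_seed(seed)
    indices = torch.randperm(inputs.shape[0], generator=generator)[:n_clusters]
    centers = inputs[indices].clone()

    for _ in range(max_iter):
        # Assignment step: index of the nearest center for every point.
        labels = torch.cdist(inputs, centers).argmin(dim=1)
        # Update step: move each center to the mean of its members.
        for k in range(n_clusters):
            members = inputs[labels == k]
            if len(members) > 0:  # an empty cluster keeps its previous center
                centers[k] = members.mean(dim=0)

    # Final assignment so labels are consistent with the returned centers.
    labels = torch.cdist(inputs, centers).argmin(dim=1)
    return labels, centers


data = torch.tensor([[1.0, 2.0], [4.0, 5.0], [1.2, 2.1]])
labels, centers = lloyd_kmeans(data, n_clusters=2)
counts = sorted((labels == i).sum().item() for i in range(2))
print(counts)  # [1, 2]
```

The two nearby points end up in one cluster and the distant point in the other; which cluster receives index 0 depends on the initialisation.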
- predict(inputs)
Predict cluster labels for input data.
- Parameters:
inputs (torch.Tensor) – Input data to assign to clusters. Shape: (n_samples, n_features).
- Returns:
Predicted cluster labels. Shape: (n_samples,).
- Return type: torch.Tensor
- Raises:
AttributeError – If called before fitting the model.
Example
>>> import torch
>>> from anomalib.models.components.cluster import KMeans
>>> kmeans = KMeans(n_clusters=2)
>>> # First fit the model
>>> train_data = torch.tensor([[1.0, 2.0], [4.0, 5.0]])
>>> kmeans.fit(train_data)
>>> # Then predict on new data
>>> new_data = torch.tensor([[1.1, 2.1], [3.9, 4.8]])
>>> predictions = kmeans.predict(new_data)
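Conceptually, K-means prediction is a nearest-center lookup: each new sample receives the index of the closest fitted center. The sketch below reproduces that step with fixed centers so the result is deterministic. The helper `assign_to_nearest` is hypothetical, and Euclidean distance is assumed as the metric.

```python
import torch


def assign_to_nearest(inputs, centers):
    """Return, for each row of `inputs`, the index of the closest row in `centers`."""
    return torch.cdist(inputs, centers).argmin(dim=1)


# Fixed centers standing in for cluster_centers_ after fitting.
centers = torch.tensor([[1.0, 2.0], [4.0, 5.0]])
new_data = torch.tensor([[1.1, 2.1], [3.9, 4.8]])

predictions = assign_to_nearest(new_data, centers)
print(predictions)  # tensor([0, 1])
```

With a fitted model the label values depend on the order in which the centers were learned, so the same two points could equally come back as `[1, 0]`.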