Cluster#
Clustering algorithm implementations using PyTorch.
- class anomalib.models.components.cluster.GaussianMixture(n_components, n_iter=100, tol=0.001)#
Bases:
DynamicBufferMixin
Gaussian Mixture Model.
- Parameters:
n_components (int) – Number of components.
n_iter (int) – Maximum number of iterations to perform. Defaults to
100
.tol (float) – Convergence threshold. Defaults to
1e-3
.
Example
The following examples shows how to fit a Gaussian Mixture Model to some data and get the cluster means and predicted labels and log-likelihood scores of the data.
>>> import torch >>> from anomalib.models.components.cluster import GaussianMixture >>> model = GaussianMixture(n_components=2) >>> data = torch.tensor( ... [ ... [2, 1], [2, 2], [2, 3], ... [7, 5], [8, 5], [9, 5], ... ] ... ).float() >>> model.fit(data) >>> model.means # get the means of the gaussians tensor([[8., 5.], [2., 2.]]) >>> model.predict(data) # get the predicted cluster label of each sample tensor([1, 1, 1, 0, 0, 0]) >>> model.score_samples(data) # get the log-likelihood score of each sample tensor([3.8295, 4.5795, 3.8295, 3.8295, 4.5795, 3.8295])
- fit(data)#
Fit the model to the data.
- Parameters:
data (Tensor) – Data to fit the model to. Tensor of shape (n_samples, n_features).
- Return type:
None
- predict(data)#
Predict the cluster labels of the data.
- Parameters:
data (Tensor) – Samples to assign to clusters. Tensor of shape (n_samples, n_features).
- Returns:
Tensor of shape (n_samples,) containing the predicted cluster label of each sample.
- Return type:
Tensor
- score_samples(data)#
Assign a likelihood score to each sample in the data.
- Parameters:
data (Tensor) – Samples to assign scores to. Tensor of shape (n_samples, n_features).
- Returns:
Tensor of shape (n_samples,) containing the log-likelihood score of each sample.
- Return type:
Tensor
- class anomalib.models.components.cluster.KMeans(n_clusters, max_iter=10)#
Bases:
object
Initialize the KMeans object.
- Parameters:
n_clusters (int) – The number of clusters to create.
max_iter (int, optional)) – The maximum number of iterations to run the algorithm. Defaults to 10.
- fit(inputs)#
Fit the K-means algorithm to the input data.
- Parameters:
inputs (torch.Tensor) – Input data of shape (batch_size, n_features).
- Returns:
A tuple containing the labels of the input data with respect to the identified clusters and the cluster centers themselves. The labels have a shape of (batch_size,) and the cluster centers have a shape of (n_clusters, n_features).
- Return type:
tuple
- Raises:
ValueError – If the number of clusters is less than or equal to 0.
- predict(inputs)#
Predict the labels of input data based on the fitted model.
- Parameters:
inputs (torch.Tensor) – Input data of shape (batch_size, n_features).
- Returns:
The predicted labels of the input data with respect to the identified clusters.
- Return type:
torch.Tensor
- Raises:
AttributeError – If the KMeans object has not been fitted to input data.