GLASS#
Architecture#
GLASS - Unsupervised anomaly detection via Gradient Ascent for Industrial Anomaly detection and localization.
This module implements the GLASS model for unsupervised anomaly detection and localization. GLASS synthesizes both global and local anomalies using Gaussian noise guided by gradient ascent to enhance weak defect detection in industrial settings.
- The model consists of:
A feature extractor and feature adaptor to obtain robust normal representations
A Global Anomaly Synthesis (GAS) module that perturbs features using Gaussian noise and gradient ascent with truncated projection
A Local Anomaly Synthesis (LAS) module that overlays augmented textures onto images using Perlin noise masks
A shared discriminator trained with features from normal, global, and local synthetic samples
Paper: A Unified Anomaly Synthesis Strategy with Gradient Ascent for Industrial Anomaly Detection and Localization <https://arxiv.org/pdf/2407.09359>
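The global synthesis step described above can be illustrated with a minimal, standard-library-only sketch. It is not the anomalib implementation: the toy loss gradient, the step size, and the radii `r_min`/`r_max` are hypothetical values chosen for illustration, and the helper names are invented here. It only shows the general pattern of Gaussian perturbation, a normalized gradient ascent step, and a truncated projection that keeps the synthetic feature within a bounded distance of the normal feature.

```python
import math
import random


def l2_norm(vec):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(v * v for v in vec))


def gaussian_perturb(feature, std, rng):
    """Add i.i.d. Gaussian noise to every feature dimension."""
    return [f + rng.gauss(0.0, std) for f in feature]


def ascent_step(feature, grad, step_size):
    """Move along the normalized gradient direction (gradient ascent)."""
    norm = l2_norm(grad)
    if norm == 0.0:
        return list(feature)
    return [f + step_size * g / norm for f, g in zip(feature, grad)]


def truncated_projection(feature, normal, r_min, r_max):
    """Rescale the offset from `normal` so its length lies in [r_min, r_max]."""
    offset = [f - n for f, n in zip(feature, normal)]
    length = l2_norm(offset)
    if length == 0.0:
        return list(feature)
    target = min(max(length, r_min), r_max)
    return [n + o * target / length for n, o in zip(normal, offset)]


def toy_grad(feature, center):
    """Gradient of a toy loss 0.5 * ||feature - center||^2 (hypothetical)."""
    return [f - c for f, c in zip(feature, center)]


rng = random.Random(0)
normal = [0.0] * 8   # a single "normal" feature vector
center = [0.0] * 8   # reference point of the toy loss
synthetic = gaussian_perturb(normal, std=0.015, rng=rng)
for _ in range(20):  # mirrors the default step=20
    synthetic = ascent_step(synthetic, toy_grad(synthetic, center), step_size=0.1)
    synthetic = truncated_projection(synthetic, normal, r_min=0.5, r_max=1.0)

distance = l2_norm([s - n for s, n in zip(synthetic, normal)])
```

In the real model the gradient would come from the discriminator loss and the radius from `radius_quantile`; the sketch replaces both with fixed stand-ins so the projection behaviour is easy to follow.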
- class anomalib.models.image.glass.lightning_model.Glass(input_shape=(288, 288), anomaly_source_path=None, backbone='wide_resnet50_2', pretrain_embed_dim=1536, target_embed_dim=1536, patchsize=3, patchstride=1, pre_trained=True, layers=None, pre_projection=1, discriminator_layers=2, discriminator_hidden=1024, learning_rate=0.0001, step=20, svd=0, gaussian_noise_std=0.015, radius_quantile=0.75, focal_loss_quantile_threshold=0.5, mining=True, pre_processor=True, post_processor=True, evaluator=True, visualizer=True)#
Bases: AnomalibModule
PyTorch Lightning implementation of the GLASS model.
The model uses a pre-trained feature extractor to extract features and a feature adaptor to mitigate latent domain bias.
Global anomaly features are synthesized from adapted normal features using gradient ascent. Local anomaly images are synthesized by overlaying textures from a dataset such as DTD, and are then processed by the feature extractor and feature adaptor.
All three kinds of features (normal, global anomaly, and local anomaly) are passed to the discriminator, which is trained with a combination of BCE and focal losses.
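The texture overlay used for local anomalies can be sketched as a masked blend. This is an illustrative assumption rather than the library's code: the function name `overlay_texture`, the opacity parameter `beta`, and the hand-written mask are hypothetical, and the Perlin noise generation that would normally produce the mask is omitted.

```python
def overlay_texture(image, texture, mask, beta):
    """Blend `texture` into `image` wherever `mask` is 1.

    `beta` is the weight kept from the original pixel inside the mask;
    pixels outside the mask are returned unchanged.
    """
    blended = []
    for img_row, tex_row, mask_row in zip(image, texture, mask):
        row = []
        for pixel, tex, m in zip(img_row, tex_row, mask_row):
            mixed = beta * pixel + (1.0 - beta) * tex
            row.append((1 - m) * pixel + m * mixed)
        blended.append(row)
    return blended


# Tiny single-channel example: a 2x2 image, a bright texture, one masked pixel.
image = [[0.2, 0.2], [0.2, 0.2]]
texture = [[1.0, 1.0], [1.0, 1.0]]
mask = [[0, 1], [0, 0]]
augmented = overlay_texture(image, texture, mask, beta=0.5)
```

Only the masked pixel changes, which is what lets the same mask double as a pixel-level label for the synthetic defect.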
- Parameters:
input_shape (tuple[int, int]) – Input image dimensions as a tuple of (height, width). Required for shaping the input pipeline. Defaults to (288, 288).
anomaly_source_path (str | None) – Path to the dataset or source directory containing normal images and anomaly textures. Defaults to None.
backbone (str) – Name of the CNN backbone used for feature extraction. Defaults to "wide_resnet50_2".
pretrain_embed_dim (int) – Dimensionality of features extracted by the pre-trained backbone before adaptation. Defaults to 1536.
target_embed_dim (int) – Dimensionality of the target adapted features after projection. Defaults to 1536.
patchsize (int) – Size of the local patch used in feature aggregation (e.g., for neighborhood pooling). Defaults to 3.
patchstride (int) – Stride used when extracting patches for local feature aggregation. Defaults to 1.
pre_trained (bool) – Whether to use ImageNet pre-trained weights for the backbone network. Defaults to True.
layers (list[str] | None) – List of backbone layers to extract features from. Defaults to ["layer2", "layer3"].
pre_projection (int) – Number of projection layers used in the feature adaptor (e.g., MLP before discriminator). Defaults to 1.
discriminator_layers (int) – Number of layers in the discriminator network. Defaults to 2.
discriminator_hidden (int) – Number of hidden units in each discriminator layer. Defaults to 1024.
learning_rate (float) – Learning rate for training the feature adaptor and discriminator networks. Defaults to 0.0001.
step (int) – Number of gradient ascent steps for anomaly synthesis. Defaults to 20.
svd (int) – Flag to enable SVD-based feature projection. Defaults to 0.
gaussian_noise_std (float) – Standard deviation of Gaussian noise added to features for global anomaly synthesis. Defaults to 0.015.
radius_quantile (float) – Quantile used to compute the truncated projection radius during gradient ascent. Defaults to 0.75.
focal_loss_quantile_threshold (float) – Quantile threshold for hard example mining in focal loss computation. When 0, all samples are used. Defaults to 0.5.
mining (bool) – Whether to perform gradient ascent or skip it. Defaults to True.
pre_processor (PreProcessor | bool) – Preprocessing module or flag to enable default preprocessing. Set to True to apply default normalization and resizing. Defaults to True.
post_processor (PostProcessor | bool) – Postprocessing module or flag to enable default output smoothing or thresholding. Defaults to True.
evaluator (Evaluator | bool) – Evaluation module for calculating metrics such as AUROC and PRO. Defaults to True.
visualizer (Visualizer | bool) – Visualization module to generate heatmaps, segmentation overlays, and anomaly scores. Defaults to True.
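To make the role of `focal_loss_quantile_threshold` more concrete, the following sketch shows one way quantile-based hard example mining could work. It is an assumption for illustration only: the helper `mine_hard_examples` is hypothetical, and it uses a simple nearest-rank quantile on a Python list, whereas a tensor-based implementation would likely use an interpolating quantile function.

```python
def mine_hard_examples(losses, quantile):
    """Keep only losses at or above the given quantile; quantile <= 0 keeps all."""
    if quantile <= 0:
        return list(losses)
    ordered = sorted(losses)
    # Nearest-rank style index for the quantile (an assumption of this sketch).
    index = min(int(quantile * len(ordered)), len(ordered) - 1)
    threshold = ordered[index]
    return [loss for loss in losses if loss >= threshold]


per_sample_losses = [0.1, 0.9, 0.4, 0.7]
hard = mine_hard_examples(per_sample_losses, 0.5)     # keeps the harder half
everything = mine_hard_examples(per_sample_losses, 0)  # threshold disabled
```

With a threshold of 0.5 only the higher-loss half of the samples contributes, which concentrates training on examples the discriminator still gets wrong.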
- static configure_evaluator()#
Configure the evaluator with validation and test metrics.
Overrides the default evaluator to include both image_AUROC and pixel_AUROC as validation metrics. The official GLASS implementation selects the best checkpoint based on image_auroc + pixel_auroc, so both must be available during validation.
- Returns:
Configured evaluator with both validation and test metrics.
- Return type:
Example
>>> evaluator = Glass.configure_evaluator()
>>> len(evaluator.val_metrics) > 0
True
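The checkpoint selection rule mentioned above (best `image_auroc + pixel_auroc`) can be sketched in a few lines. The `history` list and the `best_epoch` helper are hypothetical; they stand in for whatever validation log a training run produces.

```python
def best_epoch(history):
    """Return the index of the epoch with the highest image + pixel AUROC sum."""
    return max(
        range(len(history)),
        key=lambda i: history[i]["image_AUROC"] + history[i]["pixel_AUROC"],
    )


# Made-up validation metrics for three epochs.
history = [
    {"image_AUROC": 0.95, "pixel_AUROC": 0.90},
    {"image_AUROC": 0.97, "pixel_AUROC": 0.93},
    {"image_AUROC": 0.98, "pixel_AUROC": 0.91},
]
selected = best_epoch(history)
```

Here the second epoch is selected even though the third has the higher image-level score, which is why both metrics need to be computed during validation.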
- configure_optimizers()#
Configure optimizers for the discriminator, projection, and backbone.
Returns all active optimizers in a fixed order: discriminator first, then projection (if pre_projection > 0), then backbone (if not pre-trained). This ordering is critical when resuming from a checkpoint.
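The ordering rule can be sketched with plain strings standing in for optimizer objects. The function `optimizer_order` is hypothetical and only mirrors the conditions stated above. A stable order matters because optimizer states are typically saved and restored as a list, so a state is matched to an optimizer by its position.

```python
def optimizer_order(pre_projection, pre_trained):
    """Return optimizer names in the fixed order described above."""
    names = ["discriminator"]          # always present, always first
    if pre_projection > 0:
        names.append("projection")     # only when a feature adaptor exists
    if not pre_trained:
        names.append("backbone")       # only when the backbone is trained
    return names


default_order = optimizer_order(pre_projection=1, pre_trained=True)
no_projection = optimizer_order(pre_projection=0, pre_trained=False)
```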
- classmethod configure_pre_processor(image_size=None, center_crop_size=None)#
Configure the default pre-processor for GLASS.
If a valid center_crop_size is provided, the pre-processor also performs center cropping, as described in the paper.
- Parameters:
- Returns:
Configured pre-processor instance.
- Return type:
- Raises:
ValueError – If at least one dimension of center_crop_size is larger than the corresponding image_size dimension.
Example
>>> pre_processor = Glass.configure_pre_processor(
...     image_size=(288, 288)
... )
>>> transformed_image = pre_processor(image)
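The size check behind the documented ValueError can be sketched as follows. The helper `validate_center_crop` is hypothetical and is not part of the library; it only illustrates the per-dimension comparison between `center_crop_size` and `image_size`.

```python
def validate_center_crop(image_size, center_crop_size):
    """Raise ValueError if the crop exceeds the image in any dimension."""
    if image_size is None or center_crop_size is None:
        return center_crop_size  # nothing to compare against
    if any(crop > size for crop, size in zip(center_crop_size, image_size)):
        msg = f"center_crop_size {center_crop_size} exceeds image_size {image_size}"
        raise ValueError(msg)
    return center_crop_size


accepted = validate_center_crop((288, 288), (256, 256))

try:
    validate_center_crop((288, 288), (320, 256))  # height is too large
    rejected = False
except ValueError:
    rejected = True
```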
- property learning_type: LearningType#
Return the learning type of the model.
- Returns:
Learning type (ONE_CLASS for GLASS)
- Return type:
LearningType
- on_train_epoch_start()#
Initialize the model by computing the mean feature representation across the training dataset.
This method is called at the start of training and computes a mean feature vector that serves as a reference point for the normal class distribution.
- Return type:
- training_step(batch, batch_idx)#
Training step for GLASS model.
- validation_step(batch, batch_idx)#
Performs a single validation step during model evaluation.
- Parameters:
- Returns:
Output of the validation step, usually containing predictions and any associated metrics.
- Return type:
- class anomalib.models.image.glass.torch_model.GlassModel(input_shape=(288, 288), anomaly_source_path=None, pretrain_embed_dim=1536, target_embed_dim=1536, backbone='wide_resnet50_2', patchsize=3, patchstride=1, pre_trained=True, layers=None, pre_projection=1, discriminator_layers=2, discriminator_hidden=1024, step=20, svd=0, gaussian_noise_std=0.015, radius_quantile=0.75, focal_loss_quantile_threshold=0.5, mining=True, normalize_mean=None, normalize_std=None)#
Bases: Module
PyTorch implementation of the GLASS model.
- calculate_anomaly_scores(images)#
Calculates anomaly scores and segmentation masks for input images.
- calculate_center(dataloader, device)#
Calculates and updates the center embedding from a dataset.
This method runs the model in evaluation mode and computes the mean feature representation (center) across the entire dataset. The center is used for further downstream tasks such as anomaly detection or feature normalization.
- Parameters:
dataloader (DataLoader) – A PyTorch DataLoader providing batches of data, where each batch contains an image attribute.
device (device) – The device on which tensors should be processed (e.g., torch.device("cuda") or torch.device("cpu")).
- Returns:
The method updates self.center in-place with the computed center tensor.
- Return type:
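The idea of a dataset-wide center can be sketched without tensors. This is a simplified assumption, not the library's code: `mean_feature` is a hypothetical helper that averages every feature vector seen across all batches, using nested lists in place of a DataLoader and feature extractor.

```python
def mean_feature(batches):
    """Average all feature vectors across all batches into one center vector."""
    total = None
    count = 0
    for batch in batches:
        for feature in batch:
            if total is None:
                total = [0.0] * len(feature)
            total = [t + f for t, f in zip(total, feature)]
            count += 1
    if count == 0:
        raise ValueError("cannot compute a center from an empty dataset")
    return [t / count for t in total]


# Two batches of 2-dimensional features (made-up values).
batches = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0]]]
center = mean_feature(batches)
```

Accumulating a running sum and count gives every sample equal weight even when the last batch is smaller than the others.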
- calculate_features(img, aug, evaluation=False)#
Calculate and return feature embeddings for the input and augmented images.
Depending on whether a pre-projection module is used, this method optionally applies it to the embeddings before returning them.
- Parameters:
- Returns:
A tuple containing the feature embeddings for the original image (true_feats), the augmented image (fake_feats), and the patch grid shapes from the first feature level.
- Return type:
- forward(img)#
Forward pass for training and inference.
During training, synthesizes global and local anomalies and computes the combined loss (BCE + focal). During inference, skips augmentation entirely and directly computes anomaly scores and segmentation masks.
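The two loss terms named above can be written out for scalar probabilities. The formulas for binary cross-entropy and focal loss are standard; how this sketch assigns them to normal, global, and local samples, the `gamma` value, and the toy probabilities are assumptions made for illustration.

```python
import math


def bce_loss(prob, label):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def focal_loss(prob, label, gamma=2.0):
    """Focal loss: BCE scaled by (1 - p_t)^gamma so easy samples count less."""
    p_t = prob if label == 1 else 1.0 - prob
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Toy discriminator outputs: probability that a patch is anomalous.
normal_probs = [0.1, 0.2]   # normal features, label 0
global_probs = [0.8, 0.6]   # globally synthesized features, label 1
local_probs = [0.9, 0.3]    # locally synthesized features
local_labels = [1, 0]       # labels taken from the synthetic mask

bce_term = (
    sum(bce_loss(p, 0) for p in normal_probs) / len(normal_probs)
    + sum(bce_loss(p, 1) for p in global_probs) / len(global_probs)
)
focal_term = sum(
    focal_loss(p, y) for p, y in zip(local_probs, local_labels)
) / len(local_probs)
total_loss = bce_term + focal_term
```

Setting `gamma=0` reduces the focal loss to plain binary cross-entropy, which makes the down-weighting of easy samples the only difference between the two terms.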
- generate_embeddings(images, evaluation=False)#
Generates patch-wise feature embeddings for a batch of input images.
Extracts multi-scale features, patchifies them, aligns spatial sizes via bilinear interpolation, then preprocesses and aggregates into a single embedding tensor.
- Parameters:
- Returns:
Patch-level embeddings of shape (B*N, D) where N is patches per image.
List of (height, width) patch counts per feature level.
- Return type:
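The shape bookkeeping for patch embeddings can be checked with simple arithmetic. The sketch below assumes a padding of `(patchsize - 1) // 2` and feature maps at strides 8 and 16 of a 288x288 input (giving 36x36 and 18x18); both are assumptions for illustration, and `patch_grid` is a hypothetical helper.

```python
def sliding_window_count(size, kernel, stride, padding):
    """Number of window positions along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def patch_grid(feature_hw, patchsize, patchstride):
    """(height, width) patch counts for one feature level."""
    padding = (patchsize - 1) // 2  # assumed "same"-style padding
    return tuple(
        sliding_window_count(s, patchsize, patchstride, padding)
        for s in feature_hw
    )


# Assumed feature-map sizes for a 288x288 input at strides 8 and 16.
levels = [(36, 36), (18, 18)]
grids = [patch_grid(hw, patchsize=3, patchstride=1) for hw in levels]

batch_size = 4
patches_per_image = grids[0][0] * grids[0][1]   # N, taken from the first level
embedding_rows = batch_size * patches_per_image  # B * N rows in the output
```

With the default `patchsize=3` and `patchstride=1`, the patch grid matches the feature-map size, so coarser levels have to be interpolated up to the first level's grid before aggregation.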