DSR#
This is the implementation of the DSR paper.
Model Type: Segmentation
Description#
DSR is a quantized-feature-based algorithm that consists of an autoencoder with one encoder and two decoders, coupled with an anomaly detection module. DSR learns a codebook of quantized representations on ImageNet, which are then used to encode input images. Near-in-distribution anomalies are sampled from these same quantized representations, so no external anomaly dataset is required. Training takes place in three phases. In the first phase, the encoder and “general object decoder”, as well as the codebook, are pretrained on ImageNet. In the second phase, defects are generated at the feature level using the codebook on the quantized representations, and are used to train the object-specific decoder as well as the anomaly detection module. In the final phase, the upsampling module is trained on simulated image-level smudges in order to output more robust anomaly maps.
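To make the codebook idea concrete, here is a minimal, pure-Python sketch of vector quantization: each feature vector is replaced by its nearest codebook entry. The function name `quantize`, the toy codebook and the use of squared Euclidean distance are illustrative assumptions, not the library's implementation.

```python
def quantize(features, codebook):
    """Map each feature vector to its nearest codebook entry (squared L2 distance)."""
    indices = []
    quantized = []
    for vec in features:
        # Squared Euclidean distance from this vector to every codebook entry.
        dists = [sum((a - b) ** 2 for a, b in zip(vec, entry)) for entry in codebook]
        best = min(range(len(codebook)), key=dists.__getitem__)
        indices.append(best)
        quantized.append(list(codebook[best]))
    return indices, quantized


# Toy codebook with three 2-D entries; the documented defaults are
# 4096 entries of dimension 128.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
features = [[0.1, -0.2], [0.9, 1.3], [-0.8, 1.7]]
indices, quantized = quantize(features, codebook)
```

Each input vector ends up as an exact copy of one codebook entry, which is what allows anomalies to be simulated later by swapping codebook entries.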
Architecture#
PyTorch model for the DSR model implementation.
This module implements the PyTorch model for DSR (Dual Subspace Re-Projection). DSR is an anomaly detection model that uses a discrete latent model, image reconstruction network, subspace restriction modules, anomaly detection module and upsampling module to detect anomalies in images.
The model works by:
1. Encoding input images into quantized feature maps
2. Reconstructing images using a general appearance decoder
3. Detecting anomalies by comparing reconstructed and original images
Example
>>> import torch
>>> from anomalib.models.image.dsr.torch_model import DsrModel
>>> model = DsrModel()
>>> input_tensor = torch.randn(32, 3, 256, 256)
>>> output = model(input_tensor)
>>> output["anomaly_map"].shape
torch.Size([32, 256, 256])
Notes
The model implementation is based on the original DSR paper and code. Original code: VitjanZ/DSR_anomaly_detection
References
Original paper: https://arxiv.org/abs/2012.12436
- class anomalib.models.image.dsr.torch_model.AnomalyDetectionModule(in_channels, out_channels, base_width)#
Bases: Module
Anomaly detection module.
Module that detects the presence of an anomaly by comparing two images reconstructed by the object-specific decoder and the general object decoder.
- Parameters:
- forward(batch_real, batch_anomaly)#
Computes the anomaly map over corresponding real and anomalous images.
- class anomalib.models.image.dsr.torch_model.DecoderBot(in_channels, num_hiddens, num_residual_layers, num_residual_hiddens)#
Bases: Module
General appearance decoder module to reconstruct images while keeping possible anomalies.
- Parameters:
- class anomalib.models.image.dsr.torch_model.DiscreteLatentModel(num_hiddens, num_residual_layers, num_residual_hiddens, num_embeddings, embedding_dim)#
Bases: Module
Discrete Latent Model.
Autoencoder quantized model that encodes the input images into quantized feature maps and generates a reconstructed image using the general appearance decoder.
- Parameters:
- forward(batch, anomaly_mask=None, anom_str_lo=None, anom_str_hi=None)#
Generate quantized feature maps.
Generates quantized feature maps of batch of input images as well as their reconstruction based on the general appearance decoder.
- Parameters:
- Returns:
- If generating an anomaly mask:
General object decoder-decoded anomalous image
Reshaped ground truth anomaly map
Non defective quantized lo feature
Non defective quantized hi feature
Non quantized subspace encoded defective lo feature
Non quantized subspace encoded defective hi feature
- Else:
General object decoder-decoded image
Quantized lo feature
Quantized hi feature
- Return type:
- static generate_fake_anomalies_joined(features, embeddings, memory_torch_original, mask, strength)#
Generate quantized anomalies.
- Parameters:
- Returns:
Anomalous embedding.
- Return type:
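The following pure-Python sketch illustrates one way feature-level anomalies could be sampled from a codebook: masked feature vectors are swapped for a nearby but different codebook entry. The helper names, the list-based data layout and the reading of `strength` as the fraction of the distance-ranked codebook to sample from are all assumptions made for illustration; they do not reproduce the signature or logic of `generate_fake_anomalies_joined`.

```python
import random


def nearest_ranked(vec, codebook):
    """Return codebook indices sorted from nearest to farthest from vec."""
    def sq_dist(entry):
        return sum((a - b) ** 2 for a, b in zip(vec, entry))
    return sorted(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def make_fake_anomalies(features, codebook, mask, strength, rng):
    """Replace masked feature vectors with a nearby, but different, codebook entry.

    Assumption: `strength` in (0, 1] is the fraction of the distance-ranked
    codebook from which replacements are drawn, so smaller values keep the
    replacement closer to the original vector.
    """
    pool_size = max(1, int(strength * (len(codebook) - 1)))
    out = []
    for vec, is_anomalous in zip(features, mask):
        if not is_anomalous:
            out.append(list(vec))  # untouched positions are copied as-is
            continue
        ranked = nearest_ranked(vec, codebook)
        # Skip rank 0 (the nearest entry) so the feature actually changes.
        choice = rng.choice(ranked[1:1 + pool_size])
        out.append(list(codebook[choice]))
    return out


rng = random.Random(0)
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
mask = [False, True, False]
anomalous = make_fake_anomalies(features, codebook, mask, strength=0.4, rng=rng)
```

Because replacements come from the same codebook, the simulated defects stay close to the distribution of normal features, which matches the "near-in-distribution" description above.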
- property vq_vae_bot: VectorQuantizer#
Return self._vq_vae_bot.
- property vq_vae_top: VectorQuantizer#
Return self._vq_vae_top.
- class anomalib.models.image.dsr.torch_model.DsrModel(latent_anomaly_strength=0.2, embedding_dim=128, num_embeddings=4096, num_hiddens=128, num_residual_layers=2, num_residual_hiddens=64)#
Bases: Module
DSR PyTorch model.
Consists of the discrete latent model, image reconstruction network, subspace restriction modules, anomaly detection module and upsampling module.
- Parameters:
embedding_dim (int) – Dimension of codebook embeddings. Defaults to 128.
num_embeddings (int) – Number of embeddings in codebook. Defaults to 4096.
latent_anomaly_strength (float) – Strength of the generated anomalies in latent space. Defaults to 0.2.
num_hiddens (int) – Number of output channels in residual layers. Defaults to 128.
num_residual_layers (int) – Number of residual layers. Defaults to 2.
num_residual_hiddens (int) – Number of intermediate channels in residual layers. Defaults to 64.
Example
>>> model = DsrModel()
>>> input_tensor = torch.randn(32, 3, 256, 256)
>>> output = model(input_tensor)
>>> output["anomaly_map"].shape
torch.Size([32, 256, 256])
- forward(batch, anomaly_map_to_generate=None)#
Forward pass through the model.
- Parameters:
- Returns:
Output depends on mode:
- If testing:
anomaly_map: Upsampled anomaly map
pred_score: Image anomaly score
- If training phase 2:
recon_feat_hi: Reconstructed non-quantized hi features (F~_hi)
recon_feat_lo: Reconstructed non-quantized lo features (F~_lo)
embedding_bot: Quantized features of non defective img (Q_hi)
embedding_top: Quantized features of non defective img (Q_lo)
obj_spec_image: Object-specific-decoded image (I_spc)
anomaly_map: Predicted segmentation mask (M)
true_mask: Resized ground-truth anomaly map (M_gt)
- If training phase 3:
anomaly_map: Reconstructed anomaly map
- Return type:
dict[str, Tensor] | InferenceBatch
- Raises:
RuntimeError – If anomaly_map_to_generate is provided when not in training mode.
Example
>>> model = DsrModel()
>>> input_tensor = torch.randn(32, 3, 256, 256)
>>> output = model(input_tensor)
>>> output["anomaly_map"].shape
torch.Size([32, 256, 256])
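The mode-dependent return value described above can be summarized as a small dispatch table. The sketch below is a hypothetical helper, `expected_output_keys`, and not part of anomalib. It assumes that passing `anomaly_map_to_generate` while training corresponds to phase 2, and that training without it corresponds to phase 3.

```python
def expected_output_keys(training, anomaly_map_to_generate=None):
    """Return the output keys listed in the documentation for each mode."""
    if anomaly_map_to_generate is not None:
        if not training:
            # Mirrors the documented RuntimeError condition.
            raise RuntimeError("anomaly_map_to_generate requires training mode")
        # Assumption: a supplied mask indicates training phase 2.
        return [
            "recon_feat_hi", "recon_feat_lo",
            "embedding_bot", "embedding_top",
            "obj_spec_image", "anomaly_map", "true_mask",
        ]
    if training:
        # Assumption: training without a mask corresponds to phase 3.
        return ["anomaly_map"]
    # Testing / inference.
    return ["anomaly_map", "pred_score"]


test_keys = expected_output_keys(training=False)
phase3_keys = expected_output_keys(training=True)
phase2_keys = expected_output_keys(training=True, anomaly_map_to_generate=[[0, 1]])
```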
- class anomalib.models.image.dsr.torch_model.EncoderBot(in_channels, num_hiddens, num_residual_layers, num_residual_hiddens)#
Bases: Module
Encoder module for bottom quantized feature maps.
- Parameters:
- class anomalib.models.image.dsr.torch_model.EncoderTop(in_channels, num_hiddens, num_residual_layers, num_residual_hiddens)#
Bases: Module
Encoder module for top quantized feature maps.
- Parameters:
- class anomalib.models.image.dsr.torch_model.FeatureDecoder(base_width, out_channels=1)#
Bases: Module
Feature decoder for the subspace restriction network.
- Parameters:
- forward(_, __, b3)#
Decode a batch of latent features to a non-quantized representation.
- Parameters:
_ (torch.Tensor) – Top latent feature layer.
__ (torch.Tensor) – Middle latent feature layer.
b3 (Tensor) – Bottom latent feature layer.
- Return type:
- Returns:
Decoded non-quantized representation.
- class anomalib.models.image.dsr.torch_model.FeatureEncoder(in_channels, base_width)#
Bases: Module
Feature encoder for the subspace restriction network.
- Parameters:
- class anomalib.models.image.dsr.torch_model.ImageReconstructionNetwork(in_channels, num_hiddens, num_residual_layers, num_residual_hiddens)#
Bases: Module
Image Reconstruction Network.
Image reconstruction network that reconstructs the image from a quantized representation.
- Parameters:
- class anomalib.models.image.dsr.torch_model.Residual(in_channels, out_channels, num_residual_hiddens)#
Bases: Module
Residual layer.
- Parameters:
- class anomalib.models.image.dsr.torch_model.ResidualStack(in_channels, num_hiddens, num_residual_layers, num_residual_hiddens)#
Bases: Module
Stack of residual layers.
- Parameters:
- class anomalib.models.image.dsr.torch_model.SubspaceRestrictionModule(base_width)#
Bases: Module
Subspace Restriction Module.
Subspace restriction module that restricts the appearance subspace into configurations that agree with normal appearances and applies quantization.
- Parameters:
base_width (int) – Base dimensionality of the layers of the autoencoder.
- forward(batch, quantization)#
Generate the quantized anomaly-free representation of an anomalous image.
- class anomalib.models.image.dsr.torch_model.SubspaceRestrictionNetwork(in_channels=64, out_channels=64, base_width=64)#
Bases: Module
Subspace Restriction Network.
Subspace restriction network that reconstructs the input image into a non-quantized configuration that agrees with normal appearances.
- Parameters:
- forward(batch)#
Reconstruct non-quantized representation from batch.
Generate non-quantized feature maps from potentially anomalous images, to be quantized into non-anomalous quantized representations.
- class anomalib.models.image.dsr.torch_model.UnetDecoder(base_width, out_channels=1)#
Bases: Module
Decoder of the Unet network.
- Parameters:
- forward(b1, b2, b3, b4)#
Decodes latent representations into an image.
- class anomalib.models.image.dsr.torch_model.UnetEncoder(in_channels, base_width)#
Bases: Module
Encoder of the Unet network.
- Parameters:
- class anomalib.models.image.dsr.torch_model.UnetModel(in_channels=64, out_channels=64, base_width=64)#
Bases: Module
Autoencoder model that reconstructs the input image.
- Parameters:
- class anomalib.models.image.dsr.torch_model.UpsamplingModule(in_channels=8, out_channels=2, base_width=64)#
Bases: Module
Module that upsamples the generated anomaly mask to full resolution.
- Parameters:
- forward(batch_real, batch_anomaly, batch_segmentation_map)#
Computes upsampled segmentation maps.
- class anomalib.models.image.dsr.torch_model.VectorQuantizer(num_embeddings, embedding_dim)#
Bases: Module
Module that quantizes a given feature map using learned quantization codebooks.
- Parameters:
DSR - A Dual Subspace Re-Projection Network for Surface Anomaly Detection.
This module implements the DSR model for surface anomaly detection. DSR uses a dual subspace re-projection approach to detect anomalies by comparing input images with their reconstructions in two different subspaces.
The model consists of three training phases:
1. A discrete model pre-training phase (using pre-trained weights)
2. Training of the main reconstruction and anomaly detection modules
3. Training of the upsampling module
Paper: https://link.springer.com/chapter/10.1007/978-3-031-19821-2_31
Example
>>> from anomalib.models.image import Dsr
>>> model = Dsr(
... latent_anomaly_strength=0.2,
... upsampling_train_ratio=0.7
... )
The model can be used with any of the supported datasets and task modes in anomalib.
Notes
The model requires pre-trained weights for the discrete model which are downloaded automatically during training.
See also
anomalib.models.image.dsr.torch_model.DsrModel: PyTorch implementation of the DSR model architecture.
- class anomalib.models.image.dsr.lightning_model.Dsr(latent_anomaly_strength=0.2, upsampling_train_ratio=0.7, pre_processor=True, post_processor=True, evaluator=True, visualizer=True)#
Bases: AnomalibModule
DSR: A Dual Subspace Re-Projection Network for Surface Anomaly Detection.
The model uses a dual subspace approach with three training phases:
1. Pre-trained discrete model (loaded from weights)
2. Training of reconstruction and anomaly detection modules
3. Training of the upsampling module for final anomaly map generation
- Parameters:
latent_anomaly_strength (float) – Strength of the generated anomalies in the latent space. Defaults to 0.2.
upsampling_train_ratio (float) – Ratio of training steps for the upsampling module. Defaults to 0.7.
pre_processor (PreProcessor | bool) – Pre-processor instance or flag to use default. Defaults to True.
post_processor (PostProcessor | bool) – Post-processor instance or flag to use default. Defaults to True.
evaluator (Evaluator | bool) – Evaluator instance or flag to use default. Defaults to True.
visualizer (Visualizer | bool) – Visualizer instance or flag to use default. Defaults to True.
Example
>>> from anomalib.models.image import Dsr
>>> model = Dsr(
...     latent_anomaly_strength=0.2,
...     upsampling_train_ratio=0.7
... )
>>> model.trainer_arguments
{'num_sanity_val_steps': 0}
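As a rough illustration of how a ratio such as `upsampling_train_ratio` can split a training run into two trained phases, consider the sketch below. It assumes the ratio marks the fraction of epochs that elapse before upsampling-module training begins. The parameter description could also be read the other way round (the fraction devoted to upsampling), so treat this interpretation and both helper names as hypothetical and check the source before relying on them.

```python
def upsampling_start_epoch(max_epochs, upsampling_train_ratio=0.7):
    """Epoch index at which upsampling training starts (assumed interpretation)."""
    if not 0.0 <= upsampling_train_ratio <= 1.0:
        raise ValueError("upsampling_train_ratio must lie in [0, 1]")
    return int(max_epochs * upsampling_train_ratio)


def phase_for_epoch(epoch, max_epochs, upsampling_train_ratio=0.7):
    """Return 2 or 3, using the overall phase numbering of this page."""
    start = upsampling_start_epoch(max_epochs, upsampling_train_ratio)
    return 3 if epoch >= start else 2


# With an even split over 100 epochs, epochs 0..49 train the reconstruction and
# anomaly detection modules, and epochs 50..99 train the upsampling module.
boundary = upsampling_start_epoch(100, 0.5)
phases = [phase_for_epoch(e, 100, 0.5) for e in (0, 49, 50, 99)]
```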
- configure_optimizers()#
Configure the Adam optimizer for training phases 2 and 3.
Does not train the discrete model (phase 1)
- Returns:
Dictionary containing optimizers and schedulers.
- Return type:
Union[Optimizer,Sequence[Optimizer],tuple[Sequence[Optimizer],Sequence[Union[LRScheduler,ReduceLROnPlateau,LRSchedulerConfig]]],OptimizerConfig,OptimizerLRSchedulerConfig,Sequence[OptimizerConfig],Sequence[OptimizerLRSchedulerConfig],None]
Example
>>> model = Dsr()
>>> optimizers = model.configure_optimizers()
>>> isinstance(optimizers, tuple)
True
>>> len(optimizers)
2
- classmethod configure_pre_processor(image_size=None)#
Configure default pre-processor for DSR.
Note
ImageNet normalization is not used in this model.
- property learning_type: LearningType#
Return the learning type of the model.
- Returns:
Learning type of the model.
- Return type:
LearningType
Example
>>> model = Dsr()
>>> model.learning_type
<LearningType.ONE_CLASS: 'one_class'>
- on_train_epoch_start()#
Display a message when starting to train the upsampling module.
- Return type:
- on_train_start()#
Set up model before training begins.
Performs the following steps:
1. Validates that pre_processor uses no normalization
2. Loads pretrained weights of the discrete model
- Raises:
ValueError – If transforms contain normalization.
- Return type:
- static prepare_pretrained_model()#
Download pre-trained models if they don’t exist.
- Returns:
Path to the downloaded pre-trained model weights.
- Return type:
Example
>>> model = Dsr()
>>> weights_path = model.prepare_pretrained_model()
>>> weights_path.name
'vq_model_pretrained_128_4096.pckl'
- property trainer_arguments: dict[str, Any]#
Required trainer arguments.
Example
>>> model = Dsr()
>>> model.trainer_arguments
{'num_sanity_val_steps': 0}
- training_step(batch)#
Training Step of DSR.
During the first trained phase (phase 2 overall), feeds the original image and a simulated anomaly mask. During the second trained phase (phase 3 overall), feeds a generated anomalous image to train the upsampling module.
- Parameters:
batch (Batch) – Input batch containing image, label and mask
- Returns:
Dictionary containing the loss value
- Return type:
Example
>>> from anomalib.data import Batch
>>> model = Dsr()
>>> batch = Batch(
...     image=torch.randn(8, 3, 256, 256),
...     label=torch.zeros(8)
... )
>>> output = model.training_step(batch)
>>> isinstance(output, dict)
True
>>> "loss" in output
True
- validation_step(batch, *args, **kwargs)#
Validation step of DSR.
The Softmax predictions of the anomalous class are used as anomaly map.
- Parameters:
batch (Batch) – Input batch containing image, label and mask
*args – Additional positional arguments (unused)
**kwargs – Additional keyword arguments (unused)
- Returns:
Dictionary containing predictions and batch information
- Return type:
Example
>>> from anomalib.data import Batch
>>> model = Dsr()
>>> batch = Batch(
...     image=torch.randn(8, 3, 256, 256),
...     label=torch.zeros(8)
... )
>>> output = model.validation_step(batch)
>>> isinstance(output, Batch)
True
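The statement that the softmax prediction of the anomalous class serves as the anomaly map can be illustrated with a small pure-Python sketch. It assumes a two-channel output (normal and anomalous logits per pixel). The function name and the use of the map maximum as an image-level score are illustrative assumptions, not necessarily how the library computes `pred_score`.

```python
import math


def anomaly_map_from_logits(normal_logits, anomalous_logits):
    """Per-pixel two-class softmax; keep the anomalous-class probability."""
    out = []
    for row_n, row_a in zip(normal_logits, anomalous_logits):
        out_row = []
        for ln, la in zip(row_n, row_a):
            m = max(ln, la)  # subtract the max for numerical stability
            en, ea = math.exp(ln - m), math.exp(la - m)
            out_row.append(ea / (en + ea))
        out.append(out_row)
    return out


# 2x2 toy "image": one logit map per class.
normal = [[2.0, 0.0], [0.0, -1.0]]
anomalous = [[0.0, 0.0], [3.0, 1.0]]
amap = anomaly_map_from_logits(normal, anomalous)

# Assumption: use the highest pixel probability as an image-level score.
pred_score = max(max(row) for row in amap)
```

Pixels where both logits are equal get a probability of 0.5, and the probability grows as the anomalous logit dominates.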
Anomaly generator for the DSR model implementation.
This module implements an anomaly generator that creates synthetic anomalies using Perlin noise. The generator is used during the second phase of DSR model training to create anomalous samples.
Example
>>> from anomalib.models.image.dsr.anomaly_generator import DsrAnomalyGenerator
>>> generator = DsrAnomalyGenerator(p_anomalous=0.5)
>>> batch = torch.randn(8, 3, 256, 256)
>>> masks = generator.augment_batch(batch)
- class anomalib.models.image.dsr.anomaly_generator.DsrAnomalyGenerator(p_anomalous=0.5)#
Bases: Module
Anomaly generator for the DSR model.
The generator creates synthetic anomalies by applying Perlin noise to images. It is used during the second phase of DSR model training. The third phase uses a different approach with smudge-based anomalies.
- Parameters:
p_anomalous (float) – Probability of generating an anomalous image. Defaults to 0.5.
Example
>>> generator = DsrAnomalyGenerator(p_anomalous=0.7)
>>> batch = torch.randn(4, 3, 256, 256)
>>> masks = generator.augment_batch(batch)
>>> assert masks.shape == (4, 1, 256, 256)
- augment_batch(batch)#
Generate anomalous masks for a batch of images.
- Parameters:
batch (Tensor) – Input batch of images of shape (batch_size, channels, height, width).
- Returns:
Batch of binary masks of shape (batch_size, 1, height, width) where 1 indicates anomalous regions.
- Return type:
Example
>>> generator = DsrAnomalyGenerator()
>>> batch = torch.randn(8, 3, 256, 256)
>>> masks = generator.augment_batch(batch)
>>> assert masks.shape == (8, 1, 256, 256)
>>> assert torch.all((masks >= 0) & (masks <= 1))
- generate_anomaly(height, width, device=None)#
Generate an anomalous mask using Perlin noise.
- Parameters:
- Returns:
Binary mask of shape (1, height, width) where 1 indicates anomalous regions.
- Return type:
Example
>>> generator = DsrAnomalyGenerator()
>>> mask = generator.generate_anomaly(256, 256)
>>> assert mask.shape == (1, 256, 256)
>>> assert torch.all((mask >= 0) & (mask <= 1))
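The general recipe of thresholding smooth noise into a binary mask can be sketched without any dependencies. Note that the sketch below substitutes simple bilinear value noise for true Perlin noise, returns a 2-D nested list rather than a (1, height, width) tensor, and uses hypothetical names and parameters (`cells`, `threshold`). It illustrates the idea only and is not the generator's implementation.

```python
import random


def smooth_noise(height, width, cells, rng):
    """Bilinearly interpolate a coarse random grid up to (height, width)."""
    grid = [[rng.random() for _ in range(cells + 1)] for _ in range(cells + 1)]
    noise = []
    for y in range(height):
        gy = y * cells / height
        y0 = int(gy)
        ty = gy - y0
        row = []
        for x in range(width):
            gx = x * cells / width
            x0 = int(gx)
            tx = gx - x0
            top = grid[y0][x0] * (1 - tx) + grid[y0][x0 + 1] * tx
            bottom = grid[y0 + 1][x0] * (1 - tx) + grid[y0 + 1][x0 + 1] * tx
            row.append(top * (1 - ty) + bottom * ty)
        noise.append(row)
    return noise


def generate_mask(height, width, p_anomalous=0.5, threshold=0.5, cells=4, rng=None):
    """Return a binary mask; with probability 1 - p_anomalous it is all zeros."""
    rng = rng or random.Random()
    if rng.random() >= p_anomalous:
        return [[0] * width for _ in range(height)]
    noise = smooth_noise(height, width, cells, rng)
    # Threshold the smooth noise so that blob-like regions become anomalous.
    return [[1 if v > threshold else 0 for v in row] for row in noise]


mask = generate_mask(16, 16, p_anomalous=1.0, rng=random.Random(0))
empty = generate_mask(8, 8, p_anomalous=0.0, rng=random.Random(0))
```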
Loss functions for the DSR model implementation.
This module contains the loss functions used in the second and third training phases of the DSR model.
Example
>>> from anomalib.models.image.dsr.loss import DsrSecondStageLoss
>>> loss_fn = DsrSecondStageLoss()
>>> loss = loss_fn(
... recon_nq_hi=recon_nq_hi,
... recon_nq_lo=recon_nq_lo,
... qu_hi=qu_hi,
... qu_lo=qu_lo,
... input_image=input_image,
... gen_img=gen_img,
... seg=seg,
... anomaly_mask=anomaly_mask
... )
- class anomalib.models.image.dsr.loss.DsrSecondStageLoss#
Bases: Module
Loss function for the second training phase of the DSR model.
- The total loss is a combination of:
MSE loss between non-anomalous quantized input image and anomalous subspace-reconstructed non-quantized input (hi and lo features)
MSE loss between input image and reconstructed image through object-specific decoder
Focal loss between computed segmentation mask and ground truth mask
Example
>>> loss_fn = DsrSecondStageLoss()
>>> loss = loss_fn(
...     recon_nq_hi=recon_nq_hi,
...     recon_nq_lo=recon_nq_lo,
...     qu_hi=qu_hi,
...     qu_lo=qu_lo,
...     input_image=input_image,
...     gen_img=gen_img,
...     seg=seg,
...     anomaly_mask=anomaly_mask
... )
- forward(recon_nq_hi, recon_nq_lo, qu_hi, qu_lo, input_image, gen_img, seg, anomaly_mask)#
Compute the combined loss over a batch.
- Parameters:
recon_nq_hi (Tensor) – Reconstructed non-quantized hi feature
recon_nq_lo (Tensor) – Reconstructed non-quantized lo feature
qu_hi (Tensor) – Non-defective quantized hi feature
qu_lo (Tensor) – Non-defective quantized lo feature
input_image (Tensor) – Original input image
gen_img (Tensor) – Object-specific decoded image
seg (Tensor) – Computed anomaly segmentation map
anomaly_mask (Tensor) – Ground truth anomaly mask
- Returns:
Total combined loss value
- Return type:
Example
>>> loss_fn = DsrSecondStageLoss()
>>> loss = loss_fn(
...     recon_nq_hi=torch.randn(32, 64, 32, 32),
...     recon_nq_lo=torch.randn(32, 64, 32, 32),
...     qu_hi=torch.randn(32, 64, 32, 32),
...     qu_lo=torch.randn(32, 64, 32, 32),
...     input_image=torch.randn(32, 3, 256, 256),
...     gen_img=torch.randn(32, 3, 256, 256),
...     seg=torch.randn(32, 2, 256, 256),
...     anomaly_mask=torch.randint(0, 2, (32, 1, 256, 256))
... )
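To show how the three documented terms could combine, here is a dependency-free sketch operating on flat lists of numbers. It assumes an unweighted sum of the terms, a binary focal loss with gamma = 2 applied to anomalous-class probabilities, and inputs that have already been flattened. The actual weighting, focal-loss variant and tensor handling in `DsrSecondStageLoss` may differ.

```python
import math


def mse(pred, target):
    """Mean squared error over two equally long flat lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def focal_loss(probs, targets, gamma=2.0, eps=1e-7):
    """Binary focal loss on anomalous-class probabilities (assumed variant)."""
    total = 0.0
    for p, t in zip(probs, targets):
        p_t = p if t == 1 else 1.0 - p  # probability assigned to the true class
        p_t = min(max(p_t, eps), 1.0)   # clamp to avoid log(0)
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(probs)


def second_stage_loss(recon_nq_hi, recon_nq_lo, qu_hi, qu_lo,
                      input_image, gen_img, seg_probs, anomaly_mask):
    """Unweighted sum of feature MSE, image MSE and focal terms (assumption)."""
    feature_loss = mse(recon_nq_hi, qu_hi) + mse(recon_nq_lo, qu_lo)
    image_loss = mse(gen_img, input_image)
    seg_loss = focal_loss(seg_probs, anomaly_mask)
    return feature_loss + image_loss + seg_loss


# A perfect reconstruction and segmentation gives zero loss.
perfect = second_stage_loss([1.0], [2.0], [1.0], [2.0],
                            [0.5], [0.5], [1.0, 0.0], [1, 0])
# An uncertain segmentation (p = 0.5) contributes a small focal penalty.
uncertain = focal_loss([0.5], [1])
```

The focal term down-weights pixels that are already classified confidently, so training focuses on the harder, misclassified regions.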
- class anomalib.models.image.dsr.loss.DsrThirdStageLoss#
Bases: Module
Loss function for the third training phase of the DSR model.
The loss consists of a focal loss between the computed segmentation mask and the ground truth mask.
Example
>>> loss_fn = DsrThirdStageLoss()
>>> loss = loss_fn(
...     pred_mask=pred_mask,
...     true_mask=true_mask
... )
- forward(pred_mask, true_mask)#
Compute the focal loss between predicted and true masks.
- Parameters:
- Returns:
Focal loss value
- Return type:
Example
>>> loss_fn = DsrThirdStageLoss()
>>> loss = loss_fn(
...     pred_mask=torch.randn(32, 2, 256, 256),
...     true_mask=torch.randint(0, 2, (32, 1, 256, 256))
... )