DSR#
This is the implementation of the DSR paper.
Model Type: Segmentation
Description#
DSR is a quantized-feature based algorithm that consists of an autoencoder with one encoder and two decoders, coupled with an anomaly detection module. DSR learns a codebook of quantized representations on ImageNet, which are then used to encode input images. These quantized representations also serve to sample near-in-distribution anomalies, since they do not rely on external datasets. Training takes place in three phases. The encoder and “general object decoder”, as well as the codebook, are pretrained on ImageNet. Defects are then generated at the feature level using the codebook on the quantized representations, and are used to train the object-specific decoder as well as the anomaly detection module. In the final phase of training, the upsampling module is trained on simulated image-level smudges in order to output more robust anomaly maps.
Architecture#

PyTorch model for the DSR model implementation.
This module implements the PyTorch model for DSR (Dual Subspace Re-Projection). DSR is an anomaly detection model that uses a discrete latent model, image reconstruction network, subspace restriction modules, anomaly detection module and upsampling module to detect anomalies in images.
The model works by:
1. Encoding input images into quantized feature maps
2. Reconstructing images using a general appearance decoder
3. Detecting anomalies by comparing reconstructed and original images
Example
>>> import torch
>>> from anomalib.models.image.dsr.torch_model import DsrModel
>>> model = DsrModel()
>>> input_tensor = torch.randn(32, 3, 256, 256)
>>> output = model(input_tensor)
>>> output["anomaly_map"].shape
torch.Size([32, 256, 256])
Notes
The model implementation is based on the original DSR paper and code. Original code: VitjanZ/DSR_anomaly_detection
References
Original paper: https://link.springer.com/chapter/10.1007/978-3-031-19821-2_31
- class anomalib.models.image.dsr.torch_model.AnomalyDetectionModule(in_channels, out_channels, base_width)#
Bases:
Module
Anomaly detection module.
Module that detects the presence of an anomaly by comparing two images reconstructed by the object-specific decoder and the general object decoder.
- Parameters:
- forward(batch_real, batch_anomaly)#
Computes the anomaly map over corresponding real and anomalous images.
- Parameters:
batch_real (torch.Tensor) – Batch of real, non-defective images.
batch_anomaly (torch.Tensor) – Batch of potentially anomalous images.
- Return type:
- Returns:
The anomaly segmentation map.
- class anomalib.models.image.dsr.torch_model.DecoderBot(in_channels, num_hiddens, num_residual_layers, num_residual_hiddens)#
Bases:
Module
General appearance decoder module to reconstruct images while keeping possible anomalies.
- Parameters:
- forward(inputs)#
Decode quantized feature maps into an image.
- Parameters:
inputs (torch.Tensor) – Quantized feature maps.
- Return type:
- Returns:
Decoded image.
- class anomalib.models.image.dsr.torch_model.DiscreteLatentModel(num_hiddens, num_residual_layers, num_residual_hiddens, num_embeddings, embedding_dim)#
Bases:
Module
Discrete Latent Model.
Autoencoder quantized model that encodes the input images into quantized feature maps and generates a reconstructed image using the general appearance decoder.
- Parameters:
- forward(batch, anomaly_mask=None, anom_str_lo=None, anom_str_hi=None)#
Generate quantized feature maps.
Generates quantized feature maps of batch of input images as well as their reconstruction based on the general appearance decoder.
- Parameters:
batch (Tensor) – Batch of input images.
anomaly_mask (Tensor | None) – Anomaly mask to be used to generate anomalies on the quantized feature maps.
anom_str_lo (torch.Tensor | None) – Strength of generated anomaly lo.
anom_str_hi (torch.Tensor | None) – Strength of generated anomaly hi.
- Returns:
- If generating an anomaly mask:
General object decoder-decoded anomalous image
Reshaped ground truth anomaly map
Non defective quantized lo feature
Non defective quantized hi feature
Non quantized subspace encoded defective lo feature
Non quantized subspace encoded defective hi feature
- Else:
General object decoder-decoded image
Quantized lo feature
Quantized hi feature
- Return type:
- static generate_fake_anomalies_joined(features, embeddings, memory_torch_original, mask, strength)#
Generate quantized anomalies.
- Parameters:
features (torch.Tensor) – Features on which the anomalies will be generated.
embeddings (torch.Tensor) – Embeddings to use to generate the anomalies.
memory_torch_original (torch.Tensor) – Weight of embeddings.
mask (torch.Tensor) – Original anomaly mask.
strength (float) – Strength of generated anomaly.
- Returns:
Anomalous embedding.
- Return type:
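The idea of sampling near-in-distribution anomalies from the codebook can be illustrated without PyTorch. The sketch below is not the library's implementation: the helper names `squared_distance` and `fake_anomalies`, the plain-list data layout, and the rule that `strength` caps how far down the nearest-neighbour ranking a replacement is drawn are all assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fake_anomalies(features, codebook, mask, strength, rng=None):
    """Replace masked feature vectors with nearby, but not nearest, codebook entries.

    Hypothetical helper: `strength` in [0, 1] is assumed to bound the rank
    (by distance) of the codebook entry that may be sampled.
    """
    rng = rng or random.Random()
    out = []
    for vec, masked in zip(features, mask):
        if not masked:
            # Positions outside the anomaly mask keep their original feature.
            out.append(list(vec))
            continue
        # Rank codebook entries from nearest to farthest.
        ranked = sorted(codebook, key=lambda entry: squared_distance(vec, entry))
        # Rank 0 is the normal quantization, so sample from rank 1 upwards.
        max_rank = max(1, int(strength * (len(ranked) - 1)))
        out.append(list(ranked[rng.randint(1, max_rank)]))
    return out


# Toy data: six 2-D codebook entries and two feature vectors, the second one masked.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0], [5.0, 5.0], [-4.0, 1.0]]
features = [[0.1, 0.1], [0.9, 0.1]]
mask = [0, 1]
anomalous = fake_anomalies(features, codebook, mask, strength=0.5, rng=random.Random(0))
```

Because the replacement still comes from the learned codebook, the generated defect stays close to the training distribution rather than depending on an external anomaly dataset.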
- property vq_vae_bot: VectorQuantizer#
Return self._vq_vae_bot.
- property vq_vae_top: VectorQuantizer#
Return self._vq_vae_top.
- class anomalib.models.image.dsr.torch_model.DsrModel(latent_anomaly_strength=0.2, embedding_dim=128, num_embeddings=4096, num_hiddens=128, num_residual_layers=2, num_residual_hiddens=64)#
Bases:
Module
DSR PyTorch model.
Consists of the discrete latent model, image reconstruction network, subspace restriction modules, anomaly detection module and upsampling module.
- Parameters:
embedding_dim (int) – Dimension of codebook embeddings. Defaults to 128.
num_embeddings (int) – Number of embeddings in codebook. Defaults to 4096.
latent_anomaly_strength (float) – Strength of the generated anomalies in latent space. Defaults to 0.2.
num_hiddens (int) – Number of output channels in residual layers. Defaults to 128.
num_residual_layers (int) – Number of residual layers. Defaults to 2.
num_residual_hiddens (int) – Number of intermediate channels in residual layers. Defaults to 64.
Example
>>> model = DsrModel()
>>> input_tensor = torch.randn(32, 3, 256, 256)
>>> output = model(input_tensor)
>>> output["anomaly_map"].shape
torch.Size([32, 256, 256])
- forward(batch, anomaly_map_to_generate=None)#
Forward pass through the model.
- Parameters:
batch (torch.Tensor) – Input batch of images.
anomaly_map_to_generate (torch.Tensor | None, optional) – Anomaly map to use for generating quantized defects. Should be None if not in training phase 2. Defaults to None.
- Returns:
Output depends on mode:
- If testing:
anomaly_map: Upsampled anomaly map
pred_score: Image anomaly score
- If training phase 2:
recon_feat_hi: Reconstructed non-quantized hi features (F~_hi)
recon_feat_lo: Reconstructed non-quantized lo features (F~_lo)
embedding_bot: Quantized features of the non-defective image (Q_hi)
embedding_top: Quantized features of the non-defective image (Q_lo)
obj_spec_image: Object-specific-decoded image (I_spc)
anomaly_map: Predicted segmentation mask (M)
true_mask: Resized ground-truth anomaly map (M_gt)
- If training phase 3:
anomaly_map: Reconstructed anomaly map
- Return type:
- Raises:
RuntimeError – If anomaly_map_to_generate is provided when not in training mode.
Example
>>> model = DsrModel()
>>> input_tensor = torch.randn(32, 3, 256, 256)
>>> output = model(input_tensor)
>>> output["anomaly_map"].shape
torch.Size([32, 256, 256])
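In testing mode the forward pass returns both a pixel-level anomaly_map and an image-level pred_score. How the score is derived from the map is not stated on this page; the following dependency-free sketch shows one common reduction, assumed here purely for illustration: smooth the map with a small mean filter, then take the maximum. The function name `image_score` and the `kernel` parameter are hypothetical.

```python
def image_score(anomaly_map, kernel=3):
    """Reduce a 2-D anomaly map (list of rows) to a single image-level score.

    Assumption: the score is the maximum of a mean-filtered map, so that
    isolated noisy pixels contribute less than coherent anomalous regions.
    """
    height, width = len(anomaly_map), len(anomaly_map[0])
    radius = kernel // 2
    best = float("-inf")
    for y in range(height):
        for x in range(width):
            # Collect the neighbourhood, clipped at the image border.
            window = [
                anomaly_map[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            best = max(best, sum(window) / len(window))
    return best


# A single hot pixel versus a 3x3 anomalous blob on a 5x5 map.
single = [[0.0] * 5 for _ in range(5)]
single[2][2] = 1.0
blob = [[1.0 if 1 <= y <= 3 and 1 <= x <= 3 else 0.0 for x in range(5)] for y in range(5)]

single_score = image_score(single)
blob_score = image_score(blob)
```

With this reduction the single hot pixel scores much lower than the blob, which is the usual motivation for smoothing before taking the maximum.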
- load_pretrained_discrete_model_weights(ckpt, device=None)#
Load pre-trained model weights from checkpoint file.
- Parameters:
ckpt (Path) – Path to checkpoint file containing model weights.
device (torch.device | str | None, optional) – Device to load weights to. Defaults to None.
- Return type:
- class anomalib.models.image.dsr.torch_model.EncoderBot(in_channels, num_hiddens, num_residual_layers, num_residual_hiddens)#
Bases:
Module
Encoder module for bottom quantized feature maps.
- Parameters:
- forward(batch)#
Encode inputs to be quantized into the bottom feature map.
- Parameters:
batch (torch.Tensor) – Batch of input images.
- Return type:
- Returns:
Encoded feature maps.
- class anomalib.models.image.dsr.torch_model.EncoderTop(in_channels, num_hiddens, num_residual_layers, num_residual_hiddens)#
Bases:
Module
Encoder module for top quantized feature maps.
- Parameters:
- forward(batch)#
Encode inputs to be quantized into the top feature map.
- Parameters:
batch (torch.Tensor) – Batch of input images.
- Return type:
- Returns:
Encoded feature maps.
- class anomalib.models.image.dsr.torch_model.FeatureDecoder(base_width, out_channels=1)#
Bases:
Module
Feature decoder for the subspace restriction network.
- Parameters:
- forward(_, __, b3)#
Decode a batch of latent features to a non-quantized representation.
- Parameters:
_ (torch.Tensor) – Top latent feature layer.
__ (torch.Tensor) – Middle latent feature layer.
b3 (torch.Tensor) – Bottom latent feature layer.
- Return type:
- Returns:
Decoded non-quantized representation.
- class anomalib.models.image.dsr.torch_model.FeatureEncoder(in_channels, base_width)#
Bases:
Module
Feature encoder for the subspace restriction network.
- Parameters:
- class anomalib.models.image.dsr.torch_model.ImageReconstructionNetwork(in_channels, num_hiddens, num_residual_layers, num_residual_hiddens)#
Bases:
Module
Image Reconstruction Network.
Image reconstruction network that reconstructs the image from a quantized representation.
- Parameters:
- forward(inputs)#
Reconstructs an image from a quantized representation.
- Parameters:
inputs (torch.Tensor) – Quantized features.
- Return type:
- Returns:
Reconstructed image.
- class anomalib.models.image.dsr.torch_model.Residual(in_channels, out_channels, num_residual_hiddens)#
Bases:
Module
Residual layer.
- Parameters:
- forward(batch)#
Compute residual layer.
- Parameters:
batch (torch.Tensor) – Batch of input images.
- Return type:
- Returns:
Computed feature maps.
- class anomalib.models.image.dsr.torch_model.ResidualStack(in_channels, num_hiddens, num_residual_layers, num_residual_hiddens)#
Bases:
Module
Stack of residual layers.
- Parameters:
- forward(batch)#
Compute residual stack.
- Parameters:
batch (torch.Tensor) – Batch of input images.
- Return type:
- Returns:
Computed feature maps.
- class anomalib.models.image.dsr.torch_model.SubspaceRestrictionModule(base_width)#
Bases:
Module
Subspace Restriction Module.
Subspace restriction module that restricts the appearance subspace into configurations that agree with normal appearances and applies quantization.
- Parameters:
base_width (int) – Base dimensionality of the layers of the autoencoder.
- forward(batch, quantization)#
Generate the quantized anomaly-free representation of an anomalous image.
- Parameters:
batch (torch.Tensor) – Batch of input images.
quantization (function | object) – Quantization function.
- Return type:
- Returns:
Reconstructed batch of non-quantized features and corresponding quantized features.
- class anomalib.models.image.dsr.torch_model.SubspaceRestrictionNetwork(in_channels=64, out_channels=64, base_width=64)#
Bases:
Module
Subspace Restriction Network.
Subspace restriction network that reconstructs the input image into a non-quantized configuration that agrees with normal appearances.
- Parameters:
- forward(batch)#
Reconstruct non-quantized representation from batch.
Generate non-quantized feature maps from potentially anomalous images, to be quantized into non-anomalous quantized representations.
- Parameters:
batch (torch.Tensor) – Batch of input images.
- Return type:
- Returns:
Reconstructed non-quantized representation.
- class anomalib.models.image.dsr.torch_model.UnetDecoder(base_width, out_channels=1)#
Bases:
Module
Decoder of the Unet network.
- Parameters:
- forward(b1, b2, b3, b4)#
Decodes latent representations into an image.
- Parameters:
b1 (torch.Tensor) – First (top level) quantized feature map.
b2 (torch.Tensor) – Second quantized feature map.
b3 (torch.Tensor) – Third quantized feature map.
b4 (torch.Tensor) – Fourth (bottom level) quantized feature map.
- Return type:
- Returns:
Reconstructed image.
- class anomalib.models.image.dsr.torch_model.UnetEncoder(in_channels, base_width)#
Bases:
Module
Encoder of the Unet network.
- Parameters:
- class anomalib.models.image.dsr.torch_model.UnetModel(in_channels=64, out_channels=64, base_width=64)#
Bases:
Module
Autoencoder model that reconstructs the input image.
- Parameters:
- forward(batch)#
Reconstructs an input batch of images.
- Parameters:
batch (torch.Tensor) – Batch of input images.
- Return type:
- Returns:
Reconstructed images.
- class anomalib.models.image.dsr.torch_model.UpsamplingModule(in_channels=8, out_channels=2, base_width=64)#
Bases:
Module
Module that upsamples the generated anomaly mask to full resolution.
- Parameters:
- forward(batch_real, batch_anomaly, batch_segmentation_map)#
Computes upsampled segmentation maps.
- Parameters:
batch_real (torch.Tensor) – Batch of real, non-defective images.
batch_anomaly (torch.Tensor) – Batch of potentially anomalous images.
batch_segmentation_map (torch.Tensor) – Batch of anomaly segmentation maps.
- Return type:
- Returns:
Upsampled anomaly segmentation maps.
- class anomalib.models.image.dsr.torch_model.VectorQuantizer(num_embeddings, embedding_dim)#
Bases:
Module
Module that quantizes a given feature map using learned quantization codebooks.
- Parameters:
- forward(inputs)#
Calculates quantized feature map.
- Parameters:
inputs (torch.Tensor) – Non-quantized feature maps.
- Return type:
- Returns:
Quantized feature maps.
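As an illustration of what quantizing a feature map against a codebook means, the following sketch replaces each feature vector with its nearest codebook entry. It is a plain-Python toy, not the module's implementation: the function name `quantize`, the list-based data layout, and the use of squared Euclidean distance are assumptions, and the codebook values are made up.

```python
def quantize(vectors, codebook):
    """Replace each feature vector with its nearest codebook entry.

    Returns the quantized vectors and the indices of the chosen entries.
    """
    quantized, indices = [], []
    for vec in vectors:
        # Squared Euclidean distance from this vector to every codebook entry.
        dists = [sum((v - c) ** 2 for v, c in zip(vec, entry)) for entry in codebook]
        best = min(range(len(codebook)), key=dists.__getitem__)
        indices.append(best)
        quantized.append(list(codebook[best]))
    return quantized, indices


# Toy codebook with three 2-D embeddings (illustrative values only).
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
features = [[0.1, -0.2], [0.9, 1.2], [-0.8, 1.7]]
quantized, indices = quantize(features, codebook)
```

In the real model the same lookup would run over every spatial position of a feature map, with num_embeddings entries of size embedding_dim.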
DSR - A Dual Subspace Re-Projection Network for Surface Anomaly Detection.
This module implements the DSR model for surface anomaly detection. DSR uses a dual subspace re-projection approach to detect anomalies by comparing input images with their reconstructions in two different subspaces.
The model consists of three training phases:
1. A discrete model pre-training phase (using pre-trained weights)
2. Training of the main reconstruction and anomaly detection modules
3. Training of the upsampling module
Paper: https://link.springer.com/chapter/10.1007/978-3-031-19821-2_31
Example
>>> from anomalib.models.image import Dsr
>>> model = Dsr(
... latent_anomaly_strength=0.2,
... upsampling_train_ratio=0.7
... )
The model can be used with any of the supported datasets and task modes in anomalib.
Notes
The model requires pre-trained weights for the discrete model which are downloaded automatically during training.
See also
anomalib.models.image.dsr.torch_model.DsrModel: PyTorch implementation of the DSR model architecture.
- class anomalib.models.image.dsr.lightning_model.Dsr(latent_anomaly_strength=0.2, upsampling_train_ratio=0.7, pre_processor=True, post_processor=True, evaluator=True, visualizer=True)#
Bases:
AnomalibModule
DSR: A Dual Subspace Re-Projection Network for Surface Anomaly Detection.
The model uses a dual subspace approach with three training phases:
1. Pre-trained discrete model (loaded from weights)
2. Training of reconstruction and anomaly detection modules
3. Training of the upsampling module for final anomaly map generation
- Parameters:
latent_anomaly_strength (float, optional) – Strength of the generated anomalies in the latent space. Defaults to 0.2.
upsampling_train_ratio (float, optional) – Ratio of training steps for the upsampling module. Defaults to 0.7.
pre_processor (PreProcessor | bool, optional) – Pre-processor instance or flag to use default. Defaults to True.
post_processor (PostProcessor | bool, optional) – Post-processor instance or flag to use default. Defaults to True.
evaluator (Evaluator | bool, optional) – Evaluator instance or flag to use default. Defaults to True.
visualizer (Visualizer | bool, optional) – Visualizer instance or flag to use default. Defaults to True.
Example
>>> from anomalib.models.image import Dsr
>>> model = Dsr(
...     latent_anomaly_strength=0.2,
...     upsampling_train_ratio=0.7
... )
>>> model.trainer_arguments
{'num_sanity_val_steps': 0}
- configure_optimizers()#
Configure the Adam optimizer for training phases 2 and 3.
Does not train the discrete model (phase 1)
- Returns:
Dictionary containing optimizers and schedulers.
- Return type:
dict[str, torch.optim.Optimizer | torch.optim.lr_scheduler.LRScheduler]
Example
>>> model = Dsr()
>>> optimizers = model.configure_optimizers()
>>> isinstance(optimizers, tuple)
True
>>> len(optimizers)
2
- static configure_transforms(image_size=None)#
Configure default transforms for DSR.
Normalization is not needed as the images are scaled to [0, 1] in Dataset.
- Parameters:
image_size (tuple[int, int] | None, optional) – Input image size. Defaults to (256, 256).
- Returns:
Composed transforms
- Return type:
Transform
Example
>>> model = Dsr()
>>> transforms = model.configure_transforms((512, 512))
>>> isinstance(transforms, Transform)
True
- property learning_type: LearningType#
Return the learning type of the model.
- Returns:
Learning type of the model.
- Return type:
LearningType
Example
>>> model = Dsr()
>>> model.learning_type
<LearningType.ONE_CLASS: 'one_class'>
- on_train_epoch_start()#
Display a message when starting to train the upsampling module.
- Return type:
- on_train_start()#
Load pretrained weights of the discrete model when starting training.
- Return type:
- static prepare_pretrained_model()#
Download pre-trained models if they don’t exist.
- Returns:
Path to the downloaded pre-trained model weights.
- Return type:
Path
Example
>>> model = Dsr()
>>> weights_path = model.prepare_pretrained_model()
>>> weights_path.name
'vq_model_pretrained_128_4096.pckl'
- property trainer_arguments: dict[str, Any]#
Required trainer arguments.
Example
>>> model = Dsr()
>>> model.trainer_arguments
{'num_sanity_val_steps': 0}
- training_step(batch)#
Training Step of DSR.
During training phase 2 (the first phase handled by this step), feeds the original image and a simulated anomaly mask to the model. During training phase 3, feeds a generated anomalous image to train the upsampling module.
- Parameters:
batch (Batch) – Input batch containing image, label and mask
- Returns:
Dictionary containing the loss value
- Return type:
STEP_OUTPUT
Example
>>> from anomalib.data import Batch
>>> model = Dsr()
>>> batch = Batch(
...     image=torch.randn(8, 3, 256, 256),
...     label=torch.zeros(8)
... )
>>> output = model.training_step(batch)
>>> isinstance(output, dict)
True
>>> "loss" in output
True
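The training step has to decide, at each epoch, whether it is in training phase 2 or phase 3. This page does not specify how upsampling_train_ratio maps to that boundary, so the sketch below makes an explicit assumption: the ratio marks the fraction of epochs after which training switches to the upsampling module. The function name `training_phase` and the `max_epochs` argument are hypothetical.

```python
import math


def training_phase(current_epoch, max_epochs, upsampling_train_ratio=0.7):
    """Return 2 or 3 depending on which DSR training phase an epoch falls in.

    Assumption (not confirmed by this page): the switch to phase 3 happens
    at epoch ceil(upsampling_train_ratio * max_epochs).
    """
    switch_epoch = math.ceil(upsampling_train_ratio * max_epochs)
    return 2 if current_epoch < switch_epoch else 3


# With 100 epochs and the default ratio, the assumed switch is at epoch 70.
phases = [training_phase(epoch, max_epochs=100) for epoch in range(100)]
```

Under this assumption, phase 1 never appears in the schedule because the discrete model is loaded from pre-trained weights rather than trained.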
- validation_step(batch, *args, **kwargs)#
Validation step of DSR.
The Softmax predictions of the anomalous class are used as anomaly map.
- Parameters:
batch (Batch) – Input batch containing image, label and mask
*args – Additional positional arguments (unused)
**kwargs – Additional keyword arguments (unused)
- Returns:
Dictionary containing predictions and batch information
- Return type:
STEP_OUTPUT
Example
>>> from anomalib.data import Batch
>>> model = Dsr()
>>> batch = Batch(
...     image=torch.randn(8, 3, 256, 256),
...     label=torch.zeros(8)
... )
>>> output = model.validation_step(batch)
>>> isinstance(output, Batch)
True
Anomaly generator for the DSR model implementation.
This module implements an anomaly generator that creates synthetic anomalies using Perlin noise. The generator is used during the second phase of DSR model training to create anomalous samples.
Example
>>> from anomalib.models.image.dsr.anomaly_generator import DsrAnomalyGenerator
>>> generator = DsrAnomalyGenerator(p_anomalous=0.5)
>>> batch = torch.randn(8, 3, 256, 256)
>>> masks = generator.augment_batch(batch)
- class anomalib.models.image.dsr.anomaly_generator.DsrAnomalyGenerator(p_anomalous=0.5)#
Bases:
Module
Anomaly generator for the DSR model.
The generator creates synthetic anomalies by applying Perlin noise to images. It is used during the second phase of DSR model training. The third phase uses a different approach with smudge-based anomalies.
- Parameters:
p_anomalous (float, optional) – Probability of generating an anomalous image. Defaults to 0.5.
Example
>>> generator = DsrAnomalyGenerator(p_anomalous=0.7)
>>> batch = torch.randn(4, 3, 256, 256)
>>> masks = generator.augment_batch(batch)
>>> assert masks.shape == (4, 1, 256, 256)
- augment_batch(batch)#
Generate anomalous masks for a batch of images.
- Parameters:
batch (Tensor) – Input batch of images of shape (batch_size, channels, height, width).
- Returns:
Batch of binary masks of shape (batch_size, 1, height, width) where 1 indicates anomalous regions.
- Return type:
Tensor
Example
>>> generator = DsrAnomalyGenerator()
>>> batch = torch.randn(8, 3, 256, 256)
>>> masks = generator.augment_batch(batch)
>>> assert masks.shape == (8, 1, 256, 256)
>>> assert torch.all((masks >= 0) & (masks <= 1))
- generate_anomaly(height, width)#
Generate an anomalous mask using Perlin noise.
- Parameters:
- Returns:
Binary mask of shape (1, height, width) where 1 indicates anomalous regions.
- Return type:
Tensor
Example
>>> generator = DsrAnomalyGenerator()
>>> mask = generator.generate_anomaly(256, 256)
>>> assert mask.shape == (1, 256, 256)
>>> assert torch.all((mask >= 0) & (mask <= 1))
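The pattern behind mask generation (smooth random noise, thresholded into a binary mask, produced only with probability p_anomalous) can be sketched without PyTorch. This sketch deliberately swaps Perlin noise for simpler value noise (bilinear interpolation of a random lattice), and returns a height x width list of lists rather than a (1, height, width) tensor. The function name `generate_mask` and the `cell` and `threshold` parameters are hypothetical.

```python
import random


def generate_mask(height, width, p_anomalous=0.5, cell=8, threshold=0.5, rng=None):
    """Return a binary mask (list of rows) built by thresholding smooth noise.

    Value noise stands in for Perlin noise here; this is an illustration of
    the pattern, not the generator's actual noise function.
    """
    rng = rng or random.Random()
    # With probability 1 - p_anomalous, return an empty (all-normal) mask.
    if rng.random() > p_anomalous:
        return [[0] * width for _ in range(height)]
    # Coarse lattice of random values, one extra row and column for interpolation.
    grid_h, grid_w = height // cell + 2, width // cell + 2
    grid = [[rng.random() for _ in range(grid_w)] for _ in range(grid_h)]
    mask = []
    for y in range(height):
        y0, ty = y // cell, (y % cell) / cell
        row = []
        for x in range(width):
            x0, tx = x // cell, (x % cell) / cell
            # Bilinear interpolation between the four surrounding lattice values.
            top = grid[y0][x0] * (1 - tx) + grid[y0][x0 + 1] * tx
            bottom = grid[y0 + 1][x0] * (1 - tx) + grid[y0 + 1][x0 + 1] * tx
            value = top * (1 - ty) + bottom * ty
            row.append(1 if value > threshold else 0)
        mask.append(row)
    return mask


always = generate_mask(32, 32, p_anomalous=1.0, rng=random.Random(0))
never = generate_mask(32, 32, p_anomalous=0.0, rng=random.Random(0))
```

Thresholding smooth noise yields connected blobs rather than salt-and-pepper pixels, which is why noise of this kind is a common choice for simulating defect regions.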
Loss functions for the DSR model implementation.
This module contains the loss functions used in the second and third training phases of the DSR model.
Example
>>> from anomalib.models.image.dsr.loss import DsrSecondStageLoss
>>> loss_fn = DsrSecondStageLoss()
>>> loss = loss_fn(
... recon_nq_hi=recon_nq_hi,
... recon_nq_lo=recon_nq_lo,
... qu_hi=qu_hi,
... qu_lo=qu_lo,
... input_image=input_image,
... gen_img=gen_img,
... seg=seg,
... anomaly_mask=anomaly_mask
... )
- class anomalib.models.image.dsr.loss.DsrSecondStageLoss#
Bases:
Module
Loss function for the second training phase of the DSR model.
- The total loss is a combination of:
MSE loss between non-anomalous quantized input image and anomalous subspace-reconstructed non-quantized input (hi and lo features)
MSE loss between input image and reconstructed image through object-specific decoder
Focal loss between computed segmentation mask and ground truth mask
Example
>>> loss_fn = DsrSecondStageLoss()
>>> loss = loss_fn(
...     recon_nq_hi=recon_nq_hi,
...     recon_nq_lo=recon_nq_lo,
...     qu_hi=qu_hi,
...     qu_lo=qu_lo,
...     input_image=input_image,
...     gen_img=gen_img,
...     seg=seg,
...     anomaly_mask=anomaly_mask
... )
- forward(recon_nq_hi, recon_nq_lo, qu_hi, qu_lo, input_image, gen_img, seg, anomaly_mask)#
Compute the combined loss over a batch.
- Parameters:
recon_nq_hi (Tensor) – Reconstructed non-quantized hi feature
recon_nq_lo (Tensor) – Reconstructed non-quantized lo feature
qu_hi (Tensor) – Non-defective quantized hi feature
qu_lo (Tensor) – Non-defective quantized lo feature
input_image (Tensor) – Original input image
gen_img (Tensor) – Object-specific decoded image
seg (Tensor) – Computed anomaly segmentation map
anomaly_mask (Tensor) – Ground truth anomaly mask
- Returns:
Total combined loss value
- Return type:
Tensor
Example
>>> loss_fn = DsrSecondStageLoss()
>>> loss = loss_fn(
...     recon_nq_hi=torch.randn(32, 64, 32, 32),
...     recon_nq_lo=torch.randn(32, 64, 32, 32),
...     qu_hi=torch.randn(32, 64, 32, 32),
...     qu_lo=torch.randn(32, 64, 32, 32),
...     input_image=torch.randn(32, 3, 256, 256),
...     gen_img=torch.randn(32, 3, 256, 256),
...     seg=torch.randn(32, 2, 256, 256),
...     anomaly_mask=torch.randint(0, 2, (32, 1, 256, 256))
... )
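The way the second-phase loss combines its terms can be shown on flat lists of numbers. This is a minimal sketch, assuming an unweighted sum of the terms; any weighting used by the actual loss is not stated on this page. The helper names `mse` and `second_stage_loss` are hypothetical, and the segmentation term is passed in as a precomputed scalar (`seg_loss`) instead of being derived from seg and anomaly_mask.

```python
def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def second_stage_loss(recon_nq_hi, recon_nq_lo, qu_hi, qu_lo, input_image, gen_img, seg_loss):
    """Combine the feature, image and segmentation terms (assumed unweighted)."""
    # Subspace-restricted features should match the non-defective quantized features.
    feat_hi = mse(recon_nq_hi, qu_hi)
    feat_lo = mse(recon_nq_lo, qu_lo)
    # The object-specific decoder should reproduce the original image.
    image = mse(input_image, gen_img)
    return feat_hi + feat_lo + image + seg_loss


total = second_stage_loss(
    recon_nq_hi=[1.0, 2.0],
    recon_nq_lo=[0.0, 0.0],
    qu_hi=[1.0, 4.0],
    qu_lo=[0.0, 2.0],
    input_image=[0.5, 0.5],
    gen_img=[0.5, 0.5],
    seg_loss=0.25,
)
```

Both feature terms compare reconstructed non-quantized features against the quantized features of the non-defective image, which is what pushes the subspace restriction modules toward normal appearances.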
- class anomalib.models.image.dsr.loss.DsrThirdStageLoss#
Bases:
Module
Loss function for the third training phase of the DSR model.
The loss consists of a focal loss between the computed segmentation mask and the ground truth mask.
Example
>>> loss_fn = DsrThirdStageLoss()
>>> loss = loss_fn(
...     pred_mask=pred_mask,
...     true_mask=true_mask
... )
- forward(pred_mask, true_mask)#
Compute the focal loss between predicted and true masks.
- Parameters:
pred_mask (Tensor) – Computed anomaly segmentation map
true_mask (Tensor) – Ground truth anomaly mask
- Returns:
Focal loss value
- Return type:
Tensor
Example
>>> loss_fn = DsrThirdStageLoss()
>>> loss = loss_fn(
...     pred_mask=torch.randn(32, 2, 256, 256),
...     true_mask=torch.randint(0, 2, (32, 1, 256, 256))
... )
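For readers unfamiliar with focal loss, the following plain-Python sketch computes the standard binary form, FL = -alpha_t * (1 - p_t)^gamma * log(p_t), averaged over pixels. It is not the library's implementation: the function name `binary_focal_loss` and the values alpha=0.25 and gamma=2.0 are assumptions, and `probs` is taken to be the softmax probability of the anomalous class for each pixel.

```python
import math


def binary_focal_loss(probs, targets, alpha=0.25, gamma=2.0, eps=1e-8):
    """Mean binary focal loss over flat lists of probabilities and 0/1 targets.

    alpha and gamma are illustrative defaults, not values taken from this page.
    """
    total = 0.0
    for p, t in zip(probs, targets):
        # p_t is the probability assigned to the true class of this pixel.
        p_t = p if t == 1 else 1.0 - p
        alpha_t = alpha if t == 1 else 1.0 - alpha
        # (1 - p_t) ** gamma down-weights pixels that are already well classified.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t + eps)
    return total / len(probs)


# Confident, correct predictions versus confident, wrong predictions.
easy = binary_focal_loss([0.9, 0.1], [1, 0])
hard = binary_focal_loss([0.1, 0.9], [1, 0])
```

The modulating factor matters for anomaly segmentation because anomalous pixels are usually a small minority, so easy background pixels would otherwise dominate the loss.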