DSR#
This is the implementation of the DSR paper.
Model Type: Segmentation
Description#
DSR is a quantized-feature-based algorithm that consists of an autoencoder with one encoder and two decoders, coupled with an anomaly detection module. DSR learns a codebook of quantized representations on ImageNet, which is then used to encode input images. The same quantized representations are used to sample near-in-distribution anomalies, so anomaly generation does not rely on external datasets. Training takes place in three phases. In the first phase, the encoder and the “general object decoder”, as well as the codebook, are pretrained on ImageNet. In the second phase, defects are generated at the feature level using the codebook on the quantized representations, and are used to train the object-specific decoder as well as the anomaly detection module. In the final phase, the upsampling module is trained on simulated image-level smudges in order to output more robust anomaly maps.
Architecture#
PyTorch model for the DSR model implementation.
- class anomalib.models.image.dsr.torch_model.AnomalyDetectionModule(in_channels, out_channels, base_width)#
Bases:
Module
Anomaly detection module.
Module that detects the presence of an anomaly by comparing two images reconstructed by the object-specific decoder and the general object decoder.
- Parameters:
in_channels (int) – Number of input channels.
out_channels (int) – Number of output channels.
base_width (int) – Base dimensionality of the layers of the autoencoder.
- forward(batch_real, batch_anomaly)#
Computes the anomaly map over corresponding real and anomalous images.
- Parameters:
batch_real (torch.Tensor) – Batch of real, non-defective images.
batch_anomaly (torch.Tensor) – Batch of potentially anomalous images.
- Return type:
Tensor
- Returns:
The anomaly segmentation map.
- class anomalib.models.image.dsr.torch_model.DecoderBot(in_channels, num_hiddens, num_residual_layers, num_residual_hiddens)#
Bases:
Module
General appearance decoder module to reconstruct images while keeping possible anomalies.
- Parameters:
in_channels (int) – Number of input channels.
num_hiddens (int) – Number of hidden channels.
num_residual_layers (int) – Number of residual layers in residual stack.
num_residual_hiddens (int) – Number of channels in residual layers.
- forward(inputs)#
Decode quantized feature maps into an image.
- Parameters:
inputs (torch.Tensor) – Quantized feature maps.
- Return type:
Tensor
- Returns:
Decoded image.
- class anomalib.models.image.dsr.torch_model.DiscreteLatentModel(num_hiddens, num_residual_layers, num_residual_hiddens, num_embeddings, embedding_dim)#
Bases:
Module
Discrete Latent Model.
Autoencoder quantized model that encodes the input images into quantized feature maps and generates a reconstructed image using the general appearance decoder.
- Parameters:
num_hiddens (int) – Number of hidden channels.
num_residual_layers (int) – Number of residual layers in residual stacks.
num_residual_hiddens (int) – Number of channels in residual layers.
num_embeddings (int) – Size of embedding dictionary.
embedding_dim (int) – Dimension of embeddings.
- forward(batch, anomaly_mask=None, anom_str_lo=None, anom_str_hi=None)#
Generate quantized feature maps.
Generates quantized feature maps of batch of input images as well as their reconstruction based on the general appearance decoder.
- Parameters:
batch (Tensor) – Batch of input images.
anomaly_mask (Tensor | None) – Anomaly mask to be used to generate anomalies on the quantized feature maps.
anom_str_lo (torch.Tensor | None) – Strength of generated anomaly lo.
anom_str_hi (torch.Tensor | None) – Strength of generated anomaly hi.
- Returns:
- If generating an anomaly mask:
General object decoder-decoded anomalous image
Reshaped ground truth anomaly map
Non defective quantized lo feature
Non defective quantized hi feature
Non quantized subspace encoded defective lo feature
Non quantized subspace encoded defective hi feature
- Else:
General object decoder-decoded image
Quantized lo feature
Quantized hi feature
- Return type:
dict[str, torch.Tensor]
- static generate_fake_anomalies_joined(features, embeddings, memory_torch_original, mask, strength)#
Generate quantized anomalies.
- Parameters:
features (torch.Tensor) – Features on which the anomalies will be generated.
embeddings (torch.Tensor) – Embeddings to use to generate the anomalies.
memory_torch_original (torch.Tensor) – Weight of embeddings.
mask (torch.Tensor) – Original anomaly mask.
strength (float) – Strength of generated anomaly.
- Returns:
Anomalous embedding.
- Return type:
torch.Tensor
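The following is a minimal NumPy sketch of feature-level anomaly generation, not the library's PyTorch code. It assumes that `strength` selects the fraction of nearest codebook entries that may be sampled as replacements, and that the mask is applied per feature vector. The function name `sketch_fake_anomalies` and the flattened `(N, D)` layout are hypothetical and chosen for illustration only.

```python
import numpy as np


def sketch_fake_anomalies(features, embeddings, codebook, mask, strength, rng):
    """Replace masked feature vectors with sampled codebook entries.

    features, embeddings: (N, D) arrays; codebook: (K, D); mask: (N, 1) of 0/1.
    """
    num_codes = codebook.shape[0]
    # Assumption: `strength` is the fraction of nearest codes eligible for sampling.
    top_k = max(1, min(int(strength * num_codes) + 1, num_codes))
    # Squared Euclidean distance from every feature to every code: (N, K).
    distances = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    nearest = np.argsort(distances, axis=1)[:, :top_k]
    # Pick one of the eligible codes at random for each feature vector.
    pick = rng.integers(0, top_k, size=features.shape[0])
    replacement = codebook[nearest[np.arange(features.shape[0]), pick]]
    # Keep the original embedding outside the mask, the sampled code inside it.
    return mask * replacement + (1 - mask) * embeddings


rng = np.random.default_rng(0)
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
features = np.array([[0.1, 0.0], [1.1, 0.9], [2.9, 3.0]])
embeddings = codebook[[0, 1, 3]]  # quantized, defect-free features
mask = np.array([[1.0], [0.0], [1.0]])
anomalous = sketch_fake_anomalies(features, embeddings, codebook, mask, 0.5, rng)
```

Unmasked positions keep their original embedding, while masked positions always receive a valid codebook entry, which is what keeps the generated defects close to the training distribution.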
- property vq_vae_bot: VectorQuantizer#
Return self._vq_vae_bot.
- property vq_vae_top: VectorQuantizer#
Return self._vq_vae_top.
- class anomalib.models.image.dsr.torch_model.DsrModel(latent_anomaly_strength=0.2, embedding_dim=128, num_embeddings=4096, num_hiddens=128, num_residual_layers=2, num_residual_hiddens=64)#
Bases:
Module
DSR PyTorch model.
Consists of the discrete latent model, image reconstruction network, subspace restriction modules, anomaly detection module and upsampling module.
- Parameters:
embedding_dim (int) – Dimension of codebook embeddings.
num_embeddings (int) – Number of embeddings.
latent_anomaly_strength (float) – Strength of the generated anomalies in the latent space.
num_hiddens (int) – Number of output channels in residual layers.
num_residual_layers (int) – Number of residual layers.
num_residual_hiddens (int) – Number of intermediate channels.
- forward(batch, anomaly_map_to_generate=None)#
Compute the anomaly mask from an input image.
- Parameters:
batch (torch.Tensor) – Batch of input images.
anomaly_map_to_generate (torch.Tensor | None) – Anomaly map to use to generate quantized defects. Should be None if not in training phase 2.
- Returns:
- If testing:
"anomaly_map": Upsampled anomaly map
"pred_score": Image score
- If training phase 2:
"recon_feat_hi": Reconstructed non-quantized hi features of defect (F~_hi)
"recon_feat_lo": Reconstructed non-quantized lo features of defect (F~_lo)
"embedding_bot": Quantized features of non defective img (Q_hi)
"embedding_top": Quantized features of non defective img (Q_lo)
"obj_spec_image": Object-specific-decoded image (I_spc)
"anomaly_map": Predicted segmentation mask (M)
"true_mask": Resized ground-truth anomaly map (M_gt)
- If training phase 3:
"anomaly_map": Reconstructed anomaly map
- Return type:
dict[str, torch.Tensor]
- load_pretrained_discrete_model_weights(ckpt, device=None)#
Load pre-trained model weights.
- Return type:
None
- class anomalib.models.image.dsr.torch_model.EncoderBot(in_channels, num_hiddens, num_residual_layers, num_residual_hiddens)#
Bases:
Module
Encoder module for bottom quantized feature maps.
- Parameters:
in_channels (int) – Number of input channels.
num_hiddens (int) – Number of hidden channels.
num_residual_layers (int) – Number of residual layers in residual stacks.
num_residual_hiddens (int) – Number of channels in residual layers.
- forward(batch)#
Encode inputs to be quantized into the bottom feature map.
- Parameters:
batch (torch.Tensor) – Batch of input images.
- Return type:
Tensor
- Returns:
Encoded feature maps.
- class anomalib.models.image.dsr.torch_model.EncoderTop(in_channels, num_hiddens, num_residual_layers, num_residual_hiddens)#
Bases:
Module
Encoder module for top quantized feature maps.
- Parameters:
in_channels (int) – Number of input channels.
num_hiddens (int) – Number of hidden channels.
num_residual_layers (int) – Number of residual layers in residual stacks.
num_residual_hiddens (int) – Number of channels in residual layers.
- forward(batch)#
Encode inputs to be quantized into the top feature map.
- Parameters:
batch (torch.Tensor) – Batch of input images.
- Return type:
Tensor
- Returns:
Encoded feature maps.
- class anomalib.models.image.dsr.torch_model.FeatureDecoder(base_width, out_channels=1)#
Bases:
Module
Feature decoder for the subspace restriction network.
- Parameters:
base_width (int) – Base dimensionality of the layers of the autoencoder.
out_channels (int) – Number of output channels.
- forward(_, __, b3)#
Decode a batch of latent features to a non-quantized representation.
- Parameters:
_ (torch.Tensor) – Top latent feature layer.
__ (torch.Tensor) – Middle latent feature layer.
b3 (torch.Tensor) – Bottom latent feature layer.
- Return type:
Tensor
- Returns:
Decoded non-quantized representation.
- class anomalib.models.image.dsr.torch_model.FeatureEncoder(in_channels, base_width)#
Bases:
Module
Feature encoder for the subspace restriction network.
- Parameters:
in_channels (int) – Number of input channels.
base_width (int) – Base dimensionality of the layers of the autoencoder.
- forward(batch)#
Encode a batch of input features to the latent space.
- Parameters:
batch (torch.Tensor) – Batch of input images.
- Return type:
tuple[Tensor, Tensor, Tensor]
- Returns:
Encoded feature maps.
- class anomalib.models.image.dsr.torch_model.ImageReconstructionNetwork(in_channels, num_hiddens, num_residual_layers, num_residual_hiddens)#
Bases:
Module
Image Reconstruction Network.
Image reconstruction network that reconstructs the image from a quantized representation.
- Parameters:
in_channels (int) – Number of input channels.
num_hiddens (int) – Number of output channels in residual layers.
num_residual_layers (int) – Number of residual layers.
num_residual_hiddens (int) – Number of intermediate channels.
- forward(inputs)#
Reconstructs an image from a quantized representation.
- Parameters:
inputs (torch.Tensor) – Quantized features.
- Return type:
Tensor
- Returns:
Reconstructed image.
- class anomalib.models.image.dsr.torch_model.Residual(in_channels, out_channels, num_residual_hiddens)#
Bases:
Module
Residual layer.
- Parameters:
in_channels (int) – Number of input channels.
out_channels (int) – Number of output channels.
num_residual_hiddens (int) – Number of intermediate channels.
- forward(batch)#
Compute residual layer.
- Parameters:
batch (torch.Tensor) – Batch of input images.
- Return type:
Tensor
- Returns:
Computed feature maps.
- class anomalib.models.image.dsr.torch_model.ResidualStack(in_channels, num_hiddens, num_residual_layers, num_residual_hiddens)#
Bases:
Module
Stack of residual layers.
- Parameters:
in_channels (int) – Number of input channels.
num_hiddens (int) – Number of output channels in residual layers.
num_residual_layers (int) – Number of residual layers.
num_residual_hiddens (int) – Number of intermediate channels.
- forward(batch)#
Compute residual stack.
- Parameters:
batch (torch.Tensor) – Batch of input images.
- Return type:
Tensor
- Returns:
Computed feature maps.
- class anomalib.models.image.dsr.torch_model.SubspaceRestrictionModule(base_width)#
Bases:
Module
Subspace Restriction Module.
Subspace restriction module that restricts the appearance subspace into configurations that agree with normal appearances and applies quantization.
- Parameters:
base_width (int) – Base dimensionality of the layers of the autoencoder.
- forward(batch, quantization)#
Generate the quantized anomaly-free representation of an anomalous image.
- Parameters:
batch (torch.Tensor) – Batch of input images.
quantization (function | object) – Quantization function.
- Return type:
tuple[Tensor, Tensor]
- Returns:
Reconstructed batch of non-quantized features and corresponding quantized features.
- class anomalib.models.image.dsr.torch_model.SubspaceRestrictionNetwork(in_channels=64, out_channels=64, base_width=64)#
Bases:
Module
Subspace Restriction Network.
Subspace restriction network that reconstructs the input image into a non-quantized configuration that agrees with normal appearances.
- Parameters:
in_channels (int) – Number of input channels.
out_channels (int) – Number of output channels.
base_width (int) – Base dimensionality of the layers of the autoencoder.
- forward(batch)#
Reconstruct non-quantized representation from batch.
Generate non-quantized feature maps from potentially anomalous images, to be quantized into non-anomalous quantized representations.
- Parameters:
batch (torch.Tensor) – Batch of input images.
- Return type:
Tensor
- Returns:
Reconstructed non-quantized representation.
- class anomalib.models.image.dsr.torch_model.UnetDecoder(base_width, out_channels=1)#
Bases:
Module
Decoder of the Unet network.
- Parameters:
base_width (int) – Base dimensionality of the layers of the autoencoder.
out_channels (int) – Number of output channels.
- forward(b1, b2, b3, b4)#
Decodes latent representations into an image.
- Parameters:
b1 (torch.Tensor) – First (top level) quantized feature map.
b2 (torch.Tensor) – Second quantized feature map.
b3 (torch.Tensor) – Third quantized feature map.
b4 (torch.Tensor) – Fourth (bottom level) quantized feature map.
- Return type:
Tensor
- Returns:
Reconstructed image.
- class anomalib.models.image.dsr.torch_model.UnetEncoder(in_channels, base_width)#
Bases:
Module
Encoder of the Unet network.
- Parameters:
in_channels (int) – Number of input channels.
base_width (int) – Base dimensionality of the layers of the autoencoder.
- forward(batch)#
Encodes batch of images into a latent representation.
- Parameters:
batch (torch.Tensor) – Quantized features.
- Return type:
tuple[Tensor, Tensor, Tensor, Tensor]
- Returns:
Latent representations of the input batch.
- class anomalib.models.image.dsr.torch_model.UnetModel(in_channels=64, out_channels=64, base_width=64)#
Bases:
Module
Autoencoder model that reconstructs the input image.
- Parameters:
in_channels (int) – Number of input channels.
out_channels (int) – Number of output channels.
base_width (int) – Base dimensionality of the layers of the autoencoder.
- forward(batch)#
Reconstructs an input batch of images.
- Parameters:
batch (torch.Tensor) – Batch of input images.
- Return type:
Tensor
- Returns:
Reconstructed images.
- class anomalib.models.image.dsr.torch_model.UpsamplingModule(in_channels=8, out_channels=2, base_width=64)#
Bases:
Module
Module that upsamples the generated anomaly mask to full resolution.
- Parameters:
in_channels (int) – Number of input channels.
out_channels (int) – Number of output channels.
base_width (int) – Base dimensionality of the layers of the autoencoder.
- forward(batch_real, batch_anomaly, batch_segmentation_map)#
Computes upsampled segmentation maps.
- Parameters:
batch_real (torch.Tensor) – Batch of real, non-defective images.
batch_anomaly (torch.Tensor) – Batch of potentially anomalous images.
batch_segmentation_map (torch.Tensor) – Batch of anomaly segmentation maps.
- Return type:
Tensor
- Returns:
Upsampled anomaly segmentation maps.
- class anomalib.models.image.dsr.torch_model.VectorQuantizer(num_embeddings, embedding_dim)#
Bases:
Module
Module that quantizes a given feature map using learned quantization codebooks.
- Parameters:
num_embeddings (int) – Size of embedding codebook.
embedding_dim (int) – Dimension of embeddings.
- property embedding: Tensor#
Return embedding.
- forward(inputs)#
Calculates quantized feature map.
- Parameters:
inputs (torch.Tensor) – Non-quantized feature maps.
- Return type:
Tensor
- Returns:
Quantized feature maps.
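As an illustration of what quantizing a feature map means, here is a minimal NumPy sketch of a nearest-codebook lookup. It is a simplified stand-in rather than the module's actual implementation: it works on flattened `(N, D)` feature vectors, and the name `sketch_quantize` is hypothetical.

```python
import numpy as np


def sketch_quantize(features, codebook):
    """Map each feature vector to its nearest codebook entry.

    features: (N, D) non-quantized vectors; codebook: (K, D) embeddings.
    Returns the quantized vectors (N, D) and the chosen code indices (N,).
    """
    # Squared Euclidean distance between every feature and every code: (N, K).
    distances = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = distances.argmin(axis=1)
    return codebook[indices], indices


codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
features = np.array([[0.1, -0.1], [0.9, 1.2]])
quantized, indices = sketch_quantize(features, codebook)
```

In this toy case the first vector snaps to code 0 and the second to code 1, so `quantized` equals the codebook itself.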
DSR - A Dual Subspace Re-Projection Network for Surface Anomaly Detection.
Paper https://link.springer.com/chapter/10.1007/978-3-031-19821-2_31
- class anomalib.models.image.dsr.lightning_model.Dsr(latent_anomaly_strength=0.2, upsampling_train_ratio=0.7)#
Bases:
AnomalyModule
DSR: A Dual Subspace Re-Projection Network for Surface Anomaly Detection.
- Parameters:
latent_anomaly_strength (float) – Strength of the generated anomalies in the latent space. Defaults to 0.2
upsampling_train_ratio (float) – Ratio of training steps for the upsampling module. Defaults to 0.7
- configure_optimizers()#
Configure the Adam optimizer for training phases 2 and 3.
Does not train the discrete model (phase 1).
- Returns:
Dictionary of optimizers
- Return type:
dict[str, torch.optim.Optimizer | torch.optim.lr_scheduler.LRScheduler]
- static configure_transforms(image_size=None)#
Default transform for DSR. Normalization is not needed as the images are scaled to [0, 1] in Dataset.
- Return type:
Transform
- property learning_type: LearningType#
Return the learning type of the model.
- Returns:
Learning type of the model.
- Return type:
LearningType
- on_train_epoch_start()#
Display a message when starting to train the upsampling module.
- Return type:
None
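The switch between training phases 2 and 3 can be pictured with a small sketch. It assumes that `upsampling_train_ratio` marks the fraction of epochs after which the upsampling module starts training; that interpretation, the epoch-based granularity, and the helper name `sketch_training_phase` are assumptions made for illustration, not confirmed behaviour of the library.

```python
def sketch_training_phase(epoch, max_epochs, upsampling_train_ratio=0.7):
    """Return the DSR training phase (2 or 3) assumed active at a 0-based epoch."""
    # Assumption: phase 3 begins once this fraction of the epochs has elapsed.
    switch_epoch = int(max_epochs * upsampling_train_ratio)
    return 2 if epoch < switch_epoch else 3


# With 10 epochs and a ratio of 0.5, epochs 0-4 fall in phase 2 and 5-9 in phase 3.
phases = [sketch_training_phase(epoch, 10, 0.5) for epoch in range(10)]
```

Phase 1 never appears here because the discrete model is loaded from pretrained weights rather than trained.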
- on_train_start()#
Load pretrained weights of the discrete model when starting training.
- Return type:
None
- static prepare_pretrained_model()#
Download pre-trained models if they don’t exist.
- Return type:
Path
- property trainer_arguments: dict[str, Any]#
Required trainer arguments.
- training_step(batch)#
Training Step of DSR.
During the first phase handled by this step (training phase 2), feeds the original image and the simulated anomaly mask. During the second phase handled by this step (training phase 3), feeds a generated anomalous image to train the upsampling module.
- Parameters:
batch (dict[str, str | Tensor]) – Batch containing image filename, image, label and mask
- Returns:
Loss dictionary
- Return type:
STEP_OUTPUT
- validation_step(batch, *args, **kwargs)#
Validation step of DSR.
The Softmax predictions of the anomalous class are used as anomaly map.
- Parameters:
batch (dict[str, str | Tensor]) – Batch of input images
*args – unused
**kwargs – unused
- Returns:
Dictionary to which predicted anomaly maps have been added.
- Return type:
STEP_OUTPUT
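To make the softmax step concrete, the sketch below turns two-channel logits into a per-pixel anomaly map using NumPy. It assumes a `(B, 2, H, W)` layout with the anomalous class in channel 1; both the layout and the name `sketch_anomaly_map` are assumptions for illustration.

```python
import numpy as np


def sketch_anomaly_map(logits):
    """Return the softmax probability of the anomalous class.

    logits: (B, 2, H, W) array; the result has shape (B, H, W).
    """
    # Subtract the per-pixel maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    probs = exp / exp.sum(axis=1, keepdims=True)
    # Assumption: channel 1 holds the anomalous class.
    return probs[:, 1]


logits = np.zeros((1, 2, 4, 4))
logits[0, 1, 0, 0] = 5.0  # one pixel strongly favours the anomalous class
anomaly_map = sketch_anomaly_map(logits)
```

Pixels with equal logits get a score of 0.5, while the single boosted pixel gets a score close to 1.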
Anomaly generator for the DSR model implementation.
- class anomalib.models.image.dsr.anomaly_generator.DsrAnomalyGenerator(p_anomalous=0.5)#
Bases:
Module
Anomaly generator of the DSR model.
The anomaly is generated using a Perlin noise generator on the two quantized representations of an image. This generator is only used during the second phase of training! The third phase requires generating smudges over the input images.
- Parameters:
p_anomalous (float, optional) – Probability to generate an anomalous image.
- augment_batch(batch)#
Generate anomalous augmentations for a batch of input images.
- Parameters:
batch (Tensor) – Batch of input images
- Returns:
Ground truth masks corresponding to the anomalous perturbations.
- Return type:
Tensor
- generate_anomaly(height, width)#
Generate an anomalous mask.
- Parameters:
height (int) – Height of generated mask.
width (int) – Width of generated mask.
- Returns:
Generated mask.
- Return type:
Tensor
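The role of `p_anomalous` can be sketched as follows. This NumPy example does not implement Perlin noise: it substitutes thresholded, upsampled random noise as a stand-in so the block stays short. The helper names, the threshold, and the coarse grid size are all hypothetical.

```python
import numpy as np


def sketch_blob_mask(height, width, rng, threshold=0.5):
    """Stand-in for a Perlin-noise mask: threshold coarse noise upsampled to full size."""
    coarse = rng.random((max(1, height // 8), max(1, width // 8)))
    # Nearest-neighbour upsampling produces blocky connected regions.
    rows = np.arange(height) * coarse.shape[0] // height
    cols = np.arange(width) * coarse.shape[1] // width
    noise = coarse[rows][:, cols]
    return (noise > threshold).astype(np.float32)


def sketch_augment_batch(batch, p_anomalous, rng):
    """Return one mask per image; a mask is non-empty with probability p_anomalous."""
    num_images, _, height, width = batch.shape
    masks = np.zeros((num_images, 1, height, width), dtype=np.float32)
    for i in range(num_images):
        if rng.random() < p_anomalous:
            masks[i, 0] = sketch_blob_mask(height, width, rng)
    return masks


rng = np.random.default_rng(0)
batch = np.zeros((2, 3, 16, 16), dtype=np.float32)
masks = sketch_augment_batch(batch, p_anomalous=1.0, rng=rng)
clean_masks = sketch_augment_batch(batch, p_anomalous=0.0, rng=rng)
```

Images that are not selected keep an all-zero mask, so they act as defect-free samples for the batch.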
Loss function for the DSR model implementation.
- class anomalib.models.image.dsr.loss.DsrSecondStageLoss#
Bases:
Module
Overall loss function of the second training phase of the DSR model.
- The total loss consists of:
MSE loss between non-anomalous quantized input image and anomalous subspace-reconstructed non-quantized input (hi and lo),
MSE loss between input image and reconstructed image through object-specific decoder,
Focal loss between computed segmentation mask and ground truth mask.
- forward(recon_nq_hi, recon_nq_lo, qu_hi, qu_lo, input_image, gen_img, seg, anomaly_mask)#
Compute the loss over a batch for the DSR model.
- Parameters:
recon_nq_hi (Tensor) – Reconstructed non-quantized hi feature
recon_nq_lo (Tensor) – Reconstructed non-quantized lo feature
qu_hi (Tensor) – Non-defective quantized hi feature
qu_lo (Tensor) – Non-defective quantized lo feature
input_image (Tensor) – Original image
gen_img (Tensor) – Object-specific decoded image
seg (Tensor) – Computed anomaly map
anomaly_mask (Tensor) – Ground truth anomaly map
- Returns:
Total loss
- Return type:
Tensor
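The composition of the second-stage loss can be sketched in NumPy as below. This is an illustration under several assumptions: the terms are summed with equal weight, `seg` is treated as the per-pixel probability of the anomalous class, and the focal loss uses `alpha=0.5` and `gamma=2.0`. None of these values are stated in this reference, and the function names are hypothetical.

```python
import numpy as np


def mse(a, b):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((a - b) ** 2))


def binary_focal_loss(prob_anomalous, target, alpha=0.5, gamma=2.0, eps=1e-8):
    """Binary focal loss; alpha and gamma are assumed values."""
    # Probability assigned to the true class of each pixel.
    p_t = np.where(target == 1, prob_anomalous, 1.0 - prob_anomalous)
    return float(np.mean(-alpha * (1.0 - p_t) ** gamma * np.log(p_t + eps)))


def sketch_second_stage_loss(recon_nq_hi, recon_nq_lo, qu_hi, qu_lo,
                             input_image, gen_img, seg, anomaly_mask):
    """Sum the two feature MSE terms, the image MSE term and the focal term."""
    # Assumption: all four terms carry equal weight.
    return (
        mse(recon_nq_hi, qu_hi)
        + mse(recon_nq_lo, qu_lo)
        + mse(input_image, gen_img)
        + binary_focal_loss(seg, anomaly_mask)
    )


feat = np.ones((1, 4, 2, 2))
image = np.zeros((1, 3, 8, 8))
seg = np.full((1, 8, 8), 0.5)
mask = np.zeros((1, 8, 8))
focal_only = binary_focal_loss(seg, mask)
total = sketch_second_stage_loss(feat, feat, feat, feat, image, image, seg, mask)
```

With perfect feature and image reconstructions the MSE terms vanish, so the total reduces to the focal term alone. The third-stage loss described next corresponds to keeping only that focal term.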
- class anomalib.models.image.dsr.loss.DsrThirdStageLoss#
Bases:
Module
Overall loss function of the third training phase of the DSR model.
The loss consists of a focal loss between the computed segmentation mask and the ground truth mask.
- forward(pred_mask, true_mask)#
Compute the loss over a batch for the DSR model.
- Parameters:
pred_mask (Tensor) – Computed anomaly map
true_mask (Tensor) – Ground truth anomaly map
- Returns:
Total loss
- Return type:
Tensor