PatchFlow

Architecture

PatchFlow Architecture

PatchFlow Lightning Model Implementation.

This model implements the PatchFlow algorithm for anomaly detection and localization. PatchFlow models the distribution of patch-level feature embeddings using a learnable normalizing flow.

The model extracts patch embeddings from intermediate layers of a pretrained backbone and passes them through a feature adaptor, which projects them into a space suited to density estimation. A flow-based density estimator then transforms the adapted patch features and learns the distribution of normal data.

During inference, the model computes the likelihood of each patch under the learned flow. Low-likelihood patches indicate anomalies. The patch-level scores are aggregated to produce both an image-level anomaly score and a pixel-wise anomaly map.

Paper: https://arxiv.org/abs/2602.05238

Example

>>> from anomalib.data import MVTecAD
>>> from anomalib.models import Patchflow
>>> from anomalib.engine import Engine
>>> datamodule = MVTecAD()
>>> model = Patchflow()
>>> engine = Engine()
>>> engine.fit(model, datamodule=datamodule)
>>> predictions = engine.predict(model, datamodule=datamodule)
class anomalib.models.image.patchflow.lightning_model.Patchflow(backbone='tf_efficientnet_b5', pre_trained=True, flow_steps=1, flow_feature_dim=128, num_scales=3, patch_size=3, flow_hidden_dim=128, crop_size=None, lr=0.001, weight_decay=0.0001, pre_processor=True, post_processor=True, evaluator=True, visualizer=True)

Bases: AnomalibModule

Lightning Module for the PatchFlow algorithm.

Parameters:
  • backbone (str) – Backbone network. A timm model name or a DINOv2 model name. Defaults to "tf_efficientnet_b5".

  • pre_trained (bool) – Whether to use pre-trained backbone weights. Defaults to True.

  • flow_steps (int) – Number of coupling blocks in the normalizing flow. Defaults to 1.

  • flow_feature_dim (int) – Channel dimension after the feature adaptor. Defaults to 128.

  • num_scales (int) – Number of input resolutions for multi-scale extraction. Defaults to 3.

  • patch_size (int) – Kernel size of the AvgPool for local aggregation. Defaults to 3.

  • flow_hidden_dim (int) – Hidden channels in the flow subnet. Defaults to 128.

  • crop_size (tuple[int, int] | None) – Optional center crop size (H, W). When set, the input is center-cropped before feature extraction and the anomaly map is padded with value -1 back to the original input size. Defaults to None.

  • lr (float) – Learning rate for the Adam optimizer. Defaults to 0.001.

  • weight_decay (float) – Weight decay (L2 penalty) for the Adam optimizer. Defaults to 0.0001.

  • pre_processor (PreProcessor | bool) – Pre-processor instance or boolean. Defaults to True.

  • post_processor (PostProcessor | bool) – Post-processor instance or boolean. Defaults to True.

  • evaluator (Evaluator | bool) – Evaluator instance or boolean. Defaults to True.

  • visualizer (Visualizer | bool) – Visualizer instance or boolean. Defaults to True.
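
The patch_size parameter controls local aggregation by average pooling. The following is a minimal pure-Python sketch of that idea on a single 2D feature channel, not the library's implementation. The function name avg_pool_2d is hypothetical, and the stride of 1 and the choice to average only in-bounds neighbours at the borders are assumptions.

```python
def avg_pool_2d(feature_map, kernel_size=3):
    """Stride-1 average pooling over a 2D grid (assumes an odd kernel size).

    Border cells average only their in-bounds neighbours (an assumption;
    a real pooling layer may zero-pad instead).
    """
    height, width = len(feature_map), len(feature_map[0])
    radius = kernel_size // 2
    pooled = []
    for row in range(height):
        pooled_row = []
        for col in range(width):
            # Collect every in-bounds value inside the kernel window.
            window = [
                feature_map[r][c]
                for r in range(max(0, row - radius), min(height, row + radius + 1))
                for c in range(max(0, col - radius), min(width, col + radius + 1))
            ]
            pooled_row.append(sum(window) / len(window))
        pooled.append(pooled_row)
    return pooled


# A single strong activation is spread over its 3x3 neighbourhood.
grid = [
    [0.0, 0.0, 0.0],
    [0.0, 9.0, 0.0],
    [0.0, 0.0, 0.0],
]
smoothed = avg_pool_2d(grid, kernel_size=3)
```

The output keeps the spatial size of the input, so each location ends up describing a small patch rather than a single feature position.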

static configure_evaluator()

Configure the default evaluator with image-level and pixel-level AUROC and F1 metrics.

Return type:

Evaluator

configure_optimizers()

Configure optimizer.

Return type:

Optimizer

Returns:

Adam optimizer.

static configure_pre_processor(image_size=None)

Configure the default pre-processor for PatchFlow.

Matches the transforms used in the official repository, BOX-LEO/PatchFlow: Resize followed by ImageNet Normalize.

Note

The default (768, 768) is not divisible by the DINOv2 ViT patch size (14). When a DINOv2 backbone is used, the torch model automatically center-crops the input to the nearest patch-aligned size, so no manual crop_size is required.
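
The arithmetic behind this note can be checked with a short sketch. The helper patch_aligned_size is hypothetical, and it assumes the crop rounds each side down to a multiple of the patch size, since a center crop cannot enlarge the input.

```python
def patch_aligned_size(image_size, patch_size=14):
    """Largest (H, W) not exceeding image_size with both sides divisible by patch_size."""
    height, width = image_size
    return (height // patch_size * patch_size, width // patch_size * patch_size)


# 768 is not a multiple of 14, so each side shrinks to 54 patches.
aligned = patch_aligned_size((768, 768))

# 224 is already a multiple of 14, so nothing is cropped.
unchanged = patch_aligned_size((224, 224))
```

Under this assumption the default input loses 12 pixels per side in total, 6 from each edge.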

Parameters:

image_size (tuple[int, int] | None) – Target image size. Defaults to (768, 768) per the paper (Section 4.3).

Return type:

PreProcessor

Returns:

PreProcessor with the configured transforms.

property learning_type: LearningType

Return the learning type of the model.

Returns:

One-class learning.

Return type:

LearningType

property trainer_arguments: dict[str, Any]

Return PatchFlow trainer arguments.

training_step(batch, *args, **kwargs)

Perform the training step.

Parameters:
  • batch (Batch) – Input batch.

  • args – Additional arguments.

  • kwargs – Additional keyword arguments.

Return type:

Union[Tensor, Mapping[str, Any], None]

Returns:

Dictionary containing the loss value.

validation_step(batch, *args, **kwargs)

Perform the validation step.

Parameters:
  • batch (Batch) – Input batch.

  • args – Additional arguments.

  • kwargs – Additional keyword arguments.

Return type:

Union[Tensor, Mapping[str, Any], None]

Returns:

Batch dictionary containing anomaly maps.

PatchFlow Torch Model Implementation.

PatchFlow fuses multi-scale backbone features into a single tensor, compresses them with a 1x1 convolution adaptor, and runs a single normalizing flow for anomaly detection.

class anomalib.models.image.patchflow.torch_model.PatchflowModel(input_size, backbone='tf_efficientnet_b5', pre_trained=True, flow_steps=1, flow_feature_dim=128, num_scales=3, patch_size=3, flow_hidden_dim=128, crop_size=None)

Bases: Module

PatchFlow model for anomaly detection.

Extracts multi-scale features from a frozen backbone, fuses them into a single tensor, adapts the channel dimension, and passes through a single normalizing flow.
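
A 1x1 convolution adaptor is equivalent to applying the same linear map across channels at every spatial location. The sketch below illustrates only that step in pure Python. The function pointwise_projection, the toy weights, and the omission of a bias term are illustrative assumptions, and the fusion of scales is taken as already done.

```python
def pointwise_projection(features, weights):
    """Apply a bias-free 1x1 convolution.

    features: nested list of shape [C_in][H][W]
    weights:  nested list of shape [C_out][C_in]
    returns:  nested list of shape [C_out][H][W]
    """
    c_in = len(features)
    height, width = len(features[0]), len(features[0][0])
    return [
        [
            [
                # Weighted sum over input channels at one spatial location.
                sum(w_row[c] * features[c][r][col] for c in range(c_in))
                for col in range(width)
            ]
            for r in range(height)
        ]
        for w_row in weights
    ]


# Three fused input channels on a 1x2 grid, compressed to two channels.
fused = [
    [[1.0, 2.0]],
    [[3.0, 4.0]],
    [[5.0, 6.0]],
]
weights = [
    [1.0, 0.0, 1.0],  # output channel 0 = channel 0 + channel 2
    [0.0, 0.5, 0.0],  # output channel 1 = half of channel 1
]
adapted = pointwise_projection(fused, weights)
```

The spatial layout is untouched; only the channel count changes, which is how flow_feature_dim sets the width of the tensor seen by the flow.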

Parameters:
  • input_size (tuple[int, int]) – Model input size (H, W).

  • backbone (str) – timm model name or DINOv2 model name (e.g. "tf_efficientnet_b5" or "dinov2_vit_base_14").

  • pre_trained (bool) – Whether to use pre-trained backbone weights.

  • flow_steps (int) – Number of coupling blocks in the normalizing flow.

  • flow_feature_dim (int) – Channel dimension after the feature adaptor.

  • num_scales (int) – Number of input resolutions for multi-scale extraction.

  • patch_size (int) – Kernel size of the AvgPool for local aggregation.

  • flow_hidden_dim (int) – Hidden channels in the flow subnet.

  • crop_size (tuple[int, int] | None) – Optional center crop size (H, W) applied before feature extraction. The anomaly map is padded with -1 back to input_size. Defaults to None (no cropping).
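
The crop_size behaviour can be pictured as a center crop followed by padding the result back to input_size with -1. This is a pure-Python sketch on a single 2D grid: center_crop and pad_to_size are hypothetical helpers, symmetric padding that mirrors the crop is an assumption, and the cropped grid stands in for an anomaly map computed on the cropped input.

```python
def center_crop(image, crop_size):
    """Return the central crop_size (H, W) region of a 2D grid."""
    crop_h, crop_w = crop_size
    top = (len(image) - crop_h) // 2
    left = (len(image[0]) - crop_w) // 2
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def pad_to_size(anomaly_map, target_size, fill=-1.0):
    """Place anomaly_map at the center of a target_size grid filled with `fill`."""
    target_h, target_w = target_size
    map_h, map_w = len(anomaly_map), len(anomaly_map[0])
    top = (target_h - map_h) // 2
    left = (target_w - map_w) // 2
    padded = [[fill] * target_w for _ in range(target_h)]
    for r, row in enumerate(anomaly_map):
        padded[top + r][left:left + map_w] = row
    return padded


# 4x4 grid with values 0..15; the 2x2 center holds 5, 6, 9, 10.
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
cropped = center_crop(image, (2, 2))
restored = pad_to_size(cropped, (4, 4))
```

Cells outside the crop carry -1, which marks them as not scored rather than as normal or anomalous.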

forward(input_tensor)

Forward pass.

Parameters:

input_tensor (Tensor) – Input image tensor of shape (B, 3, H, W).

Returns:

During training: (hidden_variables, log_jacobians) as single tensors. During inference: InferenceBatch with pred_score and anomaly_map.

Return type:

tuple[Tensor, Tensor] | InferenceBatch

Loss function for the PatchFlow Model Implementation.

PatchFlow uses a single normalizing flow on fused features, so the loss operates on a single (B, C, H, W) hidden variable tensor and (B,) jacobians, normalized by the flattened feature dimension.

class anomalib.models.image.patchflow.loss.PatchflowLoss(*args, **kwargs)

Bases: Module

PatchFlow Loss Module.

Computes the negative log-likelihood loss normalized by the feature dimension of the fused hidden variable tensor.

static forward(hidden_variables, jacobians)

Calculate the PatchFlow loss.

Parameters:
  • hidden_variables (Tensor) – Hidden variable tensor of shape (B, C, H, W) from the normalizing flow.

  • jacobians (Tensor) – Log determinant of the Jacobian of shape (B,).

Returns:

Scalar loss value.

Return type:

Tensor
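
A normalized negative log-likelihood of this kind can be sketched in pure Python as follows. This is an illustration under stated assumptions, not the module's code: the function names are hypothetical, each sample's hidden variables are given already flattened, a standard normal base distribution is assumed, its constant term is dropped, and the batch is reduced by a mean.

```python
def nll_per_dimension(hidden_flat, log_det_jacobian):
    """Negative log-likelihood of one sample, divided by its feature dimension.

    hidden_flat: flattened hidden variables z of length C * H * W
    log_det_jacobian: scalar log |det J| of the flow for this sample
    The constant (d / 2) * log(2 * pi) is omitted (assumption).
    """
    dim = len(hidden_flat)
    squared_norm = sum(z * z for z in hidden_flat)
    return (0.5 * squared_norm - log_det_jacobian) / dim


def batch_loss(hidden_batch, log_dets):
    """Mean of the per-sample normalized losses (assumed reduction)."""
    losses = [nll_per_dimension(z, ld) for z, ld in zip(hidden_batch, log_dets)]
    return sum(losses) / len(losses)


# Two samples, each with four hidden dimensions.
hidden_batch = [[1.0, -1.0, 2.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
log_dets = [1.0, 0.0]
loss = batch_loss(hidden_batch, log_dets)
```

Dividing by the feature dimension keeps the loss scale comparable when the input size or flow_feature_dim changes.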

PatchFlow Anomaly Map Generator Implementation.

This module generates anomaly heatmaps from the single fused hidden variable tensor produced by the PatchFlow normalizing flow.

class anomalib.models.image.patchflow.anomaly_map.AnomalyMapGenerator(input_size)

Bases: Module

Generate anomaly heatmap from PatchFlow hidden variables.

Unlike FastFlow, which averages maps from multiple flow blocks, PatchFlow operates on a single fused tensor.

Parameters:

input_size (ListConfig | tuple) – Target size (height, width) for the anomaly map.

forward(hidden_variables)

Generate anomaly heatmap from hidden variables.

Parameters:

hidden_variables (Tensor) – Hidden variable tensor of shape (N, C, H, W) from the normalizing flow.

Returns:

Anomaly heatmap of shape (N, 1, H, W).

Return type:

Tensor
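
One way to turn hidden variables into a per-location score is sketched below for a single image in pure Python. It borrows the FastFlow-style convention of a negated probability computed from the channel-wise mean of squared hidden variables. Whether PatchFlow uses exactly this formula is an assumption, the function name is hypothetical, and upsampling to input_size is left out.

```python
import math


def anomaly_map_from_hidden(hidden):
    """Map hidden variables of shape [C][H][W] to a score grid of shape [H][W].

    Score = -exp(-0.5 * mean_c(z^2)): about -1 where z is near zero (likely
    under the flow) and close to 0 where z is large (unlikely, so anomalous).
    """
    channels = len(hidden)
    height, width = len(hidden[0]), len(hidden[0][0])
    score_map = []
    for r in range(height):
        row = []
        for c in range(width):
            mean_sq = sum(hidden[ch][r][c] ** 2 for ch in range(channels)) / channels
            log_prob = -0.5 * mean_sq
            row.append(-math.exp(log_prob))
        score_map.append(row)
    return score_map


# Two channels on a 1x2 grid: the left location is typical, the right is not.
hidden = [
    [[0.0, 2.0]],
    [[0.0, 2.0]],
]
scores = anomaly_map_from_hidden(hidden)
```

With this sign convention a higher (less negative) value means a more anomalous location, so the right-hand location scores above the left-hand one.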