L2BT

Architecture

Learning to Be a Transformer to Pinpoint Anomalies.

This module implements the L2BT model for anomaly detection as described in Costanzino et al. (2025).

The model consists of:

A pre-trained Vision Transformer teacher that extracts patch embeddings
Two shallow student MLPs (backward_net and forward_net) that learn to match teacher patch embeddings
Feature distillation between teacher and student representations
Anomaly detection based on student ability to reconstruct teacher features
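
The last item above, scoring by how well a student reconstructs the teacher's features, can be illustrated with a minimal, dependency-free sketch. It is not the anomalib implementation: the helper names `cosine_distance` and `patch_anomaly_scores` are hypothetical, and the choice of cosine distance as the discrepancy measure is an assumption made only for illustration.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Similarity is undefined for a zero vector; treat it as a full mismatch.
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def patch_anomaly_scores(teacher_patches, student_patches):
    """Score each patch by the gap between teacher embedding and student prediction.

    Assumption: cosine distance stands in for whatever discrepancy the real model uses.
    """
    if len(teacher_patches) != len(student_patches):
        raise ValueError("teacher and student must cover the same patches")
    return [cosine_distance(t, s) for t, s in zip(teacher_patches, student_patches)]


# Three toy patches with 2-dimensional embeddings.
teacher = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
student = [[1.0, 0.0], [0.0, 2.0], [1.0, -1.0]]
scores = patch_anomaly_scores(teacher, student)
```

In this toy input the student matches the direction of the first two teacher embeddings, so those patches score near zero, while the third prediction is orthogonal to the teacher embedding and receives the highest score.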

Example

>>> from anomalib.models.image import L2BT
>>> from anomalib.engine import Engine
>>> from anomalib.data import MVTecAD
>>> datamodule = MVTecAD()
>>> model = L2BT(
...     layers=(7, 11),
...     topk_ratio=0.001
... )
>>> engine = Engine(model=model, datamodule=datamodule)
>>> engine.fit()
>>> predictions = engine.predict()

See also

L2BT: Lightning implementation of the model
L2BTModel: PyTorch implementation of the model architecture

class anomalib.models.image.l2bt.lightning_model.L2BT(lr=0.0001, layers=(7, 11), blur_w_l=5, blur_w_u=7, blur_pad_l=2, blur_pad_u=3, blur_repeats_l=5, blur_repeats_u=3, topk_ratio=0.001, pre_processor=True, post_processor=True, evaluator=True, visualizer=True)

Bases: AnomalibModule

Learning to Be a Transformer algorithm.

The L2BT model consists of a pre-trained Vision Transformer teacher that extracts patch embeddings and two shallow student MLPs (backward_net and forward_net) that learn to match the teacher’s patch embeddings. Anomalies are detected from how well the students reconstruct the teacher’s embeddings: reconstruction degrades on patterns unlike the normal images, and that degradation indicates an anomaly.

Parameters:

lr (float) – Learning rate for student network optimization. Defaults to 1e-4.
layers (Sequence[int]) – Indices of Vision Transformer layers used for feature extraction. Must be a sequence of exactly two indices. Defaults to (7, 11).
blur_w_l (int) – Lower bound for blur kernel width in augmentation. Defaults to 5.
blur_w_u (int) – Upper bound for blur kernel width in augmentation. Defaults to 7.
blur_pad_l (int) – Lower bound for blur padding in augmentation. Defaults to 2.
blur_pad_u (int) – Upper bound for blur padding in augmentation. Defaults to 3.
blur_repeats_l (int) – Number of repetitions for lower blur kernel. Defaults to 5.
blur_repeats_u (int) – Number of repetitions for upper blur kernel. Defaults to 3.
topk_ratio (float) – Fraction of highest anomaly-map values to use for image-level anomaly scoring. Defaults to 0.001.
pre_processor (PreProcessor | bool) – Pre-processor to transform input data before passing to model. If True, uses default. Defaults to True.
post_processor (PostProcessor | bool) – Post-processor to generate predictions from model outputs. If True, uses default. Defaults to True.
evaluator (Evaluator | bool) – Evaluator to compute metrics. If True, uses default. Defaults to True.
visualizer (Visualizer | bool) – Visualizer to display results. If True, uses default. Defaults to True.
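
One way to read the blur parameters above is as a stride-1 mean filter applied several times. Under that assumption, a padding of (width - 1) / 2 keeps the output the same size as the input, which is what the default pairs (width 5 with padding 2, width 7 with padding 3) provide. The sketch below is a 1-D illustration only; `box_blur_1d` and `repeated_blur` are hypothetical helpers, not anomalib functions, and the repeat counts are taken from the class signature.

```python
def box_blur_1d(signal, width, pad):
    """One pass of a stride-1 mean filter with zero padding on both sides."""
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    out_len = len(padded) - width + 1
    if out_len <= 0:
        raise ValueError("kernel is wider than the padded signal")
    return [sum(padded[i:i + width]) / width for i in range(out_len)]


def repeated_blur(signal, width, pad, repeats):
    """Apply the mean filter `repeats` times; more passes give a smoother result."""
    for _ in range(repeats):
        signal = box_blur_1d(signal, width, pad)
    return signal


# A single spike in the middle of nine samples.
signal = [0.0] * 4 + [1.0] + [0.0] * 4
lower = repeated_blur(signal, width=5, pad=2, repeats=5)
upper = repeated_blur(signal, width=7, pad=3, repeats=3)
```

Both results have the same length as the input, and the spike is spread over its neighbours while remaining centred.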

Example

>>> from anomalib.models.image import L2BT
>>> from anomalib.data import MVTecAD
>>> from anomalib.engine import Engine
>>> datamodule = MVTecAD()
>>> model = L2BT(
...     layers=(7, 11),
...     topk_ratio=0.001
... )
>>> engine = Engine(model=model, datamodule=datamodule)
>>> engine.fit()
>>> predictions = engine.predict()
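
The example sets `topk_ratio=0.001`, documented as the fraction of highest anomaly-map values used for image-level scoring. A minimal sketch of that idea follows. The function `topk_image_score` is hypothetical, and both the use of the mean as the aggregate and the rounding of the count to at least one value are assumptions; the library's exact computation is not stated here.

```python
def topk_image_score(anomaly_map, topk_ratio=0.001):
    """Average the highest `topk_ratio` fraction of values in a 2-D anomaly map.

    Assumptions: the aggregate is a mean, and at least one value is always used.
    """
    if not 0.0 < topk_ratio <= 1.0:
        raise ValueError("topk_ratio must be in (0, 1]")
    values = [v for row in anomaly_map for v in row]
    if not values:
        raise ValueError("anomaly_map is empty")
    k = max(1, int(len(values) * topk_ratio))
    top_values = sorted(values, reverse=True)[:k]
    return sum(top_values) / k


anomaly_map = [
    [0.1, 0.2],
    [0.9, 0.3],
]
score_default = topk_image_score(anomaly_map)               # k falls back to 1
score_half = topk_image_score(anomaly_map, topk_ratio=0.5)  # k = 2
```

A small ratio makes the image score sensitive to a few strongly anomalous pixels, while a ratio of 1.0 reduces to the mean of the whole map.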
