MVTecLOCO Datamodule

MVTec LOCO Data Module.

This module provides a PyTorch Lightning DataModule for the MVTec LOCO dataset. The dataset contains 5 categories of industrial objects with both normal and anomalous samples. Each category includes RGB images and pixel-level ground truth masks for anomaly segmentation.

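The dataset distinguishes between structural anomalies (local defects such as scratches, dents, or contaminations) and logical anomalies (violations of logical constraints, such as a missing, surplus, or misplaced object).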

Example

Create an MVTec LOCO datamodule:

>>> from anomalib.data import MVTecLOCO
>>> datamodule = MVTecLOCO(
...     root="./datasets/MVTec_LOCO",
...     category="breakfast_box"
... )

Notes

The dataset will be automatically downloaded and converted to the required format when first used. The directory structure after preparation will be:

datasets/
└── MVTec_LOCO/
    ├── breakfast_box/
    ├── juice_bottle/
    └── ...
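
To check which category folders are present after preparation, a small sketch like the one below may help. It relies only on the layout shown above; `list_categories` is a hypothetical helper and not part of anomalib, and it does not validate the contents of each folder.

```python
from __future__ import annotations

from pathlib import Path


def list_categories(root: str | Path) -> list[str]:
    """Return the sorted names of the sub-directories found under ``root``.

    Hypothetical helper (not an anomalib API): it inspects the folder
    layout only and returns an empty list if ``root`` does not exist.
    """
    root = Path(root)
    if not root.is_dir():
        return []
    # Keep directories only; stray files next to the categories are ignored.
    return sorted(p.name for p in root.iterdir() if p.is_dir())
```

Calling `list_categories("./datasets/MVTec_LOCO")` on a prepared dataset should return the category names that can be passed as `category`.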

License:

MVTec LOCO dataset is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0). https://creativecommons.org/licenses/by-nc-sa/4.0/

Reference:

Bergmann, P., Batzner, K., Fauser, M., Sattlegger, D., & Steger, C. (2022). Beyond Dents and Scratches: Logical Constraints in Unsupervised Anomaly Detection and Localization. International Journal of Computer Vision, 130, 947-969.

class anomalib.data.datamodules.image.mvtec_loco.MVTecLOCO(root='./datasets/MVTec_LOCO', category='breakfast_box', train_batch_size=32, eval_batch_size=32, num_workers=8, train_augmentations=None, val_augmentations=None, test_augmentations=None, augmentations=None, test_split_mode=TestSplitMode.FROM_DIR, val_split_mode=ValSplitMode.FROM_DIR, test_split_ratio=None, val_split_ratio=None, seed=None)

Bases: AnomalibDataModule

MVTec LOCO Datamodule.

Parameters:
  • root (Path | str) – Path to the root of the dataset. Defaults to "./datasets/MVTec_LOCO".

  • category (str) – Category of the MVTec LOCO dataset (e.g. "breakfast_box" or "juice_bottle"). Defaults to "breakfast_box".

  • train_batch_size (int, optional) – Training batch size. Defaults to 32.

  • eval_batch_size (int, optional) – Batch size for validation and testing. Defaults to 32.

  • num_workers (int, optional) – Number of workers. Defaults to 8.

  • train_augmentations (Transform | None) – Augmentations to apply to the training images. Defaults to None.

  • val_augmentations (Transform | None) – Augmentations to apply to the validation images. Defaults to None.

  • test_augmentations (Transform | None) – Augmentations to apply to the test images. Defaults to None.

  • augmentations (Transform | None) – General augmentations to apply if stage-specific augmentations are not provided. Defaults to None.

  • test_split_mode (TestSplitMode | str) – Method to create test set. Defaults to TestSplitMode.FROM_DIR.

  • val_split_mode (ValSplitMode | str) – Method to create validation set. Defaults to ValSplitMode.FROM_DIR.

  • test_split_ratio (float | None) – Fraction of data to use for testing. Defaults to None.

  • val_split_ratio (float | None) – Fraction of data to use for validation. Defaults to None.

  • seed (int | None, optional) – Seed for reproducibility. Defaults to None.
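
The relationship between `augmentations` and the stage-specific arguments can be pictured as a simple fallback. The sketch below only illustrates the rule described in the parameter list; it is not anomalib's implementation. `resolve_augmentations` is a hypothetical name, and any object (here, plain strings) can stand in for a `Transform`.

```python
from __future__ import annotations

from typing import Any


def resolve_augmentations(
    train: Any | None = None,
    val: Any | None = None,
    test: Any | None = None,
    general: Any | None = None,
) -> dict[str, Any | None]:
    """Pick the stage-specific augmentation if given, else the general one.

    Hypothetical illustration of the fallback rule, not an anomalib API.
    """
    return {
        "train": train if train is not None else general,
        "val": val if val is not None else general,
        "test": test if test is not None else general,
    }
```

Under this rule, passing only `augmentations` applies the same transform to every stage, while a stage-specific argument overrides it for that stage alone.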

Example

Create MVTec LOCO datamodule with default settings:

>>> datamodule = MVTecLOCO()
>>> datamodule.setup()
>>> i, data = next(enumerate(datamodule.train_dataloader()))
>>> data.keys()
dict_keys(['image_path', 'label', 'image', 'mask_path', 'mask'])

>>> data["image"].shape
torch.Size([32, 3, 256, 256])

Change the category:

>>> datamodule = MVTecLOCO(category="juice_bottle")

Create validation set from test data:

>>> from anomalib.data.utils import ValSplitMode
>>> datamodule = MVTecLOCO(
...     val_split_mode=ValSplitMode.FROM_TEST,
...     val_split_ratio=0.1
... )
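
Conceptually, a ratio-based split with a fixed seed moves a reproducible random fraction of the test samples into the validation set. The following is a minimal sketch of that idea in plain Python. It is not anomalib's splitting code, `split_by_ratio` is a hypothetical helper, and the rounding (floor) and lack of label stratification are assumptions made for the illustration.

```python
from __future__ import annotations

import random
from typing import Sequence, TypeVar

T = TypeVar("T")


def split_by_ratio(
    items: Sequence[T],
    ratio: float,
    seed: int | None = None,
) -> tuple[list[T], list[T]]:
    """Split ``items`` into ``(held_out, remaining)``.

    ``held_out`` receives ``floor(len(items) * ratio)`` randomly chosen
    samples. Hypothetical illustration, not an anomalib API.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be in [0, 1]")
    indices = list(range(len(items)))
    # A dedicated Random instance keeps the split reproducible for a given
    # seed without touching the global random state.
    random.Random(seed).shuffle(indices)
    n_held_out = int(len(items) * ratio)
    # Sort the chosen indices so each subset keeps the original ordering.
    held_out = [items[i] for i in sorted(indices[:n_held_out])]
    remaining = [items[i] for i in sorted(indices[n_held_out:])]
    return held_out, remaining
```

With 20 test samples and `ratio=0.1`, two samples would end up in the held-out subset and 18 would remain, and repeating the call with the same `seed` gives the same partition.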

Create synthetic validation set:

>>> datamodule = MVTecLOCO(
...     val_split_mode=ValSplitMode.SYNTHETIC,
...     val_split_ratio=0.2
... )

See also

../../datasets/image/mvtecloco - MVTec LOCO Dataset