MVTecLOCO Datamodule
MVTec LOCO Data Module.
This module provides a PyTorch Lightning DataModule for the MVTec LOCO dataset. The dataset contains 5 categories of industrial objects with both normal and anomalous samples. Each category includes RGB images and pixel-level ground truth masks for anomaly segmentation.
The dataset distinguishes between structural anomalies (local defects) and logical anomalies (global defects).
Example
Create an MVTec LOCO datamodule:
>>> from anomalib.data import MVTecLOCO
>>> datamodule = MVTecLOCO(
...     root="./datasets/MVTec_LOCO",
...     category="breakfast_box"
... )
Notes
The dataset will be automatically downloaded and converted to the required format when first used. The directory structure after preparation will be:
datasets/
└── MVTec_LOCO/
    ├── breakfast_box/
    ├── juice_bottle/
    └── ...
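As a quick sanity check after preparation, the category folders can be enumerated with the standard library. The following is a minimal sketch that assumes the layout shown above; `list_categories` is a hypothetical helper written for illustration and is not part of anomalib.

```python
from __future__ import annotations

from pathlib import Path


def list_categories(root: str | Path) -> list[str]:
    """Return the sorted names of the sub-directories directly under ``root``.

    Hypothetical helper: it only inspects the file system and assumes that
    each category lives in its own folder below the dataset root.
    """
    root = Path(root)
    if not root.is_dir():
        # Dataset not prepared yet (or wrong path): nothing to list.
        return []
    return sorted(entry.name for entry in root.iterdir() if entry.is_dir())


if __name__ == "__main__":
    # Path taken from the example above; adjust it to your own setup.
    print(list_categories("./datasets/MVTec_LOCO"))
```

An empty list usually means the dataset has not been downloaded yet or that `root` points to the wrong directory.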
- License:
MVTec LOCO dataset is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0). https://creativecommons.org/licenses/by-nc-sa/4.0/
- Reference:
Bergmann, P., Fauser, M., Sattlegger, D., & Steger, C. (2022). MVTec LOCO - A Dataset for Detecting Logical Anomalies in Images. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
- class anomalib.data.datamodules.image.mvtec_loco.MVTecLOCO(root='./datasets/MVTec_LOCO', category='breakfast_box', train_batch_size=32, eval_batch_size=32, num_workers=8, train_augmentations=None, val_augmentations=None, test_augmentations=None, augmentations=None, test_split_mode=TestSplitMode.FROM_DIR, val_split_mode=ValSplitMode.FROM_DIR, test_split_ratio=None, val_split_ratio=None, seed=None)
Bases: AnomalibDataModule
MVTec LOCO Datamodule.
- Parameters:
root (Path | str) – Path to the root of the dataset. Defaults to "./datasets/MVTec_LOCO".
category (str) – Category of the MVTec LOCO dataset (e.g. "breakfast_box" or "juice_bottle"). Defaults to "breakfast_box".
train_batch_size (int, optional) – Training batch size. Defaults to 32.
eval_batch_size (int, optional) – Test batch size. Defaults to 32.
num_workers (int, optional) – Number of workers. Defaults to 8.
train_augmentations (Transform | None) – Augmentations to apply to the training images. Defaults to None.
val_augmentations (Transform | None) – Augmentations to apply to the validation images. Defaults to None.
test_augmentations (Transform | None) – Augmentations to apply to the test images. Defaults to None.
augmentations (Transform | None) – General augmentations to apply if stage-specific augmentations are not provided. Defaults to None.
test_split_mode (TestSplitMode | str) – Method to create test set. Defaults to TestSplitMode.FROM_DIR.
val_split_mode (ValSplitMode | str) – Method to create validation set. Defaults to ValSplitMode.FROM_DIR.
test_split_ratio (float | None) – Fraction of data to use for testing. Defaults to None.
val_split_ratio (float | None) – Fraction of data to use for validation. Defaults to None.
seed (int | None, optional) – Seed for reproducibility. Defaults to None.
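The `augmentations` parameter is described as a fallback that applies when no stage-specific augmentations are provided. The rule can be sketched in plain Python as follows. This illustrates the documented behaviour only and is not anomalib's actual implementation; `resolve_augmentations` is a hypothetical name, and plain strings stand in for `Transform` objects.

```python
from __future__ import annotations

from typing import Any


def resolve_augmentations(
    train: Any = None,
    val: Any = None,
    test: Any = None,
    general: Any = None,
) -> dict[str, Any]:
    """Use each stage-specific value if given, otherwise fall back to ``general``."""
    stages = {"train": train, "val": val, "test": test}
    return {
        # ``None`` means "not provided", so the general value takes over.
        stage: (specific if specific is not None else general)
        for stage, specific in stages.items()
    }


if __name__ == "__main__":
    # Only the training stage has its own setting; the others use the general one.
    print(resolve_augmentations(train="random_flip", general="center_crop"))
```

Under this reading, passing only `augmentations` configures all three stages at once, while a stage-specific argument overrides it for that stage alone.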
Example
Create MVTec LOCO datamodule with default settings:
>>> datamodule = MVTecLOCO()
>>> datamodule.setup()
>>> i, data = next(enumerate(datamodule.train_dataloader()))
>>> data.keys()
dict_keys(['image_path', 'label', 'image', 'mask_path', 'mask'])
>>> data["image"].shape
torch.Size([32, 3, 256, 256])
Change the category:
>>> datamodule = MVTecLOCO(category="juice_bottle")
Create validation set from test data:
>>> from anomalib.data.utils import ValSplitMode
>>> datamodule = MVTecLOCO(
...     val_split_mode=ValSplitMode.FROM_TEST,
...     val_split_ratio=0.1
... )
Create synthetic validation set:
>>> datamodule = MVTecLOCO(
...     val_split_mode=ValSplitMode.SYNTHETIC,
...     val_split_ratio=0.2
... )
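To make the role of `val_split_ratio` and `seed` in the examples above concrete, here is a minimal sketch of a seeded, ratio-based split in plain Python. It assumes a simple random hold-out; anomalib's own splitting logic may differ (for example, by handling labels separately), and the sketch does not model synthetic anomaly generation. `split_by_ratio` is a hypothetical helper.

```python
from __future__ import annotations

import random
from typing import Sequence, TypeVar

T = TypeVar("T")


def split_by_ratio(
    items: Sequence[T],
    ratio: float,
    seed: int | None = None,
) -> tuple[list[T], list[T]]:
    """Randomly hold out ``ratio`` of ``items``; return (remaining, held_out)."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError(f"ratio must be in [0, 1], got {ratio}")
    # A dedicated generator keeps the split reproducible for a given seed
    # without touching the global random state.
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_held_out = int(len(shuffled) * ratio)
    return shuffled[n_held_out:], shuffled[:n_held_out]


if __name__ == "__main__":
    test_items = [f"test_{i:03d}.png" for i in range(100)]
    remaining, validation = split_by_ratio(test_items, ratio=0.1, seed=42)
    print(len(remaining), len(validation))
```

Fixing the seed makes the split repeatable across runs, which is the purpose the `seed` parameter is documented to serve.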
See also
../../datasets/image/mvtecloco - MVTec LOCO Dataset