Visa Datamodule
Visual Anomaly (VisA) Data Module.
This module provides a PyTorch Lightning DataModule for the Visual Anomaly (VisA) dataset. If the dataset is not available locally, it will be downloaded and extracted automatically.
Example
Create a VisA datamodule:
>>> from anomalib.data import Visa
>>> datamodule = Visa(
... root="./datasets/visa",
... category="capsules"
... )
Notes
The dataset will be automatically downloaded and converted to the required format when first used. The directory structure after preparation will be:
datasets/
└── visa/
    ├── visa_pytorch/
    │   ├── candle/
    │   ├── capsules/
    │   └── ...
    └── VisA_20220922.tar
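To check which categories have already been prepared, the layout above can be inspected directly. The following is a minimal sketch that assumes only the folder names shown in the tree; `list_prepared_categories` is a hypothetical helper and not part of anomalib.

```python
from __future__ import annotations

from pathlib import Path


def list_prepared_categories(root: str | Path) -> list[str]:
    """Return the category folders found under ``<root>/visa_pytorch``.

    Hypothetical helper: it relies only on the directory layout described
    in the notes above, not on any anomalib API.
    """
    split_root = Path(root) / "visa_pytorch"
    if not split_root.is_dir():
        # The dataset has not been downloaded or converted yet.
        return []
    # Every sub-directory of visa_pytorch/ is treated as one category.
    return sorted(p.name for p in split_root.iterdir() if p.is_dir())
```

Calling `list_prepared_categories("./datasets/visa")` before the first use of the datamodule would return an empty list, since `visa_pytorch/` does not exist yet.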
- License:
The VisA dataset is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0). https://creativecommons.org/licenses/by-nc-sa/4.0/
- Reference:
Zou, Y., Jeong, J., Pemula, L., Zhang, D., & Dabeer, O. (2022). SPot-the-Difference Self-supervised Pre-training for Anomaly Detection and Segmentation. In European Conference on Computer Vision (pp. 392-408). Springer, Cham.
- class anomalib.data.datamodules.image.visa.Visa(root='./datasets/visa', category='capsules', train_batch_size=32, eval_batch_size=32, num_workers=8, train_augmentations=None, val_augmentations=None, test_augmentations=None, augmentations=None, test_split_mode=TestSplitMode.FROM_DIR, test_split_ratio=0.2, val_split_mode=ValSplitMode.SAME_AS_TEST, val_split_ratio=0.5, seed=None)
Bases: AnomalibDataModule
VisA Datamodule.
- Parameters:
root (Path | str) – Path to the root of the dataset. Defaults to "./datasets/visa".
category (str) – Category of the VisA dataset (e.g. "candle"). Defaults to "capsules".
train_batch_size (int, optional) – Training batch size. Defaults to 32.
eval_batch_size (int, optional) – Test batch size. Defaults to 32.
num_workers (int, optional) – Number of workers for data loading. Defaults to 8.
train_augmentations (Transform | None) – Augmentations to apply to the training images. Defaults to None.
val_augmentations (Transform | None) – Augmentations to apply to the validation images. Defaults to None.
test_augmentations (Transform | None) – Augmentations to apply to the test images. Defaults to None.
augmentations (Transform | None) – General augmentations to apply if stage-specific augmentations are not provided. Defaults to None.
test_split_mode (TestSplitMode | str) – Method to create test set. Defaults to TestSplitMode.FROM_DIR.
test_split_ratio (float) – Fraction of data to use for testing. Defaults to 0.2.
val_split_mode (ValSplitMode | str) – Method to create validation set. Defaults to ValSplitMode.SAME_AS_TEST.
val_split_ratio (float) – Fraction of data to use for validation. Defaults to 0.5.
seed (int | None, optional) – Random seed for reproducibility. Defaults to None.
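The fallback described for `augmentations` (used only when no stage-specific augmentation is given) can be illustrated with a small sketch. `resolve_augmentations` is a hypothetical function written for this illustration; how anomalib resolves the arguments internally is not shown here, and plain strings stand in for real `Transform` objects.

```python
from __future__ import annotations

from typing import Any


def resolve_augmentations(stage_specific: Any | None, general: Any | None) -> Any | None:
    """Pick the augmentation for one stage (train, val, or test).

    Hypothetical illustration of the documented rule: the stage-specific
    value wins, and the general value is used only when it is missing.
    """
    return stage_specific if stage_specific is not None else general


# Placeholder strings stand in for Transform objects.
train_aug = resolve_augmentations("train_only_flip", "general_resize")
val_aug = resolve_augmentations(None, "general_resize")
test_aug = resolve_augmentations(None, None)
```

Under this reading, passing only `augmentations` applies the same transform to all three stages, while passing nothing leaves every stage without augmentation.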
- apply_cls1_split()
Apply the 1-class subset splitting using the fixed split in the CSV file.
Adapted from amazon-science/spot-diff.
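As a rough illustration of a fixed CSV-driven split, the sketch below maps each listed image to a destination folder matching the prepared layout (`train/good`, `test/good`, `test/bad`, `ground_truth/bad`). The column names (`object`, `split`, `label`, `image`, `mask`), the label value `normal`, and the sample rows are assumptions made for this example; `plan_split` is a hypothetical helper that only computes paths and copies nothing.

```python
from __future__ import annotations

import csv
import io
from pathlib import PurePosixPath

# Assumed CSV layout; the file names below are invented for illustration.
SAMPLE_CSV = """object,split,label,image,mask
capsules,train,normal,capsules/Data/Images/Normal/000.JPG,
capsules,test,anomaly,capsules/Data/Images/Anomaly/001.JPG,capsules/Data/Masks/Anomaly/001.png
candle,test,normal,candle/Data/Images/Normal/0002.JPG,
"""


def plan_split(csv_text: str, category: str) -> list[tuple[str, str]]:
    """Return (source, destination) pairs for one category."""
    plan: list[tuple[str, str]] = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["object"] != category:
            continue  # skip rows that belong to other categories
        # Normal images go to good/, everything else to bad/.
        folder = "good" if row["label"] == "normal" else "bad"
        image_name = PurePosixPath(row["image"]).name
        dest = PurePosixPath(category) / row["split"] / folder / image_name
        plan.append((row["image"], str(dest)))
        if row["mask"]:
            # Masks exist only for anomalous images and go to ground_truth/.
            mask_name = PurePosixPath(row["mask"]).name
            mask_dest = PurePosixPath(category) / "ground_truth" / folder / mask_name
            plan.append((row["mask"], str(mask_dest)))
    return plan


plan = plan_split(SAMPLE_CSV, "capsules")
```

Because the split is read from a file rather than sampled, the same images end up in the same folders on every run.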
- Return type:
None
- prepare_data()
Download and prepare the dataset if not available.
This method checks whether the dataset exists and is properly formatted. If not, it downloads and prepares the data. One of the following cases applies:
1. If the processed dataset exists (visa_pytorch/{category}), do nothing.
2. If the raw dataset exists but isn’t processed, apply the train/test split.
3. If the dataset doesn’t exist, download, extract, and process it.
The final directory structure will be:
datasets/
└── visa/
    ├── visa_pytorch/
    │   ├── candle/
    │   │   ├── train/
    │   │   │   └── good/
    │   │   ├── test/
    │   │   │   ├── good/
    │   │   │   └── bad/
    │   │   └── ground_truth/
    │   │       └── bad/
    │   └── ...
    └── VisA_20220922.tar
- Return type:
None
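The three cases handled by `prepare_data()` can be sketched as a simple decision function. This is not the library's implementation: `preparation_step` is a hypothetical helper, and the location of the raw (unprocessed) data at `<root>/<category>` is an assumption made for this example.

```python
from __future__ import annotations

from pathlib import Path


def preparation_step(root: str | Path, category: str) -> str:
    """Decide which preparation case applies for a category.

    Returns "nothing", "split", or "download".
    """
    root = Path(root)
    # Case 1: the processed dataset is already in visa_pytorch/{category}.
    if (root / "visa_pytorch" / category).is_dir():
        return "nothing"
    # Case 2: raw data is present but not yet split.
    # Assumption: raw data lives at <root>/<category>.
    if (root / category).is_dir():
        return "split"
    # Case 3: nothing is available locally.
    return "download"
```

For a fresh `root`, `preparation_step(root, "capsules")` returns `"download"`. Once the processed folder exists, it returns `"nothing"`, which is why repeated calls are cheap.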
See also
../../datasets/image/visa - VisA Dataset