Visa Datamodule
Visual Anomaly (VisA) Data Module.
This module provides a PyTorch Lightning DataModule for the Visual Anomaly (VisA) dataset. If the dataset is not available locally, it will be downloaded and extracted automatically.
Example
Create a VisA datamodule:
>>> from anomalib.data import Visa
>>> datamodule = Visa(
...     root="./datasets/visa",
...     category="capsules"
... )
Notes
The dataset will be automatically downloaded and converted to the required format when first used. The directory structure after preparation will be:
datasets/
└── visa/
    ├── visa_pytorch/
    │   ├── candle/
    │   ├── capsules/
    │   └── ...
    └── VisA_20220922.tar
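As a rough sketch of how the layout above could be inspected, the helper below lists the category folders found under `visa_pytorch/`. The function name `prepared_categories` is hypothetical and not part of anomalib; it assumes only the directory structure shown above.

```python
from __future__ import annotations

from pathlib import Path


def prepared_categories(root: str | Path) -> list[str]:
    """List category folders under <root>/visa_pytorch.

    Hypothetical helper for illustration; not part of anomalib.
    """
    split_root = Path(root) / "visa_pytorch"
    if not split_root.is_dir():
        # Nothing has been prepared under this root yet.
        return []
    # Only directories count as categories; the .tar archive lives one level up.
    return sorted(p.name for p in split_root.iterdir() if p.is_dir())
```

Called as `prepared_categories("./datasets/visa")`, it returns an empty list until the dataset has been prepared at least once.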
- License:
The VisA dataset is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0). https://creativecommons.org/licenses/by-nc-sa/4.0/
- Reference:
Zou, Y., Jeong, J., Pemula, L., Zhang, D., & Dabeer, O. (2022). SPot-the-Difference Self-supervised Pre-training for Anomaly Detection and Segmentation. In European Conference on Computer Vision (pp. 392-408). Springer, Cham.
- class anomalib.data.datamodules.image.visa.Visa(root='./datasets/visa', category='capsules', train_batch_size=32, eval_batch_size=32, num_workers=8, train_augmentations=None, val_augmentations=None, test_augmentations=None, augmentations=None, test_split_mode=TestSplitMode.FROM_DIR, test_split_ratio=0.2, val_split_mode=ValSplitMode.SAME_AS_TEST, val_split_ratio=0.5, seed=None)
Bases: AnomalibDataModule
VisA Datamodule.
- Parameters:
root (Path | str | None) – Path to the root of the dataset. Defaults to "./datasets/visa".
category (str) – Category of the VisA dataset (e.g. "candle"). Defaults to "capsules".
train_batch_size (int) – Training batch size. Defaults to 32.
eval_batch_size (int) – Test batch size. Defaults to 32.
num_workers (int) – Number of workers for data loading. Defaults to 8.
train_augmentations (Transform | None) – Augmentations to apply to the training images. Defaults to None.
val_augmentations (Transform | None) – Augmentations to apply to the validation images. Defaults to None.
test_augmentations (Transform | None) – Augmentations to apply to the test images. Defaults to None.
augmentations (Transform | None) – General augmentations to apply if stage-specific augmentations are not provided. Defaults to None.
test_split_mode (TestSplitMode | str) – Method to create test set. Defaults to TestSplitMode.FROM_DIR.
test_split_ratio (float) – Fraction of data to use for testing. Defaults to 0.2.
val_split_mode (ValSplitMode | str) – Method to create validation set. Defaults to ValSplitMode.SAME_AS_TEST.
val_split_ratio (float) – Fraction of data to use for validation. Defaults to 0.5.
seed (int | None) – Random seed for reproducibility. Defaults to None.
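The parameters above can be passed as keyword arguments. The snippet below is a minimal sketch that overrides a few defaults; the chosen values are illustrative only, and the import is guarded so the snippet still runs where anomalib is not installed.

```python
# Keyword names come from the documented signature; the values are
# illustrative assumptions, not recommended settings.
visa_kwargs = {
    "root": "./datasets/visa",
    "category": "candle",
    "train_batch_size": 16,
    "eval_batch_size": 16,
    "num_workers": 4,
    "seed": 42,
}

try:
    from anomalib.data import Visa
except ImportError:
    Visa = None  # anomalib is not installed; only the kwargs are built

if Visa is not None:
    # Constructing the datamodule only stores the configuration.
    datamodule = Visa(**visa_kwargs)
```

Parameters left out of the dictionary keep the defaults listed above.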
- apply_cls1_split()
Apply the 1-class subset splitting using the fixed split in the csv file.
Adapted from amazon-science/spot-diff.
- Return type:
None
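To illustrate what a fixed, CSV-driven split can look like, the sketch below maps each CSV row to a destination path in the `train/good`, `test/good`, `test/bad` layout. It is not the anomalib implementation: the function name `plan_cls1_split`, the column names (`object`, `split`, `label`, `image`, `mask`) and the label value `"normal"` are assumptions, and the sketch only plans the moves without copying any files.

```python
from __future__ import annotations

import csv
import io
from pathlib import Path


def plan_cls1_split(csv_text: str, split_root: Path) -> list[tuple[str, Path]]:
    """Return (source image path, destination path) pairs for each CSV row.

    Hypothetical helper. ASSUMED columns: object, split, label, image, mask.
    """
    plan = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # ASSUMPTION: normal images are labelled "normal"; any other label
        # is treated as anomalous and goes to the "bad" folder.
        label_dir = "good" if row["label"] == "normal" else "bad"
        image_name = Path(row["image"]).name
        destination = (
            split_root / row["object"] / row["split"] / label_dir / image_name
        )
        plan.append((row["image"], destination))
    return plan
```

Because the split is read from a file rather than sampled, the same rows always land in the same folders, which keeps results comparable across runs.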
- prepare_data()
Download and prepare the dataset if not available.
This method checks if the dataset exists and is properly formatted. If not, it downloads and prepares the data in the following steps:
- If the processed dataset exists (visa_pytorch/{category}), do nothing.
- If the raw dataset exists but isn’t processed, apply the train/test split.
- If the dataset doesn’t exist, download, extract, and process it.
The final directory structure will be:
datasets/
└── visa/
    ├── visa_pytorch/
    │   ├── candle/
    │   │   ├── train/
    │   │   │   └── good/
    │   │   ├── test/
    │   │   │   ├── good/
    │   │   │   └── bad/
    │   │   └── ground_truth/
    │   │       └── bad/
    │   └── ...
    └── VisA_20220922.tar
- Return type:
None
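The three cases that prepare_data() distinguishes can be sketched as a small decision function. This is an illustration under stated assumptions, not the library code: the name `preparation_step` is hypothetical, and the check for an unprocessed raw dataset (a `{category}` folder directly under the root) is an assumption about the extracted archive layout.

```python
from __future__ import annotations

from pathlib import Path


def preparation_step(root: str | Path, category: str) -> str:
    """Decide which preparation case applies (hypothetical helper)."""
    root = Path(root)
    if (root / "visa_pytorch" / category).is_dir():
        # Processed data is already in place.
        return "nothing"
    # ASSUMPTION: raw, unsplit data sits directly under <root>/<category>.
    if (root / category).is_dir():
        return "split"
    # Neither processed nor raw data was found.
    return "download"
```

Checking the processed folder first means repeated calls are cheap once the data is in place.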
See also
../../datasets/image/visa - VisA Dataset