RealIAD Datamodule
Real-IAD Data Module.
This module provides a PyTorch Lightning DataModule for the Real-IAD dataset.
The Real-IAD dataset is a large-scale industrial anomaly detection dataset containing 30 categories of industrial objects with both normal and anomalous samples. Each object is captured from 5 different camera viewpoints (C1-C5).
- Dataset Structure:
The dataset follows this directory structure:

    Real-IAD/
    ├── realiad_256/                 # 256x256 resolution images
    ├── realiad_512/                 # 512x512 resolution images
    ├── realiad_1024/                # 1024x1024 resolution images
    └── realiad_jsons/               # JSON metadata files
        ├── realiad_jsons/           # Base metadata
        ├── realiad_jsons_sv/        # Single-view metadata
        └── realiad_jsons_fuiad/     # FUIAD metadata versions
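Before creating the datamodule, it can help to confirm that the extracted files match this layout. The following is a minimal sketch, not part of anomalib: `EXPECTED_DIRS` and `missing_entries` are hypothetical names introduced here, and the list only covers directories named explicitly in the structure above (the FUIAD folders are left out because their exact names vary by version).

```python
from __future__ import annotations

from pathlib import Path

# Sub-directories taken from the documented layout (hypothetical helper data,
# not an anomalib constant). FUIAD folders are omitted on purpose.
EXPECTED_DIRS = (
    "realiad_256",
    "realiad_512",
    "realiad_1024",
    "realiad_jsons/realiad_jsons",
    "realiad_jsons/realiad_jsons_sv",
)


def missing_entries(root: str | Path) -> list[str]:
    """Return the expected sub-directories that are absent under ``root``."""
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    for name in missing_entries("./datasets/Real-IAD"):
        print(f"missing: {name}")
```

A missing entry is not necessarily an error: if only one resolution was downloaded, the other image folders will be reported as absent.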
Example
Create a Real-IAD datamodule:
>>> from anomalib.data import RealIAD
>>> datamodule = RealIAD(
... root="./datasets/Real-IAD",
... category="audiojack",
... resolution="1024"
... )
Notes
The dataset should be downloaded manually from Hugging Face and placed in the
appropriate directory. See DOWNLOAD_INSTRUCTIONS for detailed steps.
- License:
Real-IAD dataset is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0). https://creativecommons.org/licenses/by-nc-sa/4.0/
- class anomalib.data.datamodules.image.realiad.RealIAD(root='./datasets/Real-IAD', category='audiojack', resolution=256, json_path='realiad_jsons/realiad_jsons/{category}.json', train_batch_size=32, eval_batch_size=32, num_workers=8, train_augmentations=None, val_augmentations=None, test_augmentations=None, augmentations=None, test_split_mode=TestSplitMode.NONE, val_split_mode=ValSplitMode.SAME_AS_TEST, seed=None)
Bases: AnomalibDataModule
Real-IAD Datamodule.
- Parameters:
root (Path | str) – Path to root directory containing the dataset. Defaults to "./datasets/Real-IAD".
category (str) – Category of the Real-IAD dataset (e.g. "audiojack" or "button_battery"). Defaults to "audiojack".
resolution (str | int) – Image resolution to use (e.g. "256", "512", "1024", "raw" or their integer equivalents). For example, both "256" and 256 are valid. Defaults to 256.
json_path (str | Path) – Path to JSON metadata file, relative to root directory. Can use the {category} placeholder, which is replaced with the category name. Common paths are:
    - "realiad_jsons/realiad_jsons/{category}.json" - Base metadata (multi-view)
    - "realiad_jsons/realiad_jsons_sv/{category}.json" - Single-view metadata
    - "realiad_jsons/realiad_jsons_fuiad_0.4/{category}.json" - FUIAD v0.4 metadata
train_batch_size (int, optional) – Training batch size. Defaults to 32.
eval_batch_size (int, optional) – Evaluation (validation and test) batch size. Defaults to 32.
num_workers (int, optional) – Number of workers. Defaults to 8.
train_augmentations (Transform | None) – Augmentations to apply to the training images. Defaults to None.
val_augmentations (Transform | None) – Augmentations to apply to the validation images. Defaults to None.
test_augmentations (Transform | None) – Augmentations to apply to the test images. Defaults to None.
augmentations (Transform | None) – General augmentations to apply if stage-specific augmentations are not provided. Defaults to None.
test_split_mode (TestSplitMode) – Method to create test set. Defaults to TestSplitMode.NONE.
val_split_mode (ValSplitMode) – Method to create validation set. Defaults to ValSplitMode.SAME_AS_TEST.
seed (int | None, optional) – Seed for reproducibility. Defaults to None.
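To illustrate how `root`, `category`, `resolution` and `json_path` fit together, here is a minimal sketch of resolving the metadata path and normalizing the resolution value. It is an assumption about the behavior described in the parameter list, not anomalib's actual implementation; `resolve_json_path` and `normalize_resolution` are hypothetical helper names.

```python
from __future__ import annotations

from pathlib import Path

# Values listed in the parameter description (assumed to be exhaustive here).
VALID_RESOLUTIONS = {"256", "512", "1024", "raw"}


def resolve_json_path(
    root: str | Path,
    category: str,
    json_path: str | Path = "realiad_jsons/realiad_jsons/{category}.json",
) -> Path:
    """Fill the ``{category}`` placeholder and join the result onto ``root``."""
    return Path(root) / str(json_path).format(category=category)


def normalize_resolution(resolution: str | int) -> str:
    """Accept ``256`` or ``"256"`` and return the string form."""
    value = str(resolution)
    if value not in VALID_RESOLUTIONS:
        msg = f"Unsupported resolution: {resolution!r}"
        raise ValueError(msg)
    return value


if __name__ == "__main__":
    print(resolve_json_path("./datasets/Real-IAD", "audiojack"))
    print(normalize_resolution(512))
```

A path without the placeholder, such as a custom metadata file, passes through `str.format` unchanged, so the same helper covers both cases.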
Example
Create Real-IAD datamodule with default settings:
>>> from anomalib.data import RealIAD
>>> datamodule = RealIAD()
>>> datamodule.setup()
>>> i, data = next(enumerate(datamodule.train_dataloader()))
>>> data.keys()
dict_keys(['image_path', 'label', 'image', 'mask_path', 'mask'])
>>> data["image"].shape
torch.Size([32, 3, 256, 256])
Change the category and resolution:
>>> # Using string resolution
>>> datamodule = RealIAD(
...     category="button_battery",
...     resolution="512"
... )
>>> # Using integer resolution
>>> datamodule = RealIAD(
...     category="button_battery",
...     resolution=1024
... )
Use different JSON metadata files:
>>> # Base metadata (multi-view)
>>> datamodule = RealIAD(
...     json_path="realiad_jsons/realiad_jsons/{category}.json"
... )
>>> # Single-view metadata
>>> datamodule = RealIAD(
...     json_path="realiad_jsons/realiad_jsons_sv/{category}.json"
... )
>>> # FUIAD v0.4 metadata (filtered subset)
>>> datamodule = RealIAD(
...     json_path="realiad_jsons/realiad_jsons_fuiad_0.4/{category}.json"
... )
>>> # Custom metadata
>>> datamodule = RealIAD(
...     json_path="path/to/custom/metadata.json"
... )
Create validation set from test data:
>>> from anomalib.data.utils import ValSplitMode
>>> datamodule = RealIAD(
...     val_split_mode=ValSplitMode.FROM_TEST
... )
Notes
- The dataset contains both normal (OK) and anomalous (NG) samples.
- Each object is captured from 5 different camera viewpoints (C1-C5).
- Images are available in multiple resolutions (256x256, 512x512, 1024x1024).
- JSON metadata files provide additional information and different dataset splits.
- Segmentation masks are provided for anomalous samples.
- prepare_data()
Verify that the dataset is available and provide download instructions.
This method checks if the dataset exists in the root directory. If not, it provides instructions for requesting access and downloading from Hugging Face.
The Real-IAD dataset is available at: https://huggingface.co/datasets/REAL-IAD/Real-IAD
- Return type:
None
Note
The dataset requires approval from the authors. You need to:
1. Create a Hugging Face account
2. Request access to the dataset
3. Wait for approval
4. Download and extract to the root directory
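The check described for this method can be sketched as follows. This is an illustrative approximation under stated assumptions, not the anomalib source: `build_instructions` and `ensure_dataset` are hypothetical names, and raising `FileNotFoundError` is a choice made for this sketch (the real method may report the instructions differently).

```python
from __future__ import annotations

from pathlib import Path

DATASET_URL = "https://huggingface.co/datasets/REAL-IAD/Real-IAD"


def build_instructions(root: str | Path) -> str:
    """Compose a message that lists the manual download steps."""
    return (
        f"Real-IAD was not found at '{root}'.\n"
        "1. Create a Hugging Face account\n"
        f"2. Request access to the dataset at {DATASET_URL}\n"
        "3. Wait for approval\n"
        f"4. Download and extract to '{root}'"
    )


def ensure_dataset(root: str | Path) -> None:
    """Raise with download instructions if the root directory is absent."""
    if not Path(root).is_dir():
        raise FileNotFoundError(build_instructions(root))


if __name__ == "__main__":
    try:
        ensure_dataset("./datasets/Real-IAD")
    except FileNotFoundError as error:
        print(error)
```

Failing early with the full set of steps keeps the access requirement visible at the point where a missing dataset would otherwise surface as a less helpful path error.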
- anomalib.data.datamodules.image.realiad.get_download_instructions(root_path)
Get download instructions for the Real-IAD dataset.
See also
../../datasets/image/realiad - Real-IAD Dataset