MPDD
MPDD Data Module.
This module provides a PyTorch Lightning DataModule for the MPDD dataset.
MPDD is a dataset aimed at benchmarking visual defect detection methods in industrial metal parts manufacturing. It contains 6 categories of industrial objects with both normal and anomalous samples. Each category includes RGB images and pixel-level ground truth masks for anomaly segmentation.
Example
Create an MPDD datamodule:
>>> from anomalib.data import MPDD
>>> datamodule = MPDD(
...     root="./datasets/MPDD",
...     category="bracket_black"
... )
Notes
The dataset should be downloaded manually from OneDrive and placed in the
appropriate directory. See DOWNLOAD_INSTRUCTIONS for detailed steps.
- License:
MPDD dataset is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
- Reference:
S. Jezek, M. Jonak, R. Burget, P. Dvorak and M. Skotak (2021). Deep learning-based defect detection of metal parts: evaluating current methods in complex conditions. 13th International Congress on Ultra Modern Telecommunications and Control Systems and Workshops (ICUMT), 2021, pp. 66-71, DOI: 10.1109/ICUMT54235.2021.9631567.
- class anomalib.data.datamodules.image.mpdd.MPDD(root='./datasets/MPDD', category='bracket_black', train_batch_size=32, eval_batch_size=32, num_workers=8, train_augmentations=None, val_augmentations=None, test_augmentations=None, augmentations=None, test_split_mode=TestSplitMode.FROM_DIR, test_split_ratio=0.2, val_split_mode=ValSplitMode.SAME_AS_TEST, val_split_ratio=0.5, seed=None)
Bases: AnomalibDataModule
MPDD Datamodule.
- Parameters:
  - root (Path | str | None) – Path to the root of the dataset. Defaults to "./datasets/MPDD".
  - category (str) – Category of the MPDD dataset (e.g. "bracket_black" or "bracket_brown"). Defaults to "bracket_black".
  - train_batch_size (int) – Training batch size. Defaults to 32.
  - eval_batch_size (int) – Test batch size. Defaults to 32.
  - num_workers (int) – Number of workers. Defaults to 8.
  - train_augmentations (Transform | None) – Augmentations to apply to the training images. Defaults to None.
  - val_augmentations (Transform | None) – Augmentations to apply to the validation images. Defaults to None.
  - test_augmentations (Transform | None) – Augmentations to apply to the test images. Defaults to None.
  - augmentations (Transform | None) – General augmentations to apply if stage-specific augmentations are not provided. Defaults to None.
  - test_split_mode (TestSplitMode | str) – Method to create test set. Defaults to TestSplitMode.FROM_DIR.
  - test_split_ratio (float) – Fraction of data to use for testing. Defaults to 0.2.
  - val_split_mode (ValSplitMode | str) – Method to create validation set. Defaults to ValSplitMode.SAME_AS_TEST.
  - val_split_ratio (float) – Fraction of data to use for validation. Defaults to 0.5.
  - seed (int | None) – Seed for reproducibility. Defaults to None.
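The relationship between `augmentations` and the three stage-specific arguments can be pictured with the minimal sketch below. The helper `resolve_augmentations` and the `Transform` alias are hypothetical and not part of anomalib. They only illustrate the rule described in the parameter list, under the assumption that a stage-specific value takes precedence whenever it is not `None`.

```python
from typing import Callable, Dict, Optional

# Stand-in type for this sketch only; the real datamodule expects a Transform.
Transform = Callable[[object], object]


def resolve_augmentations(
    train: Optional[Transform] = None,
    val: Optional[Transform] = None,
    test: Optional[Transform] = None,
    general: Optional[Transform] = None,
) -> Dict[str, Optional[Transform]]:
    """Hypothetical helper: use the stage-specific transform if given, else the general one."""
    return {
        "train": train if train is not None else general,
        "val": val if val is not None else general,
        "test": test if test is not None else general,
    }
```

With only `train` and `general` supplied, the training stage would keep its own transform while validation and test would both fall back to the general one.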
Example
Create MPDD datamodule with default settings:
>>> datamodule = MPDD()
>>> datamodule.setup()
>>> i, data = next(enumerate(datamodule.train_dataloader()))
>>> data.keys()
dict_keys(['image_path', 'label', 'image', 'mask_path', 'mask'])
>>> data["image"].shape
torch.Size([32, 3, 256, 256])
Change the category:
>>> datamodule = MPDD(category="bracket_brown")
Create validation set from test data:
>>> from anomalib.data.utils import ValSplitMode
>>> datamodule = MPDD(
...     val_split_mode=ValSplitMode.FROM_TEST,
...     val_split_ratio=0.1
... )
Create synthetic validation set:
>>> datamodule = MPDD(
...     val_split_mode=ValSplitMode.SYNTHETIC,
...     val_split_ratio=0.2
... )
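To give a feel for what a ratio-based split such as `val_split_ratio=0.1` combined with a fixed `seed` amounts to, here is a small standalone sketch. The function `split_by_ratio` is hypothetical and is not anomalib's implementation. It assumes the split is a seeded random partition of a list of samples, which may differ from what the library does internally.

```python
import random
from typing import List, Optional, Sequence, Tuple


def split_by_ratio(
    items: Sequence[str], ratio: float, seed: Optional[int] = None
) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: move a `ratio` fraction of `items` into a held-out list."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    shuffled = list(items)
    # A dedicated Random instance keeps the shuffle repeatable for a given seed
    # without touching the global random state.
    random.Random(seed).shuffle(shuffled)
    n_held_out = int(round(len(shuffled) * ratio))
    return shuffled[n_held_out:], shuffled[:n_held_out]
```

Calling the function twice with the same seed yields the same held-out subset, which is the practical effect one would expect from passing `seed` to the datamodule.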
- prepare_data()
Verify that the dataset is available and provide download instructions.
This method checks if the dataset exists in the root directory. If not, it provides instructions for downloading from OneDrive.
The MPDD dataset is available at: https://vutbr-my.sharepoint.com/:f:/g/personal/xjezek16_vutbr_cz/EhHS_ufVigxDo3MC6Lweau0BVMuoCmhMZj6ddamiQ7-FnA?e=oHKCxI
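Because the download is manual, it can be useful to check the directory before constructing the datamodule. The sketch below is one way to do that with the standard library only. The function `describe_dataset_status` is hypothetical, and the assumed layout (one sub-directory per category directly under `root`) is an assumption rather than something stated on this page.

```python
from pathlib import Path
from typing import Union


def describe_dataset_status(root: Union[str, Path], category: str) -> str:
    """Hypothetical helper: report whether a category folder exists under root."""
    # Assumed layout: <root>/<category>, e.g. ./datasets/MPDD/bracket_black
    category_dir = Path(root) / category
    if category_dir.is_dir():
        return f"Found MPDD category at {category_dir}"
    return (
        f"MPDD category not found at {category_dir}. "
        f"Download the dataset manually and extract it under {root}."
    )
```

For example, `describe_dataset_status("./datasets/MPDD", "bracket_black")` returns a message that either confirms the folder or points back to the manual download step.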
- Return type: