Folder Datamodule

Custom Folder Data Module.

This script creates a custom Lightning DataModule from a folder containing normal and abnormal images.

Example

Create a folder datamodule:

>>> from anomalib.data import Folder
>>> datamodule = Folder(
...     name="custom_folder",
...     root="./datasets/custom",
...     normal_dir="good",
...     abnormal_dir="defect"
... )

Notes

The directory structure should be organized as follows:

root/
├── normal_dir/
│   ├── image1.png
│   └── image2.png
├── abnormal_dir/
│   ├── image3.png
│   └── image4.png
└── mask_dir/
    ├── mask3.png
    └── mask4.png
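
To try the layout above without real data, the following minimal sketch builds the same tree in a temporary directory using only the standard library. The helper name `make_folder_layout` is hypothetical and not part of anomalib, and the files it creates are empty placeholders rather than valid images, so they only illustrate the directory structure.

```python
import tempfile
from pathlib import Path


def make_folder_layout(root: Path) -> dict[str, list[Path]]:
    """Create the example directory tree with empty placeholder files.

    Hypothetical helper for illustration only; the files are empty and
    would need to be replaced by real images and masks before training.
    """
    layout = {
        "normal_dir": ["image1.png", "image2.png"],
        "abnormal_dir": ["image3.png", "image4.png"],
        "mask_dir": ["mask3.png", "mask4.png"],
    }
    created: dict[str, list[Path]] = {}
    for folder, names in layout.items():
        directory = root / folder
        directory.mkdir(parents=True, exist_ok=True)
        created[folder] = []
        for name in names:
            path = directory / name
            path.touch()  # empty placeholder, not a decodable image
            created[folder].append(path)
    return created


with tempfile.TemporaryDirectory() as tmp:
    created = make_folder_layout(Path(tmp))
    # Map each folder name to the sorted file names it contains.
    summary = {
        folder: sorted(p.name for p in paths)
        for folder, paths in created.items()
    }
```

The folder names passed as `normal_dir`, `abnormal_dir`, and `mask_dir` would then be given relative to `root` when constructing the datamodule.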

class anomalib.data.datamodules.image.folder.Folder(name, normal_dir, root=None, abnormal_dir=None, normal_test_dir=None, mask_dir=None, normal_split_ratio=0.2, extensions=None, train_batch_size=32, eval_batch_size=32, num_workers=8, train_augmentations=None, val_augmentations=None, test_augmentations=None, augmentations=None, test_split_mode=TestSplitMode.FROM_DIR, test_split_ratio=0.2, val_split_mode=ValSplitMode.FROM_TEST, val_split_ratio=0.5, seed=None)

Bases: AnomalibDataModule

Folder DataModule.

Parameters:
  • name (str) – Name of the dataset. Used for logging/saving.

  • normal_dir (str | Path | Sequence) – Directory containing normal images.

  • root (str | Path | None) – Root folder containing normal and abnormal directories. Defaults to None.

  • abnormal_dir (str | Path | None | Sequence) – Directory containing abnormal images. Defaults to None.

  • normal_test_dir (str | Path | Sequence | None) – Directory containing normal test images. Defaults to None.

  • mask_dir (str | Path | Sequence | None) – Directory containing mask annotations. Defaults to None.

  • normal_split_ratio (float) – Fraction of normal training images to move to the test set when no normal test images are provided. Defaults to 0.2.

  • extensions (tuple[str, ...] | None) – Image extensions to include. Defaults to None.

  • train_batch_size (int) – Training batch size. Defaults to 32.

  • eval_batch_size (int) – Validation/test batch size. Defaults to 32.

  • num_workers (int) – Number of workers for data loading. Defaults to 8.

  • train_augmentations (Transform | None) – Augmentations to apply to the training images. Defaults to None.

  • val_augmentations (Transform | None) – Augmentations to apply to the validation images. Defaults to None.

  • test_augmentations (Transform | None) – Augmentations to apply to the test images. Defaults to None.

  • augmentations (Transform | None) – General augmentations to apply if stage-specific augmentations are not provided. Defaults to None.

  • test_split_mode (TestSplitMode) – Method to obtain test subset. Defaults to TestSplitMode.FROM_DIR.

  • test_split_ratio (float) – Fraction of training images to hold out for testing. Defaults to 0.2.

  • val_split_mode (ValSplitMode) – Method to obtain validation subset. Defaults to ValSplitMode.FROM_TEST.

  • val_split_ratio (float) – Fraction of images for validation. Defaults to 0.5.

  • seed (int | None) – Random seed for splitting. Defaults to None.
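
The ratio and seed parameters above can be illustrated with a small worked example. This is a sketch of a generic seeded random split, not anomalib's actual splitting code, which may differ (for example by handling labels separately). The function `split_by_ratio` and the file names are hypothetical.

```python
import random


def split_by_ratio(items, ratio, seed=None):
    """Randomly move a `ratio` fraction of `items` into a second subset.

    Hypothetical helper for illustration; returns (kept, moved).
    """
    rng = random.Random(seed)  # a fixed seed makes the split reproducible
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_moved = int(len(shuffled) * ratio)
    return shuffled[n_moved:], shuffled[:n_moved]


# 100 normal images and no normal test directory:
# a ratio of 0.2 (like normal_split_ratio) moves 20 of them to the test pool.
normal_images = [f"good/image{i}.png" for i in range(100)]
train, normal_test = split_by_ratio(normal_images, ratio=0.2, seed=42)

# The test pool also contains the abnormal images (20 assumed here).
abnormal_images = [f"defect/image{i}.png" for i in range(20)]
test_pool = normal_test + abnormal_images

# A ratio of 0.5 (like val_split_ratio) moves half of the test pool
# into the validation subset.
test, val = split_by_ratio(test_pool, ratio=0.5, seed=42)

print(len(train), len(test), len(val))
```

With these assumed counts, 80 images remain for training and the 40-image test pool is divided evenly between test and validation. Passing the same seed again reproduces the same subsets.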

Example

Create and setup a folder datamodule:

>>> from anomalib.data import Folder
>>> datamodule = Folder(
...     name="custom",
...     root="./datasets/custom",
...     normal_dir="good",
...     abnormal_dir="defect",
...     mask_dir="mask"
... )
>>> datamodule.setup()

Get a batch from train dataloader:

>>> batch = next(iter(datamodule.train_dataloader()))
>>> batch.keys()
dict_keys(['image', 'label', 'mask', 'image_path', 'mask_path'])

Get a batch from test dataloader:

>>> batch = next(iter(datamodule.test_dataloader()))
>>> batch.keys()
dict_keys(['image', 'label', 'mask', 'image_path', 'mask_path'])

property name: str

Get name of the datamodule.

Returns:

Name of the datamodule.