Image Base Datamodule

Base Anomalib data module.

This module provides the base data module class used across Anomalib. It handles dataset splitting, validation set creation, and dataloader configuration.

The module contains:

AnomalibDataModule: Base class for all Anomalib data modules

Example

Create a datamodule from a config file:

>>> from anomalib.data import AnomalibDataModule
>>> data_config = "examples/configs/data/mvtec.yaml"
>>> datamodule = AnomalibDataModule.from_config(config_path=data_config)

Override config with additional arguments:

>>> override_kwargs = {"data.train_batch_size": 8}
>>> datamodule = AnomalibDataModule.from_config(
...     config_path=data_config,
...     **override_kwargs
... )
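
To illustrate how a dotted key such as `data.train_batch_size` can address a nested config value, here is a minimal sketch in plain Python. `apply_overrides` is a hypothetical helper written for this illustration, and the nested dict layout is an assumption. It is not Anomalib's implementation of `from_config`.

```python
from copy import deepcopy
from typing import Any


def apply_overrides(config: dict[str, Any], overrides: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of ``config`` with dotted-key overrides applied.

    Hypothetical helper for illustration only; not part of Anomalib.
    """
    merged = deepcopy(config)
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = merged
        for part in parents:
            # Create intermediate sections on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return merged


# Assumed config layout, loosely modeled on a datamodule YAML file.
base_config = {"data": {"train_batch_size": 32, "eval_batch_size": 32}}
updated = apply_overrides(base_config, {"data.train_batch_size": 8})
```

The helper copies the config before modifying it, so the values loaded from disk stay untouched and only the returned dict carries the override.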

class anomalib.data.datamodules.base.image.AnomalibDataModule(train_batch_size, eval_batch_size, num_workers, train_augmentations=None, val_augmentations=None, test_augmentations=None, augmentations=None, val_split_mode=None, val_split_ratio=None, test_split_mode=None, test_split_ratio=None, seed=None)

Bases: LightningDataModule, ABC

Base Anomalib data module.

This class extends PyTorch Lightning’s LightningDataModule to provide common functionality for anomaly detection datasets.

Parameters:

train_batch_size (int) – Batch size used by the train dataloader.
eval_batch_size (int) – Batch size used by the val and test dataloaders.
num_workers (int) – Number of workers used by the train, val and test dataloaders.
train_augmentations (Transform | None) – Augmentations to apply to the training images. Defaults to None.
val_augmentations (Transform | None) – Augmentations to apply to the validation images. Defaults to None.
test_augmentations (Transform | None) – Augmentations to apply to the test images. Defaults to None.
augmentations (Transform | None) – General augmentations to apply if stage-specific augmentations are not provided. Defaults to None.
val_split_mode (ValSplitMode | str | None) –
Method to obtain validation set. Options:
- none: No validation set
- same_as_test: Use the test set as the validation set
- from_test: Sample from test set
- synthetic: Generate synthetic anomalies
Defaults to None.
val_split_ratio (float | None) – Fraction of data to use for validation. Defaults to None.
test_split_mode (TestSplitMode | str | None) –
Method to obtain test set. Options:
- none: No test split
- from_dir: Use separate test directory
- synthetic: Generate synthetic anomalies
Defaults to None.
test_split_ratio (float | None) – Fraction of data to use for testing. Defaults to None.
seed (int | None) – Random seed for reproducible splitting. Defaults to None.
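
The relationship between `augmentations` and the stage-specific arguments can be pictured as a simple fallback rule. The sketch below assumes that a stage-specific value takes precedence whenever it is not None. `resolve_augmentations` is a hypothetical helper, and plain callables stand in for `Transform` objects; the library's own resolution logic may differ.

```python
from typing import Callable, Optional

# Stand-in for a Transform: any callable that maps an image to an image.
Augmentation = Callable[[str], str]


def resolve_augmentations(
    train: Optional[Augmentation] = None,
    val: Optional[Augmentation] = None,
    test: Optional[Augmentation] = None,
    general: Optional[Augmentation] = None,
) -> dict[str, Optional[Augmentation]]:
    """Pick the stage-specific augmentation, else fall back to ``general``.

    Hypothetical helper for illustration only; not part of Anomalib.
    """
    return {
        "train": train if train is not None else general,
        "val": val if val is not None else general,
        "test": test if test is not None else general,
    }


def flip(image: str) -> str:
    return f"flip({image})"


def blur(image: str) -> str:
    return f"blur({image})"


# Only the training stage gets its own augmentation; the other
# stages fall back to the general one.
resolved = resolve_augmentations(train=flip, general=blur)
```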

property category: str

Get dataset category name.

Returns: Name of the current category
Return type: str

classmethod from_config(config_path, **kwargs)

Create datamodule instance from config file.

Parameters:

config_path (str | Path) – Path to config file
**kwargs – Additional args to override config

Returns:

Instantiated datamodule

Return type:

AnomalibDataModule

Raises:

FileNotFoundError – If the config file is not found
ValueError – If the instantiated object is not an AnomalibDataModule

Example

Load from config file:

>>> config_path = "examples/configs/data/mvtec.yaml"
>>> datamodule = AnomalibDataModule.from_config(config_path)

Override config values:

>>> datamodule = AnomalibDataModule.from_config(
...     config_path,
...     data_train_batch_size=8
... )

property name: str

Name of the datamodule.

Returns: Class name of the datamodule
Return type: str

predict_dataloader()

Get prediction dataloader.

By default uses the test dataloader.

Returns: Prediction dataloader
Return type: Any
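
The default described above amounts to simple delegation, which a subclass can replace. The following sketch uses stand-in classes that do not inherit from `LightningDataModule`, with plain lists in place of dataloaders. All class and attribute names are hypothetical.

```python
class PredictSketchModule:
    """Stand-in datamodule; a plain list plays the role of a dataloader."""

    def __init__(self, test_samples: list[str]) -> None:
        self.test_samples = test_samples

    def test_dataloader(self) -> list[str]:
        return self.test_samples

    def predict_dataloader(self) -> list[str]:
        # Default behaviour: reuse whatever the test dataloader returns.
        return self.test_dataloader()


class CustomPredictSketchModule(PredictSketchModule):
    """Subclass that supplies dedicated prediction data instead."""

    def __init__(self, test_samples: list[str], predict_samples: list[str]) -> None:
        super().__init__(test_samples)
        self.predict_samples = predict_samples

    def predict_dataloader(self) -> list[str]:
        return self.predict_samples


default_module = PredictSketchModule(["test_0.png", "test_1.png"])
custom_module = CustomPredictSketchModule(["test_0.png"], ["new_0.png"])
```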

setup(stage=None)

Set up train, validation and test data.

This method handles the data splitting logic based on the configured val_split_mode and test_split_mode.

Parameters: stage (str | None) – Current stage (fit/validate/test/predict). Defaults to None.
Return type: None
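
As a rough picture of ratio-based splitting with a seed, such as sampling a validation set from the test set, consider the sketch below. It is a simplified illustration that works on a plain list of file names. `split_by_ratio` is a hypothetical helper, and the library's own splitting works on dataset objects and may differ, for example by taking labels into account.

```python
import random
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def split_by_ratio(
    samples: Sequence[T],
    split_ratio: float,
    seed: Optional[int] = None,
) -> tuple[list[T], list[T]]:
    """Randomly move ``split_ratio`` of ``samples`` into a second subset.

    Hypothetical helper for illustration only; not part of Anomalib.
    Returns ``(remainder, split)``.
    """
    if not 0.0 <= split_ratio <= 1.0:
        msg = f"split_ratio must be in [0, 1], got {split_ratio}"
        raise ValueError(msg)
    indices = list(range(len(samples)))
    # A dedicated generator keeps the split reproducible for a given seed
    # without touching the global random state.
    random.Random(seed).shuffle(indices)
    n_split = int(len(samples) * split_ratio)
    split = [samples[i] for i in indices[:n_split]]
    remainder = [samples[i] for i in indices[n_split:]]
    return remainder, split


test_samples = [f"img_{i:03d}.png" for i in range(10)]
test_set, val_set = split_by_ratio(test_samples, split_ratio=0.5, seed=42)
```

Calling the helper again with the same seed yields the same two subsets, which is the property the `seed` parameter is meant to provide.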

property task: TaskType

Get the task type.

Returns: Type of anomaly task (classification/segmentation)
Return type: TaskType
Raises: AttributeError – If no datasets have been set up yet
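
Because `task` depends on the datasets created during setup, reading it too early fails. The sketch below illustrates that pattern with stand-in classes. `SketchTaskType`, `SketchDataset`, `TaskSketchModule`, and the attribute names `train_data`, `val_data`, and `test_data` are assumptions made for this example, not confirmed details of the library.

```python
from dataclasses import dataclass
from enum import Enum


class SketchTaskType(str, Enum):
    """Stand-in for TaskType with the two values named above."""

    CLASSIFICATION = "classification"
    SEGMENTATION = "segmentation"


@dataclass
class SketchDataset:
    task: SketchTaskType


class TaskSketchModule:
    """Stand-in datamodule whose task is derived from its datasets."""

    def setup(self) -> None:
        # Only a test dataset is created in this sketch.
        self.test_data = SketchDataset(task=SketchTaskType.SEGMENTATION)

    @property
    def task(self) -> SketchTaskType:
        # Return the task of the first dataset that exists.
        for attr in ("train_data", "val_data", "test_data"):
            dataset = getattr(self, attr, None)
            if dataset is not None:
                return dataset.task
        msg = "No datasets available. Call setup() first."
        raise AttributeError(msg)


datamodule = TaskSketchModule()
try:
    datamodule.task
    task_before_setup = "available"
except AttributeError:
    task_before_setup = "unavailable"

datamodule.setup()
task_after_setup = datamodule.task
```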

test_dataloader()

Get test dataloader.

Returns: Test dataloader
Return type: Any

train_dataloader()

Get training dataloader.

Returns: Training dataloader
Return type: Any

val_dataloader()

Get validation dataloader.

Returns: Validation dataloader
Return type: Any
