Image Base Datamodule
Base Anomalib data module.
This module provides the base data module class used across Anomalib. It handles dataset splitting, validation set creation, and dataloader configuration.
The module contains:
- AnomalibDataModule: Base class for all Anomalib data modules
Example
Create a datamodule from a config file:
>>> from anomalib.data import AnomalibDataModule
>>> data_config = "examples/configs/data/mvtec.yaml"
>>> datamodule = AnomalibDataModule.from_config(config_path=data_config)
Override config with additional arguments:
>>> override_kwargs = {"data.train_batch_size": 8}
>>> datamodule = AnomalibDataModule.from_config(
...     config_path=data_config,
...     **override_kwargs
... )
- class anomalib.data.datamodules.base.image.AnomalibDataModule(train_batch_size, eval_batch_size, num_workers, train_augmentations=None, val_augmentations=None, test_augmentations=None, augmentations=None, val_split_mode=None, val_split_ratio=None, test_split_mode=None, test_split_ratio=None, seed=None)
Bases: LightningDataModule, ABC
Base Anomalib data module.
This class extends PyTorch Lightning’s LightningDataModule to provide common functionality for anomaly detection datasets.
- Parameters:
train_batch_size (int) – Batch size used by the train dataloader.
eval_batch_size (int) – Batch size used by the val and test dataloaders.
num_workers (int) – Number of workers used by the train, val and test dataloaders.
train_augmentations (Transform | None) – Augmentations to apply to the training images. Defaults to None.
val_augmentations (Transform | None) – Augmentations to apply to the validation images. Defaults to None.
test_augmentations (Transform | None) – Augmentations to apply to the test images. Defaults to None.
augmentations (Transform | None) – General augmentations to apply if stage-specific augmentations are not provided.
val_split_mode (ValSplitMode | str | None) – Method to obtain validation set. Options:
  - none: No validation set
  - same_as_test: Use test set as validation
  - from_test: Sample from test set
  - synthetic: Generate synthetic anomalies
val_split_ratio (float | None) – Fraction of data to use for validation.
test_split_mode (TestSplitMode | str | None) – Method to obtain test set. Options:
  - none: No test split
  - from_dir: Use separate test directory
  - synthetic: Generate synthetic anomalies
  Defaults to None.
test_split_ratio (float | None) – Fraction of data to use for testing. Defaults to None.
seed (int | None) – Random seed for reproducible splitting. Defaults to None.
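The fallback described for `augmentations` can be pictured with a small sketch. The helper `resolve_augmentations` below is hypothetical and not part of Anomalib; it only assumes that a stage-specific value, when given, takes precedence over the general one. Plain strings stand in for `Transform` objects so the snippet runs without any dependencies.

```python
from typing import Any, Dict, Optional


def resolve_augmentations(
    train: Optional[Any] = None,
    val: Optional[Any] = None,
    test: Optional[Any] = None,
    general: Optional[Any] = None,
) -> Dict[str, Optional[Any]]:
    """Pick the augmentation for each stage (hypothetical helper, not Anomalib code)."""
    return {
        # A stage-specific value wins; otherwise fall back to the general one.
        "train": train if train is not None else general,
        "val": val if val is not None else general,
        "test": test if test is not None else general,
    }


# Only the training stage has its own augmentation; val and test share the general one.
resolved = resolve_augmentations(train="train_aug", general="shared_aug")
print(resolved)
```

If neither a stage-specific nor a general value is supplied, the sketch leaves that stage as `None`, mirroring the documented defaults.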
- property category: str
Get dataset category name.
- Returns:
Name of the current category
- Return type:
str
- classmethod from_config(config_path, **kwargs)
Create datamodule instance from config file.
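- Parameters:
config_path – Path to the data config file.
**kwargs – Additional keyword arguments that override values from the config.
- Returns:
Instantiated datamodule
- Return type:
AnomalibDataModule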
- Raises:
FileNotFoundError – If config file not found
ValueError – If instantiated object is not AnomalibDataModule
Example
Load from config file:
>>> config_path = "examples/configs/data/mvtec.yaml"
>>> datamodule = AnomalibDataModule.from_config(config_path)
Override config values:
>>> datamodule = AnomalibDataModule.from_config(
...     config_path,
...     data_train_batch_size=8
... )
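Since `from_config` is documented to raise `FileNotFoundError` and `ValueError`, callers may want to handle both. The wrapper below is a minimal sketch; `load_or_none` is a hypothetical name, and it accepts any factory callable so that the snippet itself does not need Anomalib installed.

```python
from typing import Any, Callable, Optional


def load_or_none(factory: Callable[..., Any], config_path: str, **overrides: Any) -> Optional[Any]:
    """Call a from_config-style factory, returning None on the documented errors."""
    try:
        return factory(config_path, **overrides)
    except FileNotFoundError:
        # Documented case: the config file was not found.
        print(f"Config file not found: {config_path}")
    except ValueError as err:
        # Documented case: the instantiated object is not an AnomalibDataModule.
        print(f"Config did not produce a datamodule: {err}")
    return None


# Intended use (requires Anomalib to be installed):
#   from anomalib.data import AnomalibDataModule
#   datamodule = load_or_none(
#       AnomalibDataModule.from_config, "examples/configs/data/mvtec.yaml"
#   )
```

Returning `None` is only one possible design; re-raising with a clearer message may suit pipelines that should fail fast.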
- predict_dataloader()
Get prediction dataloader.
By default uses the test dataloader.
- Returns:
Prediction dataloader
- Return type:
- setup(stage=None)
Set up train, validation and test data.
This method handles the data splitting logic based on the configured modes.
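To illustrate the kind of splitting that the `from_test` validation mode, `val_split_ratio` and `seed` describe, here is a dependency-free sketch. The function `sample_validation_from_test` is hypothetical and is not Anomalib's implementation; it assumes that sampled items are moved out of the test set and that a fixed seed yields the same split on every run.

```python
import random
from typing import List, Optional, Sequence, Tuple


def sample_validation_from_test(
    test_items: Sequence[str],
    val_split_ratio: float,
    seed: Optional[int] = None,
) -> Tuple[List[str], List[str]]:
    """Split test items into (remaining_test, validation). Hypothetical helper."""
    if not 0.0 <= val_split_ratio <= 1.0:
        raise ValueError("val_split_ratio must be between 0 and 1")
    indices = list(range(len(test_items)))
    # A seeded generator makes the shuffle, and therefore the split, repeatable.
    random.Random(seed).shuffle(indices)
    n_val = int(len(test_items) * val_split_ratio)
    val_idx = set(indices[:n_val])
    # Preserve the original ordering inside each subset.
    validation = [item for i, item in enumerate(test_items) if i in val_idx]
    remaining = [item for i, item in enumerate(test_items) if i not in val_idx]
    return remaining, validation


items = [f"img_{i:03d}.png" for i in range(10)]
test_a, val_a = sample_validation_from_test(items, val_split_ratio=0.5, seed=42)
test_b, val_b = sample_validation_from_test(items, val_split_ratio=0.5, seed=42)
print(len(test_a), len(val_a), val_a == val_b)
```

Calling the function twice with the same seed returns identical subsets, which is the property the `seed` parameter is meant to provide.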
- property task: TaskType
Get the task type.
- Returns:
Type of anomaly task (classification/segmentation)
- Return type:
TaskType
- Raises:
AttributeError – If no datasets have been set up yet
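Because `task` raises `AttributeError` before any dataset has been set up, code that reads it early may want a guard. The sketch below uses a stand-in class, `_StubDataModule`, which is purely illustrative and not an Anomalib class; it mimics only the documented behaviour of `task` and `setup`, and the string it returns is a placeholder for a `TaskType` value.

```python
from typing import Any, Optional


class _StubDataModule:
    """Stand-in used only for this sketch; not an Anomalib class."""

    def __init__(self) -> None:
        self._datasets_ready = False

    def setup(self, stage: Optional[str] = None) -> None:
        # The real setup() creates the datasets; here we only flip a flag.
        self._datasets_ready = True

    @property
    def task(self) -> str:
        if not self._datasets_ready:
            raise AttributeError("no datasets have been set up yet")
        return "segmentation"  # placeholder for a TaskType value


def safe_task(datamodule: Any) -> Optional[Any]:
    """Return datamodule.task, or None if it is not available yet."""
    try:
        return datamodule.task
    except AttributeError:
        return None


stub = _StubDataModule()
before = safe_task(stub)  # datasets not set up yet
stub.setup()
after = safe_task(stub)   # available once setup() has run
print(before, after)
```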