Video Datamodules
Video datamodules in Anomalib are designed to handle video-based anomaly detection datasets. They provide a standardized interface for loading and processing video data for both training and inference.
Available Datamodules
CUHK Avenue dataset for video anomaly detection.
ShanghaiTech dataset for video anomaly detection.
UCSD Pedestrian dataset for video anomaly detection.
API Reference
Anomalib Datasets.
This module provides datasets and data modules for anomaly detection tasks.
The module contains:
- Data classes for representing different types of data (images, videos, etc.)
- Dataset classes for loading and processing data
- Data modules for use with PyTorch Lightning
- Helper functions for data loading and validation
Example
>>> from anomalib.data import MVTecAD
>>> datamodule = MVTecAD(
...     root="./datasets/MVTecAD",
...     category="bottle",
...     image_size=(256, 256)
... )
- class anomalib.data.Avenue(root='./datasets/avenue', gt_dir='./datasets/avenue/ground_truth_demo', clip_length_in_frames=2, frames_between_clips=1, target_frame=VideoTargetFrame.LAST, train_batch_size=32, eval_batch_size=32, num_workers=8, train_augmentations=None, val_augmentations=None, test_augmentations=None, augmentations=None, val_split_mode=ValSplitMode.SAME_AS_TEST, val_split_ratio=0.5, seed=None)
Bases: AnomalibVideoDataModule
Avenue DataModule class.
- Parameters:
root (Path | str) – Path to the root of the dataset. Defaults to "./datasets/avenue".
gt_dir (Path | str) – Path to the ground truth files. Defaults to "./datasets/avenue/ground_truth_demo".
clip_length_in_frames (int) – Number of video frames in each clip. Defaults to 2.
frames_between_clips (int) – Number of frames between consecutive clips. Defaults to 1.
target_frame (VideoTargetFrame | str) – Target frame in clip for ground truth. Defaults to VideoTargetFrame.LAST.
train_batch_size (int) – Training batch size. Defaults to 32.
eval_batch_size (int) – Test batch size. Defaults to 32.
num_workers (int) – Number of workers. Defaults to 8.
train_augmentations (Transform | None) – Augmentations to apply to the training images. Defaults to None.
val_augmentations (Transform | None) – Augmentations to apply to the validation images. Defaults to None.
test_augmentations (Transform | None) – Augmentations to apply to the test images. Defaults to None.
augmentations (Transform | None) – General augmentations to apply if stage-specific augmentations are not provided.
val_split_mode (ValSplitMode | str) – How validation subset is obtained. Defaults to ValSplitMode.SAME_AS_TEST.
val_split_ratio (float) – Fraction of data reserved for validation. Defaults to 0.5.
seed (int | None) – Seed for reproducibility. Defaults to None.
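To make the clip parameters above concrete, here is a minimal pure-Python sketch of how a video might be cut into clips. It assumes that frames_between_clips acts as the stride between the start frames of consecutive clips; that interpretation and the helper name clip_frame_indices are illustrative assumptions, not part of the anomalib API.

```python
def clip_frame_indices(num_frames, clip_length_in_frames=2, frames_between_clips=1):
    """Return one list of frame indices per clip (hypothetical helper).

    Assumption: frames_between_clips is the stride between clip start frames.
    """
    if clip_length_in_frames < 1 or frames_between_clips < 1:
        raise ValueError("clip length and stride must be positive")
    # Last start index that still leaves room for a full clip.
    last_start = num_frames - clip_length_in_frames
    return [
        list(range(start, start + clip_length_in_frames))
        for start in range(0, last_start + 1, frames_between_clips)
    ]


# A 6-frame video with the Avenue defaults (clip length 2, stride 1).
clips = clip_frame_indices(num_frames=6)

# With target_frame=VideoTargetFrame.LAST, the ground truth would be taken
# from the final frame of each clip.
last_targets = [clip[-1] for clip in clips]
```

Under the same assumption, a larger frames_between_clips value (such as the UCSDped default of 10) yields fewer, non-overlapping clips from the same video.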
Example
Create a dataloader for classification:
>>> datamodule = Avenue(
...     clip_length_in_frames=2,
...     frames_between_clips=1,
...     target_frame=VideoTargetFrame.LAST
... )
>>> datamodule.setup()
>>> i, data = next(enumerate(datamodule.train_dataloader()))
>>> data["image"].shape
torch.Size([32, 2, 3, 256, 256])
Notes
The dataloader returns batches of clips, where each clip contains clip_length_in_frames consecutive frames.
frames_between_clips determines frame spacing between clips.
target_frame specifies which frame provides ground truth.
- prepare_data()
Download the dataset if not available.
This method checks if the specified dataset is available in the file system. If not, it downloads and extracts the dataset into the appropriate directory.
- Return type: None
Example
Assume the dataset is not available on the file system:
>>> datamodule = Avenue()
>>> datamodule.prepare_data()
The directory structure after preparation will be:
datasets/
└── avenue/
    ├── ground_truth_demo/
    │   ├── ground_truth_show.m
    │   ├── Readme.txt
    │   ├── testing_label_mask/
    │   └── testing_videos/
    ├── testing_videos/
    │   ├── ...
    │   └── 21.avi
    ├── testing_vol/
    │   ├── ...
    │   └── vol21.mat
    ├── training_videos/
    │   ├── ...
    │   └── 16.avi
    └── training_vol/
        ├── ...
        └── vol16.mat
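A quick way to confirm that preparation produced the layout above is to check for the expected top-level folders. The sketch below uses only pathlib; the helper name missing_avenue_dirs is hypothetical and is not provided by anomalib.

```python
from pathlib import Path

# Top-level folders listed in the directory structure above.
EXPECTED_AVENUE_DIRS = (
    "ground_truth_demo",
    "testing_videos",
    "testing_vol",
    "training_videos",
    "training_vol",
)


def missing_avenue_dirs(root):
    """Return the expected sub-directories that are absent under root."""
    root = Path(root)
    return [name for name in EXPECTED_AVENUE_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    # An empty list means every expected folder is present.
    print(missing_avenue_dirs("./datasets/avenue"))
```

The check is deliberately shallow: it does not verify the video or annotation files inside each folder.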
- class anomalib.data.ShanghaiTech(root='./datasets/shanghaitech', scene=1, clip_length_in_frames=2, frames_between_clips=1, target_frame=VideoTargetFrame.LAST, train_batch_size=32, eval_batch_size=32, num_workers=8, train_augmentations=None, val_augmentations=None, test_augmentations=None, augmentations=None, val_split_mode=ValSplitMode.SAME_AS_TEST, val_split_ratio=0.5, seed=None)
Bases: AnomalibVideoDataModule
ShanghaiTech DataModule class.
- Parameters:
root (Path | str) – Path to the root directory of the dataset. Defaults to "./datasets/shanghaitech".
scene (int) – Scene index in range [1, 13]. Defaults to 1.
clip_length_in_frames (int) – Number of frames in each video clip. Defaults to 2.
frames_between_clips (int) – Number of frames between consecutive clips. Defaults to 1.
target_frame (VideoTargetFrame) – Specifies which frame in the clip should be used for ground truth. Defaults to VideoTargetFrame.LAST.
train_batch_size (int) – Training batch size. Defaults to 32.
eval_batch_size (int) – Test batch size. Defaults to 32.
num_workers (int) – Number of workers for data loading. Defaults to 8.
train_augmentations (Transform | None) – Augmentations to apply to the training images. Defaults to None.
val_augmentations (Transform | None) – Augmentations to apply to the validation images. Defaults to None.
test_augmentations (Transform | None) – Augmentations to apply to the test images. Defaults to None.
augmentations (Transform | None) – General augmentations to apply if stage-specific augmentations are not provided.
val_split_mode (ValSplitMode) – Setting that determines how validation subset is obtained. Defaults to ValSplitMode.SAME_AS_TEST.
val_split_ratio (float) – Fraction of train or test images that will be reserved for validation. Defaults to 0.5.
seed (int | None) – Random seed for reproducibility. Defaults to None.
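The interaction between val_split_ratio and seed can be illustrated with a small, self-contained sketch. This is not anomalib's splitting code; it assumes a split mode that carves a validation subset out of an existing list of samples, and the helper name split_for_validation is hypothetical.

```python
import random


def split_for_validation(samples, val_split_ratio=0.5, seed=None):
    """Split samples into (remaining, validation) lists (hypothetical helper)."""
    if not 0.0 <= val_split_ratio <= 1.0:
        raise ValueError("val_split_ratio must be between 0 and 1")
    # A dedicated generator keeps the split reproducible for a given seed
    # without touching the global random state.
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    num_val = int(len(shuffled) * val_split_ratio)
    return shuffled[num_val:], shuffled[:num_val]


# Ten stand-in sample ids, half of them reserved for validation.
remaining, validation = split_for_validation(range(10), val_split_ratio=0.5, seed=0)
```

Passing the same seed reproduces the same split, while seed=None gives a different split on each run.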
- class anomalib.data.UCSDped(root='./datasets/ucsd', category='UCSDped2', clip_length_in_frames=2, frames_between_clips=10, target_frame=VideoTargetFrame.LAST, train_batch_size=8, eval_batch_size=8, num_workers=8, train_augmentations=None, val_augmentations=None, test_augmentations=None, augmentations=None, val_split_mode=ValSplitMode.SAME_AS_TEST, val_split_ratio=0.5, seed=None)
Bases: AnomalibVideoDataModule
UCSD Pedestrian DataModule class.
- Parameters:
root (Path | str) – Path to the root directory where the dataset will be downloaded and extracted. Defaults to "./datasets/ucsd".
category (str) – Dataset subcategory. Must be either "UCSDped1" or "UCSDped2". Defaults to "UCSDped2".
clip_length_in_frames (int) – Number of frames in each video clip. Defaults to 2.
frames_between_clips (int) – Number of frames between consecutive video clips. Defaults to 10.
target_frame (VideoTargetFrame) – Specifies which frame in the clip should be used for ground truth. Defaults to VideoTargetFrame.LAST.
train_batch_size (int) – Batch size for training. Defaults to 8.
eval_batch_size (int) – Batch size for validation and testing. Defaults to 8.
num_workers (int) – Number of workers for data loading. Defaults to 8.
train_augmentations (Transform | None) – Augmentations to apply to the training images. Defaults to None.
val_augmentations (Transform | None) – Augmentations to apply to the validation images. Defaults to None.
test_augmentations (Transform | None) – Augmentations to apply to the test images. Defaults to None.
augmentations (Transform | None) – General augmentations to apply if stage-specific augmentations are not provided.
val_split_mode (ValSplitMode) – Determines how validation set is created. Defaults to ValSplitMode.SAME_AS_TEST.
val_split_ratio (float) – Fraction of data to use for validation. Must be between 0 and 1. Defaults to 0.5.
seed (int | None) – Random seed for reproducibility. Defaults to None.
Example
>>> datamodule = UCSDped(root="./datasets/ucsd")
>>> datamodule.setup()  # Downloads and prepares the dataset
>>> train_loader = datamodule.train_dataloader()
>>> val_loader = datamodule.val_dataloader()
>>> test_loader = datamodule.test_dataloader()
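All three datamodules describe augmentations as a general fallback that applies when a stage-specific argument is not provided. The sketch below shows one plausible reading of that rule in plain Python; the helper resolve_augmentations is hypothetical, and the strings stand in for real Transform objects.

```python
def resolve_augmentations(train=None, val=None, test=None, general=None):
    """Pick the augmentation for each stage (hypothetical helper).

    A stage-specific value wins; otherwise the general value is used,
    which may itself be None (no augmentation).
    """
    return {
        "train": train if train is not None else general,
        "val": val if val is not None else general,
        "test": test if test is not None else general,
    }


# Only the training stage has its own augmentation; the others fall back.
resolved = resolve_augmentations(train="train_aug", general="shared_aug")
```

Here resolved maps "train" to "train_aug" and both "val" and "test" to "shared_aug".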