Video Datamodules#

Video datamodules in Anomalib are designed to handle video-based anomaly detection datasets. They provide a standardized interface for loading and processing video data for both training and inference.

Available Datamodules#

Avenue

CUHK Avenue dataset for video anomaly detection.

anomalib.data.Avenue

ShanghaiTech

ShanghaiTech dataset for video anomaly detection.

anomalib.data.ShanghaiTech

UCSDped

UCSD Pedestrian dataset for video anomaly detection.

anomalib.data.UCSDped

API Reference#

Anomalib Datasets.

This module provides datasets and data modules for anomaly detection tasks.

The module contains:

Data classes for representing different types of data (images, videos, etc.)
Dataset classes for loading and processing data
Data modules for use with PyTorch Lightning
Helper functions for data loading and validation

Example

>>> from anomalib.data import MVTecAD
>>> datamodule = MVTecAD(
...     root="./datasets/MVTecAD",
...     category="bottle",
...     image_size=(256, 256)
... )

class anomalib.data.Avenue(root='./datasets/avenue', gt_dir='./datasets/avenue/ground_truth_demo', clip_length_in_frames=2, frames_between_clips=1, target_frame=VideoTargetFrame.LAST, train_batch_size=32, eval_batch_size=32, num_workers=8, train_augmentations=None, val_augmentations=None, test_augmentations=None, augmentations=None, val_split_mode=ValSplitMode.SAME_AS_TEST, val_split_ratio=0.5, seed=None)#

Bases: AnomalibVideoDataModule

Avenue DataModule class.

Parameters:

root (Path | str) – Path to the root of the dataset. Defaults to "./datasets/avenue".
gt_dir (Path | str) – Path to the ground truth files. Defaults to "./datasets/avenue/ground_truth_demo".
clip_length_in_frames (int) – Number of video frames in each clip. Defaults to 2.
frames_between_clips (int) – Number of frames between consecutive clips. Defaults to 1.
target_frame (VideoTargetFrame | str) – Target frame in clip for ground truth. Defaults to VideoTargetFrame.LAST.
train_batch_size (int) – Training batch size. Defaults to 32.
eval_batch_size (int) – Batch size for validation and testing. Defaults to 32.
num_workers (int) – Number of workers. Defaults to 8.
train_augmentations (Transform | None) – Augmentations to apply to the training images. Defaults to None.
val_augmentations (Transform | None) – Augmentations to apply to the validation images. Defaults to None.
test_augmentations (Transform | None) – Augmentations to apply to the test images. Defaults to None.
augmentations (Transform | None) – General augmentations to apply if stage-specific augmentations are not provided. Defaults to None.
val_split_mode (ValSplitMode | str) – How validation subset is obtained. Defaults to ValSplitMode.SAME_AS_TEST.
val_split_ratio (float) – Fraction of data reserved for validation. Defaults to 0.5.
seed (int | None) – Seed for reproducibility. Defaults to None.
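The relationship between augmentations and the stage-specific arguments can be pictured with a small sketch. It assumes that a stage-specific value, when given, takes precedence and that augmentations fills in any stage left as None. The helper resolve_augmentations and the placeholder transforms are hypothetical names written for illustration; they are not part of Anomalib.

```python
from typing import Callable, Dict, Optional

# Stand-in type for a transform: any callable that maps an input to an output.
Transform = Callable[[object], object]


def resolve_augmentations(
    train: Optional[Transform] = None,
    val: Optional[Transform] = None,
    test: Optional[Transform] = None,
    general: Optional[Transform] = None,
) -> Dict[str, Optional[Transform]]:
    """Pick the stage-specific transform if given, else fall back to `general`.

    Hypothetical helper: it mirrors the documented fallback rule only.
    """
    return {
        "train": train if train is not None else general,
        "val": val if val is not None else general,
        "test": test if test is not None else general,
    }


def flip(x):
    """Placeholder training-only transform used for this demo."""
    return x


def identity(x):
    """Placeholder general transform used for this demo."""
    return x


# Only the training stage gets its own transform; val and test fall back.
resolved = resolve_augmentations(train=flip, general=identity)
```

Under these assumptions, the training stage uses flip while validation and test both use identity.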

Example

Create the datamodule and inspect one training batch:

>>> datamodule = Avenue(
...     clip_length_in_frames=2,
...     frames_between_clips=1,
...     target_frame=VideoTargetFrame.LAST
... )
>>> datamodule.setup()
>>> i, data = next(enumerate(datamodule.train_dataloader()))
>>> data["image"].shape
torch.Size([32, 2, 3, 256, 256])

Notes

The dataloader returns batches of clips, where each clip contains clip_length_in_frames consecutive frames. frames_between_clips determines frame spacing between clips. target_frame specifies which frame provides ground truth.
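The interplay of the three clip parameters can be sketched in plain Python. The sketch assumes a sliding-window reading in which frames_between_clips is the step between the first frames of consecutive clips; that reading is an assumption, not something this page states. The function names and the "first" option are hypothetical and appear only for contrast with the documented LAST default.

```python
def clip_frame_indices(num_frames, clip_length_in_frames=2, frames_between_clips=1):
    """List the frame indices of every clip in a video of `num_frames` frames.

    Assumption: clips are sliding windows whose start frames are
    `frames_between_clips` apart; incomplete trailing clips are dropped.
    """
    if clip_length_in_frames < 1 or frames_between_clips < 1:
        raise ValueError("clip length and step must be positive integers")
    last_start = num_frames - clip_length_in_frames
    return [
        list(range(start, start + clip_length_in_frames))
        for start in range(0, last_start + 1, frames_between_clips)
    ]


def target_frame_index(clip, target_frame="last"):
    """Return the frame of `clip` that would supply the ground truth."""
    positions = {"first": 0, "last": len(clip) - 1}
    return clip[positions[target_frame]]


# Two-frame clips with a step of one frame overlap by one frame.
dense = clip_frame_indices(num_frames=6, clip_length_in_frames=2, frames_between_clips=1)

# Two-frame clips with a step of ten frames leave gaps between clips.
sparse = clip_frame_indices(num_frames=25, clip_length_in_frames=2, frames_between_clips=10)
```

With the defaults shown in the signature (clip length 2, step 1), each frame except the first and last would appear in two clips, and the label of each clip would come from its second frame.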

prepare_data()#

Download the dataset if not available.

This method checks if the specified dataset is available in the file system. If not, it downloads and extracts the dataset into the appropriate directory.

Return type: None

Example

Assume the dataset is not available on the file system:

>>> datamodule = Avenue()
>>> datamodule.prepare_data()

The directory structure after preparation will be:

datasets/
└── avenue/
    ├── ground_truth_demo/
    │   ├── ground_truth_show.m
    │   ├── Readme.txt
    │   ├── testing_label_mask/
    │   └── testing_videos/
    ├── testing_videos/
    │   ├── ...
    │   └── 21.avi
    ├── testing_vol/
    │   ├── ...
    │   └── vol21.mat
    ├── training_videos/
    │   ├── ...
    │   └── 16.avi
    └── training_vol/
        ├── ...
        └── vol16.mat
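A quick way to confirm that preparation produced the layout above is to check for the five top-level directories. The sketch below uses only the standard library; missing_avenue_dirs is a hypothetical helper, and the directory names are taken from the tree shown here.

```python
from pathlib import Path

# Top-level directories listed in the tree above.
EXPECTED_AVENUE_DIRS = (
    "ground_truth_demo",
    "testing_videos",
    "testing_vol",
    "training_videos",
    "training_vol",
)


def missing_avenue_dirs(root="./datasets/avenue"):
    """Return the expected sub-directories that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_AVENUE_DIRS if not (root / name).is_dir()]
```

An empty list suggests the top-level layout is in place; it does not verify that every video or label file was extracted.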

class anomalib.data.ShanghaiTech(root='./datasets/shanghaitech', scene=1, clip_length_in_frames=2, frames_between_clips=1, target_frame=VideoTargetFrame.LAST, train_batch_size=32, eval_batch_size=32, num_workers=8, train_augmentations=None, val_augmentations=None, test_augmentations=None, augmentations=None, val_split_mode=ValSplitMode.SAME_AS_TEST, val_split_ratio=0.5, seed=None)#

Bases: AnomalibVideoDataModule

ShanghaiTech DataModule class.

Parameters:

root (Path | str) – Path to the root directory of the dataset. Defaults to "./datasets/shanghaitech".
scene (int) – Scene index in range [1, 13]. Defaults to 1.
clip_length_in_frames (int) – Number of frames in each video clip. Defaults to 2.
frames_between_clips (int) – Number of frames between consecutive clips. Defaults to 1.
target_frame (VideoTargetFrame) – Specifies which frame in the clip should be used for ground truth. Defaults to VideoTargetFrame.LAST.
train_batch_size (int) – Training batch size. Defaults to 32.
eval_batch_size (int) – Batch size for validation and testing. Defaults to 32.
num_workers (int) – Number of workers for data loading. Defaults to 8.
train_augmentations (Transform | None) – Augmentations to apply to the training images. Defaults to None.
val_augmentations (Transform | None) – Augmentations to apply to the validation images. Defaults to None.
test_augmentations (Transform | None) – Augmentations to apply to the test images. Defaults to None.
augmentations (Transform | None) – General augmentations to apply if stage-specific augmentations are not provided. Defaults to None.
val_split_mode (ValSplitMode) – Setting that determines how validation subset is obtained. Defaults to ValSplitMode.SAME_AS_TEST.
val_split_ratio (float) – Fraction of train or test images that will be reserved for validation. Defaults to 0.5.
seed (int | None) – Random seed for reproducibility. Defaults to None.

prepare_data()#

Download the dataset and convert video files.

Return type: None
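Because scene selects one of 13 scenes, a common pattern is to build one configuration per scene. The sketch below only assembles keyword arguments that match the documented signature and validates the scene range; shanghaitech_kwargs is a hypothetical helper and nothing here imports Anomalib.

```python
# Scene indices 1..13, as documented for the `scene` parameter.
SCENES = range(1, 14)


def shanghaitech_kwargs(scene, root="./datasets/shanghaitech"):
    """Build keyword arguments for one scene, using the documented defaults."""
    if scene not in SCENES:
        raise ValueError(f"scene must be in [1, 13], got {scene}")
    return {
        "root": root,
        "scene": scene,
        "clip_length_in_frames": 2,
        "frames_between_clips": 1,
        "train_batch_size": 32,
        "eval_batch_size": 32,
    }


# One configuration per scene.
configs = [shanghaitech_kwargs(scene) for scene in SCENES]
```

With Anomalib installed, each dictionary could then be passed on as ShanghaiTech(**kwargs).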

class anomalib.data.UCSDped(root='./datasets/ucsd', category='UCSDped2', clip_length_in_frames=2, frames_between_clips=10, target_frame=VideoTargetFrame.LAST, train_batch_size=8, eval_batch_size=8, num_workers=8, train_augmentations=None, val_augmentations=None, test_augmentations=None, augmentations=None, val_split_mode=ValSplitMode.SAME_AS_TEST, val_split_ratio=0.5, seed=None)#

Bases: AnomalibVideoDataModule

UCSD Pedestrian DataModule Class.

Parameters:

root (Path | str) – Path to the root directory where the dataset will be downloaded and extracted. Defaults to "./datasets/ucsd".
category (str) – Dataset subcategory. Must be either "UCSDped1" or "UCSDped2". Defaults to "UCSDped2".
clip_length_in_frames (int) – Number of frames in each video clip. Defaults to 2.
frames_between_clips (int) – Number of frames between consecutive video clips. Defaults to 10.
target_frame (VideoTargetFrame) – Specifies which frame in the clip should be used for ground truth. Defaults to VideoTargetFrame.LAST.
train_batch_size (int) – Batch size for training. Defaults to 8.
eval_batch_size (int) – Batch size for validation and testing. Defaults to 8.
num_workers (int) – Number of workers for data loading. Defaults to 8.
train_augmentations (Transform | None) – Augmentations to apply to the training images. Defaults to None.
val_augmentations (Transform | None) – Augmentations to apply to the validation images. Defaults to None.
test_augmentations (Transform | None) – Augmentations to apply to the test images. Defaults to None.
augmentations (Transform | None) – General augmentations to apply if stage-specific augmentations are not provided. Defaults to None.
val_split_mode (ValSplitMode) – Determines how validation set is created. Defaults to ValSplitMode.SAME_AS_TEST.
val_split_ratio (float) – Fraction of data to use for validation. Must be between 0 and 1. Defaults to 0.5.
seed (int | None) – Random seed for reproducibility. Defaults to None.
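To make val_split_ratio and seed concrete, the sketch below shows a generic ratio-based split with a seeded shuffle. It is an illustration only: split_for_validation is a hypothetical helper, and this page does not state which split modes consult the ratio or how Anomalib draws the subset.

```python
import random


def split_for_validation(items, val_split_ratio=0.5, seed=None):
    """Split `items` into (remaining, validation) using a seeded shuffle."""
    if not 0.0 <= val_split_ratio <= 1.0:
        raise ValueError("val_split_ratio must be between 0 and 1")
    rng = random.Random(seed)  # a fixed seed gives the same split every run
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_split_ratio)
    return shuffled[n_val:], shuffled[:n_val]


# Reserve half of ten hypothetical clip indices for validation.
remaining, validation = split_for_validation(range(10), val_split_ratio=0.5, seed=0)
```

Repeating the call with the same seed returns the same two subsets, which is the practical meaning of a reproducibility seed.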

Example

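>>> from anomalib.data import UCSDped
>>> datamodule = UCSDped(root="./datasets/ucsd")
>>> datamodule.prepare_data()  # Downloads and extracts the dataset if missing
>>> datamodule.setup()  # Creates the train, validation, and test datasets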
>>> train_loader = datamodule.train_dataloader()
>>> val_loader = datamodule.val_dataloader()
>>> test_loader = datamodule.test_dataloader()

prepare_data()#

Download and extract the dataset if not already available.

The method checks if the dataset directory exists. If not, it downloads and extracts the dataset to the specified root directory.

Return type: None
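The check-then-download behaviour described above follows a familiar idempotent pattern, sketched here with the standard library. Both prepare_once and the download_and_extract callable are hypothetical stand-ins; the actual download logic in Anomalib is not shown on this page.

```python
from pathlib import Path


def prepare_once(root, download_and_extract):
    """Call `download_and_extract(root)` only if `root` does not exist yet.

    Returns True when a download was triggered and False when it was skipped.
    """
    root = Path(root)
    if root.is_dir():
        return False  # dataset directory already present, nothing to do
    download_and_extract(root)
    return True
```

Calling the function a second time is then a cheap no-op, which is why prepare_data() can safely be invoked on every run.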
