Torch Dataclasses

The torch dataclasses module provides PyTorch-based implementations of the generic dataclasses used in Anomalib. These classes are designed to work with PyTorch tensors for efficient data handling and processing in anomaly detection tasks.


The module includes several categories of dataclasses:

  • Base Classes: Generic PyTorch-based data structures

  • Image Classes: Specialized for image data processing

  • Video Classes: Designed for video data handling

  • Depth Classes: Specific to depth-based anomaly detection

Base Classes


class DatasetItem(image, gt_label=None, gt_mask=None, mask_path=None, anomaly_map=None, pred_score=None, pred_mask=None, pred_label=None, explanation=None)

Bases: Generic[ImageT], _GenericItem[Tensor, ImageT, Mask, str]

Base dataclass for individual items in Anomalib datasets using PyTorch.

This class extends the generic _GenericItem class to provide a PyTorch-specific implementation for single data items in Anomalib datasets. It handles various types of data (e.g., images, labels, masks) represented as PyTorch tensors.

The class uses generic types to allow flexibility in the image representation, which can vary depending on the specific use case (e.g., standard images, video clips).


This class is typically subclassed to create more specific item types (e.g., ImageItem, VideoItem) with additional fields and methods.
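The generic-base pattern described above can be sketched with the standard library alone. The names `SketchItem`, `SketchImageItem`, and `SketchVideoItem` are hypothetical, and nested lists stand in for tensors; this illustrates the idea of parameterizing an item by its image type and is not Anomalib's actual implementation.

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

# ImageT stands in for whatever type holds the image data.
ImageT = TypeVar("ImageT")


@dataclass
class SketchItem(Generic[ImageT]):
    """Hypothetical stand-in for a generic dataset item."""

    image: ImageT
    gt_label: Optional[int] = None
    pred_score: Optional[float] = None


# A "frame" here is a nested list of pixel rows; a real item would hold a tensor.
Frame = list[list[float]]


@dataclass
class SketchImageItem(SketchItem[Frame]):
    """Image variant: one frame plus an image-specific path field."""

    image_path: Optional[str] = None


@dataclass
class SketchVideoItem(SketchItem[list[Frame]]):
    """Video variant: a list of frames plus a video-specific path field."""

    video_path: Optional[str] = None


image_item = SketchImageItem(image=[[0.0, 1.0]], gt_label=0, image_path="a.png")
video_item = SketchVideoItem(image=[[[0.0]], [[1.0]]], video_path="v.mp4")
print(image_item.gt_label, len(video_item.image))  # 0 2
```

Both subclasses inherit the shared label and prediction fields and add only what is specific to their modality.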


class Batch(image, gt_label=None, gt_mask=None, mask_path=None, anomaly_map=None, pred_score=None, pred_mask=None, pred_label=None, explanation=None)

Bases: Generic[ImageT], _GenericBatch[Tensor, ImageT, Mask, list[str]]

Base dataclass for batches of items in Anomalib datasets using PyTorch.

This class extends the generic _GenericBatch class to provide a PyTorch-specific implementation for batches of data in Anomalib datasets. It handles collections of data items (e.g., multiple images, labels, masks) represented as PyTorch tensors.

The class uses generic types to allow flexibility in the image representation, which can vary depending on the specific use case (e.g., standard images, video clips).


This class is typically subclassed to create more specific batch types (e.g., ImageBatch, VideoBatch) with additional fields and methods.


class Tensor | None = None, pred_label: Tensor | None = None, anomaly_map: Tensor | None = None, pred_mask: Tensor | None = None)#

Bases: NamedTuple

Batch for use in torch and inference models.

  • pred_score (torch.Tensor | None) – Predicted anomaly scores. Defaults to None.

  • pred_label (torch.Tensor | None) – Predicted anomaly labels. Defaults to None.

  • anomaly_map (torch.Tensor | None) – Generated anomaly maps. Defaults to None.

  • pred_mask (torch.Tensor | None) – Predicted anomaly masks. Defaults to None.

anomaly_map: Tensor | None

Alias for field number 2

pred_label: Tensor | None

Alias for field number 1

pred_mask: Tensor | None

Alias for field number 3

pred_score: Tensor | None

Alias for field number 0
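Because this class is a NamedTuple, each field can be read by name or by position, which is what the "Alias for field number N" entries refer to. A minimal sketch, assuming a hypothetical `PredictionTuple` with the same four fields and plain Python values in place of tensors:

```python
from typing import NamedTuple, Optional


class PredictionTuple(NamedTuple):
    """Hypothetical stand-in; plain Python values replace tensors."""

    pred_score: Optional[float] = None
    pred_label: Optional[int] = None
    anomaly_map: Optional[list] = None
    pred_mask: Optional[list] = None


out = PredictionTuple(pred_score=0.87, pred_label=1)

# Named access and positional access return the same object.
assert out.pred_score == out[0]
assert out.anomaly_map is None and out[2] is None

# Positions follow declaration order, so the tuple also unpacks cleanly.
score, label, anomaly_map, mask = out
print(PredictionTuple._fields)
```

Fields that are not supplied keep their default of None, so a caller can return only the outputs a model actually produces.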



class ToNumpyMixin

Bases: Generic[NumpyT]

Mixin for converting torch-based dataclasses to numpy.

This mixin provides functionality to convert PyTorch tensor data to numpy arrays. It requires the subclass to define a numpy_class attribute specifying the corresponding numpy-based class.


>>> from dataclasses import dataclass
>>> import torch
>>> from anomalib.dataclasses.numpy import NumpyImageItem
>>> @dataclass
... class TorchImageItem(ToNumpyMixin[NumpyImageItem]):
...     numpy_class = NumpyImageItem
...     image: torch.Tensor
...     gt_label: torch.Tensor
>>> torch_item = TorchImageItem(
...     image=torch.rand(3, 224, 224),
...     gt_label=torch.tensor(1)
... )
>>> numpy_item = torch_item.to_numpy()
>>> isinstance(numpy_item, NumpyImageItem)

to_numpy()

Convert the batch to a NumpyBatch object.

Returns: The converted numpy batch object.
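The mixin pattern (a class attribute names the target class, and a conversion method rebuilds the instance field by field) can be sketched without torch or numpy. Everything here is hypothetical: `ToTargetMixin`, `target_class`, `to_target`, and `_convert` are illustrative names, and the tuple-to-list conversion only stands in for tensor-to-array conversion.

```python
from dataclasses import dataclass, fields
from typing import ClassVar, Generic, TypeVar

TargetT = TypeVar("TargetT")


def _convert(value):
    # Stand-in for tensor -> array conversion: tuples become lists,
    # everything else passes through unchanged.
    return list(value) if isinstance(value, tuple) else value


class ToTargetMixin(Generic[TargetT]):
    """Hypothetical mixin: rebuilds the instance as `target_class`."""

    target_class: ClassVar[type]

    def to_target(self) -> TargetT:
        # Convert every dataclass field, then pass the results to the target class.
        converted = {f.name: _convert(getattr(self, f.name)) for f in fields(self)}
        return self.target_class(**converted)


@dataclass
class ListItem:
    image: list
    gt_label: int


@dataclass
class TupleItem(ToTargetMixin[ListItem]):
    target_class = ListItem  # plain class attribute, not a dataclass field
    image: tuple
    gt_label: int


item = TupleItem(image=(0.1, 0.2, 0.3), gt_label=1).to_target()
print(type(item).__name__, item.image)  # ListItem [0.1, 0.2, 0.3]
```

Keeping the target class as a class attribute means each subclass declares its counterpart once, and the conversion logic stays in a single place.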


Image Classes


class ImageItem(image, gt_label=None, gt_mask=None, mask_path=None, anomaly_map=None, pred_score=None, pred_mask=None, pred_label=None, explanation=None, image_path=None)

Bases: ToNumpyMixin[NumpyImageItem], ImageValidator, _ImageInputFields[str], DatasetItem[Image]

Dataclass for individual image items in Anomalib datasets using PyTorch.

This class combines _ImageInputFields and DatasetItem for image-based anomaly detection. It includes image-specific fields and validation methods to ensure proper formatting for Anomalib’s image-based models.

The class uses the following type parameters:


>>> import torch
>>> from import ImageItem
>>> item = ImageItem(
...     image=torch.rand(3, 224, 224),
...     gt_label=torch.tensor(0),
...     image_path="path/to/image.jpg"
... )
>>> item.image.shape
torch.Size([3, 224, 224])

Convert to numpy format:

>>> numpy_item = item.to_numpy()
>>> type(numpy_item).__name__
'NumpyImageItem'


numpy_class

alias of NumpyImageItem


class ImageBatch(image, gt_label=None, gt_mask=None, mask_path=None, anomaly_map=None, pred_score=None, pred_mask=None, pred_label=None, explanation=None, image_path=None)

Bases: ToNumpyMixin[NumpyImageBatch], BatchIterateMixin[ImageItem], ImageBatchValidator, _ImageInputFields[list[str]], Batch[Image]

Dataclass for batches of image items in Anomalib datasets using PyTorch.

This class combines _ImageInputFields and Batch for batches of image data. It includes image-specific fields and methods for batch operations and iteration.

The class uses the following type parameters:

Where B represents the batch dimension.


>>> import torch
>>> from import ImageBatch
>>> batch = ImageBatch(
...     image=torch.rand(32, 3, 224, 224),
...     gt_label=torch.randint(0, 2, (32,)),
...     image_path=[f"path/to/image_{i}.jpg" for i in range(32)]
... )
>>> batch.image.shape
torch.Size([32, 3, 224, 224])

Iterate over batch:

>>> for item in batch:
...     assert item.image.shape == torch.Size([3, 224, 224])

Convert to numpy format:

>>> numpy_batch = batch.to_numpy()
>>> type(numpy_batch).__name__
'NumpyImageBatch'


alias of ImageItem


numpy_class

alias of NumpyImageBatch
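Iterating over a batch yields one item per index along the batch dimension. The following is a rough sketch of that behavior using nested lists in place of tensors; `SketchItem`, `SketchBatch`, and `item_cls` are hypothetical names, and the real mixin may slice fields differently.

```python
from dataclasses import dataclass, fields
from typing import ClassVar, Iterator


@dataclass
class SketchItem:
    image: list       # stand-in for a (C, H, W) tensor
    gt_label: int
    image_path: str


@dataclass
class SketchBatch:
    # Hypothetical attribute naming the per-item class; ClassVar keeps it out of the fields.
    item_cls: ClassVar[type] = SketchItem

    image: list       # stand-in for a (B, C, H, W) tensor
    gt_label: list
    image_path: list

    def __len__(self) -> int:
        return len(self.image)

    def __iter__(self) -> Iterator[SketchItem]:
        # Take index i of every field along the batch dimension.
        for i in range(len(self)):
            yield self.item_cls(**{f.name: getattr(self, f.name)[i] for f in fields(self)})


batch = SketchBatch(
    image=[[[0.0]], [[1.0]], [[2.0]]],
    gt_label=[0, 1, 0],
    image_path=[f"path/to/image_{i}.jpg" for i in range(3)],
)
items = list(batch)
print(len(items), items[1].gt_label, items[1].image_path)  # 3 1 path/to/image_1.jpg
```

Each yielded item carries the same field names as the batch, with the leading batch dimension removed.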

Video Classes


class VideoItem(image, gt_label=None, gt_mask=None, mask_path=None, anomaly_map=None, pred_score=None, pred_mask=None, pred_label=None, explanation=None, original_image=None, video_path=None, target_frame=None, frames=None, last_frame=None)

Bases: ToNumpyMixin[NumpyVideoItem], VideoValidator, _VideoInputFields[Tensor, Video, Mask, str], DatasetItem[Video]

Dataclass for individual video items in Anomalib datasets using PyTorch.

This class combines _VideoInputFields and DatasetItem for video-based anomaly detection. It includes video-specific fields and validation methods to ensure proper formatting for Anomalib’s video-based models.

The class uses the following type parameters:

Where T represents the temporal dimension (number of frames).


>>> import torch
>>> from import VideoItem
>>> item = VideoItem(
...     image=torch.rand(10, 3, 224, 224),  # 10 frames
...     gt_label=torch.tensor(0),
...     video_path="path/to/video.mp4"
... )
>>> item.image.shape
torch.Size([10, 3, 224, 224])

Convert to numpy format:

>>> numpy_item = item.to_numpy()
>>> type(numpy_item).__name__
'NumpyVideoItem'


numpy_class

alias of NumpyVideoItem


Convert the video item to an image item.

Return type:



class VideoBatch(image, gt_label=None, gt_mask=None, mask_path=None, anomaly_map=None, pred_score=None, pred_mask=None, pred_label=None, explanation=None, original_image=None, video_path=None, target_frame=None, frames=None, last_frame=None)

Bases: ToNumpyMixin[NumpyVideoBatch], BatchIterateMixin[VideoItem], VideoBatchValidator, _VideoInputFields[Tensor, Video, Mask, list[str]], Batch[Video]

Dataclass for batches of video items in Anomalib datasets using PyTorch.

This class represents batches of video data for batch processing in anomaly detection tasks. It combines functionality from multiple mixins to handle batched video data efficiently.

The class uses the following type parameters:

Where B represents the batch dimension and T the temporal dimension.


>>> import torch
>>> from import VideoBatch
>>> batch = VideoBatch(
...     image=torch.rand(32, 10, 3, 224, 224),  # 32 videos, 10 frames
...     gt_label=torch.randint(0, 2, (32,)),
...     video_path=["video_{}.mp4".format(i) for i in range(32)]
... )
>>> batch.image.shape
torch.Size([32, 10, 3, 224, 224])

Iterate over items in batch:

>>> next(iter(batch)).image.shape
torch.Size([10, 3, 224, 224])

Convert to numpy format:

>>> numpy_batch = batch.to_numpy()
>>> type(numpy_batch).__name__
'NumpyVideoBatch'


alias of VideoItem


numpy_class

alias of NumpyVideoBatch

Depth Classes


class DepthItem(image, gt_label=None, gt_mask=None, mask_path=None, anomaly_map=None, pred_score=None, pred_mask=None, pred_label=None, explanation=None, image_path=None, depth_map=None, depth_path=None)

Bases: ToNumpyMixin[NumpyImageItem], DepthValidator, _DepthInputFields[Tensor, str], DatasetItem[Image]

Dataclass for individual depth items in Anomalib datasets using PyTorch.

This class represents a single depth item in Anomalib datasets using PyTorch tensors. It combines the functionality of ToNumpyMixin, _DepthInputFields, and DatasetItem to handle depth data, including depth maps, labels, and metadata.

  • image (torch.Tensor) – Image tensor of shape (C, H, W).

  • gt_label (torch.Tensor) – Ground truth label tensor.

  • depth_map (torch.Tensor) – Depth map tensor of shape (H, W).

  • image_path (str) – Path to the source image file.

  • depth_path (str) – Path to the depth map file.


>>> import torch
>>> item = DepthItem(
...     image=torch.rand(3, 224, 224),
...     gt_label=torch.tensor(1),
...     depth_map=torch.rand(224, 224),
...     image_path="path/to/image.jpg",
...     depth_path="path/to/depth.png"
... )
>>> print(item.image.shape, item.depth_map.shape)
torch.Size([3, 224, 224]) torch.Size([224, 224])

numpy_class

alias of NumpyImageItem


class DepthBatch(image, gt_label=None, gt_mask=None, mask_path=None, anomaly_map=None, pred_score=None, pred_mask=None, pred_label=None, explanation=None, image_path=None, depth_map=None, depth_path=None)

Bases: BatchIterateMixin[DepthItem], DepthBatchValidator, _DepthInputFields[Tensor, list[str]], Batch[Image]

Dataclass for batches of depth items in Anomalib datasets using PyTorch.

This class represents a batch of depth items in Anomalib datasets using PyTorch tensors. It combines the functionality of BatchIterateMixin, _DepthInputFields, and Batch to handle batches of depth data, including depth maps, labels, and metadata.

  • image (torch.Tensor) – Batch of images of shape (B, C, H, W).

  • gt_label (torch.Tensor) – Batch of ground truth labels of shape (B,).

  • depth_map (torch.Tensor) – Batch of depth maps of shape (B, H, W).

  • image_path (list[str]) – List of paths to the source image files.

  • depth_path (list[str]) – List of paths to the depth map files.


>>> import torch
>>> batch = DepthBatch(
...     image=torch.rand(32, 3, 224, 224),
...     gt_label=torch.randint(0, 2, (32,)),
...     depth_map=torch.rand(32, 224, 224),
...     image_path=["path/to/image_{}.jpg".format(i) for i in range(32)],
...     depth_path=["path/to/depth_{}.png".format(i) for i in range(32)]
... )
>>> print(batch.image.shape, batch.depth_map.shape)
torch.Size([32, 3, 224, 224]) torch.Size([32, 224, 224])
>>> for item in batch:
...     print(item.image.shape, item.depth_map.shape)
torch.Size([3, 224, 224]) torch.Size([224, 224])

alias of DepthItem

