Synthetic Data Utils#

Utilities to generate synthetic data.

This module provides utilities for generating synthetic data for anomaly detection. The utilities include:

  • Perlin noise generation: Functions for creating Perlin noise patterns

  • Anomaly generation: Classes for generating synthetic anomalies

Example

>>> from anomalib.data.utils.generators import generate_perlin_noise
>>> # Generate 256x256 Perlin noise
>>> noise = generate_perlin_noise(256, 256)
>>> print(noise.shape)
torch.Size([256, 256])
>>> from anomalib.data.utils.generators import PerlinAnomalyGenerator
>>> # Create anomaly generator
>>> generator = PerlinAnomalyGenerator()
>>> # Apply the generator to an image to obtain the augmented image and mask
>>> import torch
>>> image = torch.randn(3, 256, 256)  # [C, H, W]
>>> augmented_image, anomaly_mask = generator(image)
class anomalib.data.utils.generators.PerlinAnomalyGenerator(anomaly_source_path=None, probability=0.5, blend_factor=(0.2, 1.0), rotation_range=(-90, 90))#

Bases: Transform

Perlin noise-based synthetic anomaly generator.

This class provides functionality to generate synthetic anomalies using Perlin noise patterns. It can also use real anomaly source images for more realistic anomaly generation.

Parameters:
  • anomaly_source_path (str | None) – Optional path to directory containing anomaly source images. If provided, these images will be used instead of Perlin noise patterns.

  • probability (float) – Probability of applying the anomaly transformation to an image. Default: 0.5.

  • blend_factor (float | tuple[float, float]) – Factor determining how much of the anomaly to blend with the original image. Can be a float or a tuple of (min, max). Default: (0.2, 1.0).

  • rotation_range (tuple[float, float]) – Range of rotation angles in degrees for the Perlin noise pattern. Default: (-90, 90).

Example

>>> # Single image usage with default parameters
>>> transform = PerlinAnomalyGenerator()
>>> image = torch.randn(3, 256, 256)  # [C, H, W]
>>> augmented_image, anomaly_mask = transform(image)
>>> print(augmented_image.shape)  # [C, H, W]
>>> print(anomaly_mask.shape)  # [1, H, W]
>>> # Batch usage with custom parameters
>>> transform = PerlinAnomalyGenerator(
...     probability=0.8,
...     blend_factor=0.5
... )
>>> batch = torch.randn(4, 3, 256, 256)  # [B, C, H, W]
>>> augmented_batch, anomaly_masks = transform(batch)
>>> print(augmented_batch.shape)  # [B, C, H, W]
>>> print(anomaly_masks.shape)  # [B, 1, H, W]
>>> # Using anomaly source images
>>> transform = PerlinAnomalyGenerator(
...     anomaly_source_path='path/to/anomaly/images',
...     probability=0.7,
...     blend_factor=(0.3, 0.9),
...     rotation_range=(-45, 45)
... )
>>> augmented_image, anomaly_mask = transform(image)
forward(img)#

Apply the anomaly augmentation to a single image or a batch of images.

Parameters:

img (Tensor) – Input image tensor of shape [C, H, W] or batch tensor of shape [B, C, H, W].

Returns:

Tuple containing:
  • Augmented image tensor of same shape as input

  • Mask tensor of shape [1, H, W] or [B, 1, H, W]

Return type:

tuple[torch.Tensor, torch.Tensor]
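A minimal sketch of the mask-based blend that forward performs conceptually. NumPy is used here for illustration and `blend_anomaly` is a hypothetical helper, not part of the library; the actual implementation operates on torch tensors and may differ in detail:

```python
import numpy as np

def blend_anomaly(image, perturbation, mask, blend_factor=0.5):
    """Blend a perturbation into an image where mask == 1.

    image:        [C, H, W] float array (the normal input)
    perturbation: [C, H, W] float array (the synthetic anomaly)
    mask:         [1, H, W] binary array (anomalous pixels)
    """
    beta = mask * blend_factor
    blended = image * (1 - beta) + perturbation * beta
    return blended, mask

rng = np.random.default_rng(0)
image = rng.random((3, 8, 8)).astype(np.float32)
perturbation = rng.random((3, 8, 8)).astype(np.float32)
mask = np.zeros((1, 8, 8), dtype=np.float32)
mask[0, 2:5, 2:5] = 1.0  # square anomalous region

augmented, out_mask = blend_anomaly(image, perturbation, mask)
# Pixels outside the mask are unchanged; masked pixels are a convex blend.
```

Because the mask broadcasts over the channel axis, the output keeps the input's shape, matching the [C, H, W] in / [1, H, W] mask contract above.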

generate_perturbation(height, width, device=None, anomaly_source_path=None)#

Generate perturbed image and mask.

Parameters:
  • height (int) – Height of the output image.

  • width (int) – Width of the output image.

  • device (device | None) – Device to generate the perturbation on.

  • anomaly_source_path (Path | str | None) – Optional path to source image for anomaly.

Returns:

Tuple containing:
  • Perturbation tensor of shape [H, W, C]

  • Mask tensor of shape [H, W, 1]

Return type:

tuple[torch.Tensor, torch.Tensor]
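Note the layout difference: generate_perturbation returns channels-last tensors ([H, W, C] and [H, W, 1]), while forward works channels-first ([C, H, W]). A minimal sketch of the conversion, shown with NumPy for illustration (with torch tensors, `Tensor.permute` plays the same role):

```python
import numpy as np

# Channels-last perturbation and mask, as generate_perturbation returns them
perturbation = np.zeros((256, 256, 3), dtype=np.float32)  # [H, W, C]
mask = np.zeros((256, 256, 1), dtype=np.float32)          # [H, W, 1]

# Move the channel axis to the front for the channels-first convention
perturbation_chw = np.transpose(perturbation, (2, 0, 1))  # [C, H, W]
mask_chw = np.transpose(mask, (2, 0, 1))                  # [1, H, W]
```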

anomalib.data.utils.generators.generate_perlin_noise(height, width, scale=None, device=None)#

Generate a Perlin noise pattern.

This function generates a Perlin noise pattern using a grid-based gradient noise approach. The noise is generated by interpolating between randomly generated gradient vectors at grid vertices. The interpolation uses a quintic curve for smooth transitions.

Parameters:
  • height (int) – Desired height of the noise pattern.

  • width (int) – Desired width of the noise pattern.

  • scale (tuple[int, int] | None) – Tuple of (scale_x, scale_y) for noise granularity. If None, random scales will be used. Larger scales produce coarser noise patterns, while smaller scales produce finer patterns.

  • device (device | None) – Device to generate the noise on. If None, uses current default device.

Returns:

Tensor of shape [height, width] containing the noise pattern, with values roughly in the [-1, 1] range.

Return type:

torch.Tensor

Example

>>> # Generate 256x256 noise with default random scale
>>> noise = generate_perlin_noise(256, 256)
>>> print(noise.shape)
torch.Size([256, 256])
>>> # Generate 512x512 noise with fixed scale
>>> noise = generate_perlin_noise(512, 512, scale=(8, 8))
>>> print(noise.shape)
torch.Size([512, 512])
>>> # Generate noise on GPU if available
>>> device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
>>> noise = generate_perlin_noise(128, 128, device=device)
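The quintic interpolation curve mentioned in the description is, in standard Perlin noise, the fade function f(t) = 6t^5 - 15t^4 + 10t^3, chosen so that its first and second derivatives vanish at t = 0 and t = 1 (that anomalib uses exactly this polynomial is an assumption here):

```python
def quintic_fade(t):
    """Standard Perlin fade curve: a smooth step whose first and second
    derivatives are zero at both endpoints, avoiding visible grid seams."""
    return t * t * t * (t * (t * 6 - 15) + 10)

quintic_fade(0.0)  # 0.0 -- endpoints map to themselves
quintic_fade(1.0)  # 1.0
quintic_fade(0.5)  # 0.5 -- symmetric about the midpoint
```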

Dataset that generates synthetic anomalies.

This module provides functionality to generate synthetic anomalies when real anomalous data is scarce or unavailable. It includes:

  • A dataset class that generates synthetic anomalies from normal images

  • Functions to convert normal samples into synthetic anomalous samples

  • Perlin noise-based anomaly generation

  • Temporary file management for synthetic data

Example

>>> from anomalib.data.utils.synthetic import SyntheticAnomalyDataset
>>> # Create synthetic dataset from normal samples
>>> synthetic_dataset = SyntheticAnomalyDataset(
...     augmentations=transforms,
...     source_samples=normal_samples
... )
>>> len(synthetic_dataset)  # 50/50 normal/anomalous split
200
class anomalib.data.utils.synthetic.SyntheticAnomalyDataset(augmentations, source_samples)#

Bases: AnomalibDataset

Dataset for generating and managing synthetic anomalies.

The dataset creates synthetic anomalous images by applying Perlin noise-based perturbations to normal images. The synthetic images are stored in a temporary directory that is cleaned up when the dataset object is deleted.

Parameters:
  • augmentations (Transform | None) – Transform object describing the input data augmentations.

  • source_samples (DataFrame) – DataFrame containing normal samples used as source for synthetic anomalies.

Example

>>> transform = Compose([...])
>>> dataset = SyntheticAnomalyDataset(
...     augmentations=transform,
...     source_samples=normal_df
... )
>>> len(dataset)  # 50/50 normal/anomalous split
100
classmethod from_dataset(dataset)#

Create synthetic dataset from existing dataset of normal images.

Parameters:

dataset (AnomalibDataset) – Dataset containing only normal images to convert into a synthetic dataset with 50/50 normal/anomalous split.

Return type:

SyntheticAnomalyDataset

Returns:

New synthetic anomaly dataset.

Example

>>> normal_dataset = Dataset(...)
>>> synthetic = SyntheticAnomalyDataset.from_dataset(normal_dataset)
anomalib.data.utils.synthetic.make_synthetic_dataset(source_samples, image_dir, mask_dir, anomalous_ratio=0.5)#

Convert normal samples into a mixed set with synthetic anomalies.

The function generates synthetic anomalous images and their corresponding masks by applying Perlin noise-based perturbations to normal images.

Parameters:
  • source_samples (DataFrame) – DataFrame containing normal images used as source for synthetic anomalies. Must contain columns: image_path, label, label_index, mask_path, and split.

  • image_dir (Path) – Directory where synthetic anomalous images will be saved.

  • mask_dir (Path) – Directory where ground truth anomaly masks will be saved.

  • anomalous_ratio (float) – Fraction of source samples to convert to anomalous samples. Defaults to 0.5.

Return type:

DataFrame

Returns:

DataFrame containing both normal and synthetic anomalous samples.

Example

>>> df = make_synthetic_dataset(
...     source_samples=normal_df,
...     image_dir=Path("./synthetic/images"),
...     mask_dir=Path("./synthetic/masks"),
...     anomalous_ratio=0.3
... )
>>> len(df[df.label == "abnormal"])  # 30% of 100 source samples are anomalous
30
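The 30 above follows from a source set of 100 samples: the anomalous count is presumably the integer part of len(source_samples) * anomalous_ratio. A hypothetical helper illustrating the arithmetic (split_counts is not part of the library):

```python
def split_counts(n_source, anomalous_ratio=0.5):
    """Return (n_normal, n_anomalous) for a given source size and ratio."""
    n_anomalous = int(n_source * anomalous_ratio)
    return n_source - n_anomalous, n_anomalous

split_counts(100, anomalous_ratio=0.3)  # (70, 30)
split_counts(200, anomalous_ratio=0.5)  # (100, 100), the default 50/50 split
```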