anomalib.data.btech

BTech Dataset.

This module contains the PyTorch Lightning DataModule for the BTech dataset.

If the dataset is not on the file system, the module downloads and extracts it, then creates the PyTorch data objects.

Module Contents

Classes

BTechDataset

BTech PyTorch Dataset.

BTech

BTechDataModule Lightning Data Module.

Functions

make_btech_dataset(path: pathlib.Path, split: Optional[str] = None, split_ratio: float = 0.1, seed: int = 0, create_validation_set: bool = False) → pandas.core.frame.DataFrame

Create BTech samples by parsing the BTech data file structure.

Attributes

anomalib.data.btech.logger

anomalib.data.btech.make_btech_dataset(path: pathlib.Path, split: Optional[str] = None, split_ratio: float = 0.1, seed: int = 0, create_validation_set: bool = False) → pandas.core.frame.DataFrame

Create BTech samples by parsing the BTech data file structure.

The files are expected to follow the structure:

path/to/dataset/split/category/image_filename.png
path/to/dataset/ground_truth/category/mask_filename.png

Parameters
  • path (Path) – Path to dataset

  • split (str, optional) – Dataset split (i.e., either train or test). Defaults to None.

  • split_ratio (float, optional) – Fraction of normal training images to move to the test set when the test set does not contain any normal images. Defaults to 0.1.

  • seed (int, optional) – Random seed to ensure reproducibility when splitting. Defaults to 0.

  • create_validation_set (bool, optional) – Whether to create a validation set from the test set. The BTech dataset does not contain a validation set, so set this flag to True to create one. Defaults to False.
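
As a rough illustration of how split_ratio and seed interact, the sketch below moves a seeded random fraction of normal training samples into the test set. The helper name split_normal_samples and the plain-list representation are assumptions made for illustration; this is not anomalib's implementation.

```python
import random
from typing import List, Tuple


def split_normal_samples(
    train_samples: List[str], split_ratio: float = 0.1, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: move a fraction of normal train samples to test."""
    if not 0.0 <= split_ratio <= 1.0:
        raise ValueError("split_ratio must be between 0 and 1")
    # A dedicated Random instance keeps the split reproducible for a given seed.
    rng = random.Random(seed)
    num_moved = int(len(train_samples) * split_ratio)
    moved_indices = set(rng.sample(range(len(train_samples)), num_moved))
    kept = [s for i, s in enumerate(train_samples) if i not in moved_indices]
    moved = [s for i, s in enumerate(train_samples) if i in moved_indices]
    return kept, moved


samples = [f"train/ok/{i:03d}.bmp" for i in range(20)]
train, extra_test = split_normal_samples(samples, split_ratio=0.1, seed=0)
print(len(train), len(extra_test))  # 18 2
```

Calling the helper again with the same seed returns the same partition, which is the property the seed parameter is meant to provide.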

Example

The following example shows how to get training samples from BTech 01 category:

>>> root = Path('./BTech')
>>> category = '01'
>>> path = root / category
>>> path
PosixPath('BTech/01')
>>> samples = make_btech_dataset(path, split='train', split_ratio=0.1, seed=0)
>>> samples.head()
   path      split  label  image_path                 mask_path                         label_index
0  BTech/01  train  01     BTech/01/train/ok/105.bmp  BTech/01/ground_truth/ok/105.png  0
1  BTech/01  train  01     BTech/01/train/ok/017.bmp  BTech/01/ground_truth/ok/017.png  0
...

Returns

An output DataFrame containing the samples for the requested split (i.e., train or test).

Return type

DataFrame
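
The directory layout described above can be walked with pathlib alone. The following is a minimal sketch under stated assumptions: collect_samples is a hypothetical helper, masks are assumed to share the image stem with a .png extension, "ok" is assumed to mark normal images, and the "ko" folder is a placeholder name for an anomalous category. It returns plain dicts rather than a DataFrame and is not the library's implementation.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def collect_samples(path: Path, split: str) -> List[Dict[str, object]]:
    """Hypothetical helper: list images stored under path/split/category/."""
    samples: List[Dict[str, object]] = []
    for image_path in sorted((path / split).glob("*/*")):
        if not image_path.is_file():
            continue
        category = image_path.parent.name  # e.g. "ok"
        # Assumption: the mask shares the image stem and is stored as PNG.
        mask_path = path / "ground_truth" / category / f"{image_path.stem}.png"
        samples.append(
            {
                "split": split,
                "category": category,
                "image_path": str(image_path),
                "mask_path": str(mask_path),
                # Assumption: "ok" is normal (0), any other folder is anomalous (1).
                "label_index": 0 if category == "ok" else 1,
            }
        )
    return samples


# Build a tiny throwaway tree that mimics the expected layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "01"
    for rel in ("train/ok/000.bmp", "train/ok/001.bmp", "test/ko/000.bmp"):
        file_path = root / rel
        file_path.parent.mkdir(parents=True, exist_ok=True)
        file_path.touch()
    train_samples = collect_samples(root, split="train")
    test_samples = collect_samples(root, split="test")

print(len(train_samples), len(test_samples))  # 2 1
```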

class anomalib.data.btech.BTechDataset(root: Union[pathlib.Path, str], category: str, pre_process: anomalib.pre_processing.PreProcessor, split: str, task: str = 'segmentation', seed: int = 0, create_validation_set: bool = False)

Bases: torchvision.datasets.folder.VisionDataset

BTech PyTorch Dataset.

__len__(self) → int

Get length of the dataset.

__getitem__(self, index: int) → Dict[str, Union[str, torch.Tensor]]

Get the dataset item at the given index.

Parameters

index (int) – Index to get the item.

Returns

A dict containing the image tensor during training.

Otherwise, a dict containing the image path, target path, image tensor, label and transformed bounding box.

Return type

Union[Dict[str, Tensor], Dict[str, Union[str, Tensor]]]
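
To make the split-dependent return value concrete, here is a small stand-in class that follows the same map-style contract (__len__ plus __getitem__). ToyAnomalyDataset, its dictionary keys and the nested-list image placeholder are all hypothetical; the real class returns tensors and may use different keys.

```python
from typing import Dict, List


class ToyAnomalyDataset:
    """Hypothetical stand-in that mimics the split-dependent item layout."""

    def __init__(self, image_paths: List[str], split: str) -> None:
        self.image_paths = image_paths
        self.split = split

    def __len__(self) -> int:
        return len(self.image_paths)

    def __getitem__(self, index: int) -> Dict[str, object]:
        image_path = self.image_paths[index]
        image = [[0.0]]  # placeholder for the pre-processed image tensor
        if self.split == "train":
            # Training items only carry the image itself.
            return {"image": image}
        # Other splits also carry the path and label (illustrative keys).
        return {"image_path": image_path, "image": image, "label": 0}


paths = ["BTech/01/test/ok/000.bmp", "BTech/01/test/ok/001.bmp"]
train_ds = ToyAnomalyDataset(paths, split="train")
test_ds = ToyAnomalyDataset(paths, split="test")
print(sorted(train_ds[0]))  # ['image']
print(sorted(test_ds[0]))  # ['image', 'image_path', 'label']
```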

class anomalib.data.btech.BTech(root: str, category: str, image_size: Optional[Union[int, Tuple[int, int]]] = None, train_batch_size: int = 32, test_batch_size: int = 32, num_workers: int = 8, task: str = 'segmentation', transform_config_train: Optional[Union[str, albumentations.Compose]] = None, transform_config_val: Optional[Union[str, albumentations.Compose]] = None, seed: int = 0, create_validation_set: bool = False)

Bases: pytorch_lightning.core.datamodule.LightningDataModule

BTechDataModule Lightning Data Module.

prepare_data(self) → None

Download the dataset if not available.

setup(self, stage: Optional[str] = None) → None

Setup train, validation and test data.

The BTech dataset follows its own directory structure, which is why the anomalib.data.btech.BTech class is used to get the dataset items.

Parameters

stage (Optional[str]) – Train/Val/Test stage. Defaults to None.

train_dataloader(self) → pytorch_lightning.utilities.types.TRAIN_DATALOADERS

Get train dataloader.

val_dataloader(self) → pytorch_lightning.utilities.types.EVAL_DATALOADERS

Get validation dataloader.

test_dataloader(self) → pytorch_lightning.utilities.types.EVAL_DATALOADERS

Get test dataloader.

predict_dataloader(self) → pytorch_lightning.utilities.types.EVAL_DATALOADERS

Get predict dataloader.
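
The methods above are typically called in the order prepare_data, setup, then one of the dataloader getters. The sketch below wires them together using only the constructor arguments listed in the BTech signature. The wrapper first_btech_train_batch is a hypothetical helper, image_size=256 is an illustrative value, and the import is deferred so the snippet can be loaded without anomalib installed. It has not been run against a downloaded dataset, so treat it as a starting point rather than a verified recipe.

```python
from typing import Any


def first_btech_train_batch(root: str, category: str = "01") -> Any:
    """Hypothetical helper: return the first training batch of one category."""
    # Imported lazily so this module can be loaded without anomalib installed.
    from anomalib.data.btech import BTech

    datamodule = BTech(
        root=root,
        category=category,
        image_size=256,  # illustrative value, not a documented default
        train_batch_size=32,
        test_batch_size=32,
        num_workers=8,
        task="segmentation",
        seed=0,
        create_validation_set=False,
    )
    datamodule.prepare_data()  # downloads the dataset if it is not available
    datamodule.setup()  # sets up the train, validation and test data
    return next(iter(datamodule.train_dataloader()))
```

A call such as first_btech_train_batch("./BTech") would trigger the download on first use, so it is best run where network access and disk space are available.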