anomalib.data.btech
BTech Dataset.
This script contains the PyTorch Lightning DataModule for the BTech dataset.
If the dataset is not on the file system, the script downloads and extracts the dataset and creates PyTorch data objects.
Module Contents
Classes
BTechDataset | BTech PyTorch Dataset.
BTech | BTechDataModule Lightning Data Module.
Functions
make_btech_dataset | Create BTech samples by parsing the BTech data file structure.
Attributes
- anomalib.data.btech.make_btech_dataset(path: pathlib.Path, split: Optional[str] = None, split_ratio: float = 0.1, seed: Optional[int] = None, create_validation_set: bool = False) → pandas.core.frame.DataFrame
Create BTech samples by parsing the BTech data file structure.
- The files are expected to follow the structure:
path/to/dataset/split/category/image_filename.png
path/to/dataset/ground_truth/category/mask_filename.png
- Parameters
path (Path) – Path to dataset
split (str, optional) – Dataset split (i.e., either train or test). Defaults to None.
split_ratio (float, optional) – Fraction of normal training images to move to the test set when the test set doesn’t contain any normal images. Defaults to 0.1.
seed (int, optional) – Random seed to ensure reproducibility when splitting. Defaults to None.
create_validation_set (bool, optional) – Whether to create a validation set from the test set. The BTech dataset does not contain a validation set; set this flag to True to create one. Defaults to False.
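The split_ratio and seed parameters describe a seeded hold-out of normal training images. A minimal sketch of that idea follows; split_normal_images is a hypothetical helper written for illustration and is not anomalib's implementation, which may choose the held-out images differently.

```python
import random


def split_normal_images(train_paths, split_ratio=0.1, seed=None):
    """Hold out a fraction of normal training images (hypothetical helper)."""
    rng = random.Random(seed)  # local RNG, so the global random state is untouched
    shuffled = list(train_paths)
    rng.shuffle(shuffled)
    n_moved = int(len(shuffled) * split_ratio)
    moved, kept = shuffled[:n_moved], shuffled[n_moved:]
    return kept, moved


# Illustrative file names only; they do not refer to real dataset files.
paths = [f"train/ok/{i:03d}.bmp" for i in range(20)]
kept, moved = split_normal_images(paths, split_ratio=0.1, seed=0)
```

With the same seed the held-out images are the same on every call, which is the reproducibility the seed parameter refers to.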
Example
The following example shows how to get training samples from BTech 01 category:
>>> from pathlib import Path
>>> from anomalib.data.btech import make_btech_dataset
>>> root = Path('./BTech')
>>> category = '01'
>>> path = root / category
>>> path
PosixPath('BTech/01')

>>> samples = make_btech_dataset(path, split='train', split_ratio=0.1, seed=0)
>>> samples.head()
   path     split label image_path                mask_path                        label_index
0  BTech/01 train 01    BTech/01/train/ok/105.bmp BTech/01/ground_truth/ok/105.png 0
1  BTech/01 train 01    BTech/01/train/ok/017.bmp BTech/01/ground_truth/ok/017.png 0
...
- Returns
An output DataFrame containing the samples for the requested split (i.e., train or test).
- Return type
DataFrame
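To make the expected file structure concrete, here is a rough sketch that walks a split/category/filename tree and builds one record per image. It uses plain dicts in place of DataFrame rows and builds a throwaway directory so it runs without the real dataset. The collect_samples helper, the "ko" folder name for anomalous images, and the rule that "ok" maps to label_index 0 are all assumptions made for illustration, not anomalib's code.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".bmp", ".png"}


def collect_samples(path, split=None):
    """Parse path/split/category/filename into a list of sample records."""
    path = Path(path)
    samples = []
    for image_path in sorted(path.glob("*/*/*")):
        if not image_path.is_file() or image_path.suffix not in IMAGE_SUFFIXES:
            continue
        split_name, label = image_path.parts[-3], image_path.parts[-2]
        if split_name == "ground_truth":
            continue  # masks are attached to their image below
        if split is not None and split_name != split:
            continue
        mask_path = path / "ground_truth" / label / (image_path.stem + ".png")
        samples.append({
            "split": split_name,
            "label": label,
            "image_path": str(image_path),
            "mask_path": str(mask_path),
            "label_index": 0 if label == "ok" else 1,  # assumption: "ok" is normal
        })
    return samples


# Build a tiny fake tree so the sketch runs without the real dataset.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "BTech" / "01"
    for rel in ["train/ok/105.bmp", "test/ok/001.bmp",
                "test/ko/002.bmp", "ground_truth/ko/002.png"]:
        file_path = root / rel
        file_path.parent.mkdir(parents=True, exist_ok=True)
        file_path.touch()
    train_samples = collect_samples(root, split="train")
    all_samples = collect_samples(root)
```

A list of records like this can be passed directly to pandas.DataFrame to obtain a table shaped like the one in the example above.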
- class anomalib.data.btech.BTechDataset(root: Union[pathlib.Path, str], category: str, pre_process: anomalib.pre_processing.PreProcessor, split: str, task: str = 'segmentation', seed: Optional[int] = None, create_validation_set: bool = False)
Bases: torchvision.datasets.folder.VisionDataset
BTech PyTorch Dataset.
- __getitem__(index: int) → Dict[str, Union[str, torch.Tensor]]
Get the dataset item at the given index.
- Parameters
index (int) – Index to get the item.
- Returns
- Dict of image tensor during training.
Otherwise, Dict containing image path, target path, image tensor, label and transformed bounding box.
- Return type
Union[Dict[str, Tensor], Dict[str, Union[str, Tensor]]]
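The split-dependent return value described above can be sketched with a toy dataset. ToyAnomalyDataset, its _load stand-in, and the dictionary keys are hypothetical names chosen for illustration; plain lists stand in for tensors, and a mask is assumed as the target because the default task is 'segmentation'.

```python
class ToyAnomalyDataset:
    """Toy dataset whose items depend on the split (hypothetical sketch)."""

    def __init__(self, samples, split):
        self.samples = samples
        self.split = split

    def __len__(self):
        return len(self.samples)

    def _load(self, path):
        # Stand-in for reading a file and applying the pre-processing transforms.
        return [0.0, 0.0, 0.0]

    def __getitem__(self, index):
        sample = self.samples[index]
        image = self._load(sample["image_path"])
        if self.split == "train":
            # Training only needs the image itself.
            return {"image": image}
        # Other splits also carry what is needed for evaluation.
        return {
            "image_path": sample["image_path"],
            "mask_path": sample["mask_path"],
            "image": image,
            "label": sample["label_index"],
            "mask": self._load(sample["mask_path"]),
        }


records = [{"image_path": "test/ko/002.bmp",
            "mask_path": "ground_truth/ko/002.png",
            "label_index": 1}]
train_item = ToyAnomalyDataset(records, split="train")[0]
test_item = ToyAnomalyDataset(records, split="test")[0]
```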
- class anomalib.data.btech.BTech(root: str, category: str, image_size: Optional[Union[int, Tuple[int, int]]] = None, train_batch_size: int = 32, test_batch_size: int = 32, num_workers: int = 8, task: str = 'segmentation', transform_config_train: Optional[Union[str, albumentations.Compose]] = None, transform_config_val: Optional[Union[str, albumentations.Compose]] = None, seed: Optional[int] = None, create_validation_set: bool = False)
Bases: pytorch_lightning.core.datamodule.LightningDataModule
BTechDataModule Lightning Data Module.
- setup(stage: Optional[str] = None) → None
Set up train, validation and test data.
The BTech dataset has its own directory structure, which is why the anomalib.data.btech.BTech class is used to get the dataset items.
- Parameters
stage (str, optional) – Train/Val/Test stage. Defaults to None.
- train_dataloader() → pytorch_lightning.utilities.types.TRAIN_DATALOADERS
Get train dataloader.
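A possible way to drive the data module by hand, written against the constructor signature documented above. The argument values are illustrative assumptions, first_training_batch is a hypothetical helper, and behaviour may differ in other anomalib versions. prepare_data is the standard LightningDataModule hook for download and extraction steps.

```python
# Keyword arguments follow the BTech signature documented on this page;
# the values are illustrative assumptions.
DATAMODULE_KWARGS = {
    "root": "./BTech",
    "category": "01",
    "image_size": 256,
    "train_batch_size": 32,
    "test_batch_size": 32,
    "num_workers": 8,
    "task": "segmentation",
    "seed": 0,
    "create_validation_set": False,
}


def first_training_batch(kwargs=DATAMODULE_KWARGS):
    """Build the data module and fetch one training batch (hypothetical helper)."""
    # Imported lazily so this sketch can be loaded without anomalib installed.
    from anomalib.data.btech import BTech

    datamodule = BTech(**kwargs)
    datamodule.prepare_data()  # standard LightningDataModule hook
    datamodule.setup()         # builds the train/validation/test datasets
    return next(iter(datamodule.train_dataloader()))
```

Calling first_training_batch() requires anomalib to be installed and the dataset to be available or downloadable. When the data module is handed to a Lightning Trainer instead, the trainer calls these hooks itself.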