:py:mod:`anomalib.data.mvtec`
=============================

.. py:module:: anomalib.data.mvtec

.. autoapi-nested-parse::

   MVTec AD Dataset (CC BY-NC-SA 4.0).

   Description:
       This script contains the PyTorch Dataset, DataLoader and PyTorch
       Lightning DataModule for the MVTec AD dataset.

       If the dataset is not on the file system, the script downloads and
       extracts the dataset and creates the PyTorch data objects.

   License:
       The MVTec AD dataset is released under the Creative Commons
       Attribution-NonCommercial-ShareAlike 4.0 International License
       (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/.

   Reference:
       - Paul Bergmann, Kilian Batzner, Michael Fauser, David Sattlegger, Carsten Steger:
         The MVTec Anomaly Detection Dataset: A Comprehensive Real-World Dataset for
         Unsupervised Anomaly Detection; in: International Journal of Computer Vision
         129(4):1038-1059, 2021, DOI: 10.1007/s11263-020-01400-4.

       - Paul Bergmann, Michael Fauser, David Sattlegger, Carsten Steger: MVTec AD —
         A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection;
         in: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR),
         9584-9592, 2019, DOI: 10.1109/CVPR.2019.00982.


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   anomalib.data.mvtec.MVTecDataset
   anomalib.data.mvtec.MVTec


Functions
~~~~~~~~~

.. autoapisummary::

   anomalib.data.mvtec.make_mvtec_dataset


Attributes
~~~~~~~~~~

.. autoapisummary::

   anomalib.data.mvtec.logger


.. py:data:: logger
   

.. py:function:: make_mvtec_dataset(path: pathlib.Path, split: Optional[str] = None, split_ratio: float = 0.1, seed: int = 0, create_validation_set: bool = False) -> pandas.core.frame.DataFrame

   Create MVTec AD samples by parsing the MVTec AD data file structure.

   The files are expected to follow the structure::

       path/to/dataset/split/category/image_filename.png
       path/to/dataset/ground_truth/category/mask_filename.png

   This function creates a dataframe that stores the parsed information in the following format:

   +---+---------------+-------+---------+---------------+---------------------------------------+-------------+
   |   | path          | split | label   | image_path    | mask_path                             | label_index |
   +===+===============+=======+=========+===============+=======================================+=============+
   | 0 | datasets/name | test  | defect  | filename.png  | ground_truth/defect/filename_mask.png | 1           |
   +---+---------------+-------+---------+---------------+---------------------------------------+-------------+

   :param path: Path to dataset
   :type path: Path
   :param split: Dataset split (i.e., either train or test). Defaults to None.
   :type split: str, optional
   :param split_ratio: Fraction of normal training images to move to the
                       test set when the test set contains no normal images.
                       Defaults to 0.1.
   :type split_ratio: float, optional
   :param seed: Random seed to ensure reproducibility when splitting. Defaults to 0.
   :type seed: int, optional
   :param create_validation_set: Whether to create a validation set from the test set.
                                 The MVTec AD dataset does not include a validation set;
                                 set this flag to ``True`` to create one. Defaults to ``False``.
   :type create_validation_set: bool, optional

   .. rubric:: Example

   The following example shows how to get training samples from MVTec AD bottle category:

   >>> from pathlib import Path
   >>> root = Path('./MVTec')
   >>> category = 'bottle'
   >>> path = root / category
   >>> path
   PosixPath('MVTec/bottle')

   >>> samples = make_mvtec_dataset(path, split='train', split_ratio=0.1, seed=0)
   >>> samples.head()
      path         split label image_path                           mask_path                   label_index
   0  MVTec/bottle train good MVTec/bottle/train/good/105.png MVTec/bottle/ground_truth/good/105_mask.png 0
   1  MVTec/bottle train good MVTec/bottle/train/good/017.png MVTec/bottle/ground_truth/good/017_mask.png 0
   2  MVTec/bottle train good MVTec/bottle/train/good/137.png MVTec/bottle/ground_truth/good/137_mask.png 0
   3  MVTec/bottle train good MVTec/bottle/train/good/152.png MVTec/bottle/ground_truth/good/152_mask.png 0
   4  MVTec/bottle train good MVTec/bottle/train/good/109.png MVTec/bottle/ground_truth/good/109_mask.png 0

   :returns: A dataframe containing the samples for the requested split (i.e., train or test).
   :rtype: DataFrame
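
The parsing step described for ``make_mvtec_dataset`` can be sketched with the standard library alone. The helper ``parse_mvtec_layout`` below is a hypothetical name and not part of anomalib. It assumes that ``good`` is the only normal label and that masks are named ``<stem>_mask.png``, as the table and example output above suggest. It returns plain dicts rather than a DataFrame and leaves out the ``split_ratio`` and ``create_validation_set`` handling.

```python
import tempfile
from pathlib import Path
from typing import Dict, List, Optional


def parse_mvtec_layout(path: Path, split: Optional[str] = None) -> List[Dict[str, object]]:
    """Collect one row per image stored under ``path/<split>/<label>/``."""
    rows: List[Dict[str, object]] = []
    for image_path in sorted(path.glob("*/*/*.png")):
        split_name = image_path.parent.parent.name
        label = image_path.parent.name
        if split_name == "ground_truth":
            continue  # masks are referenced from image rows, not listed as samples
        if split is not None and split_name != split:
            continue
        # Assumption: "good" is the only normal label; everything else is a defect.
        is_defect = label != "good"
        # The documented output lists a mask path on every row, so do the same here
        # without checking that the file exists.
        mask_path = path / "ground_truth" / label / f"{image_path.stem}_mask.png"
        rows.append(
            {
                "path": str(path),
                "split": split_name,
                "label": label,
                "image_path": str(image_path),
                "mask_path": str(mask_path),
                "label_index": int(is_defect),
            }
        )
    return rows


# Build a tiny throwaway tree that mimics one category and parse its test split.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "bottle"
    for rel in (
        "train/good/000.png",
        "test/good/000.png",
        "test/broken_large/000.png",
        "ground_truth/broken_large/000_mask.png",
    ):
        (root / rel).parent.mkdir(parents=True, exist_ok=True)
        (root / rel).touch()
    demo_rows = parse_mvtec_layout(root, split="test")

for row in demo_rows:
    print(row["split"], row["label"], row["label_index"])
```

Passing the resulting list to ``pandas.DataFrame`` would give a table with the same columns as the one documented above.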


.. py:class:: MVTecDataset(root: Union[pathlib.Path, str], category: str, pre_process: anomalib.pre_processing.PreProcessor, split: str, task: str = 'segmentation', seed: int = 0, create_validation_set: bool = False)

   Bases: :py:obj:`torchvision.datasets.folder.VisionDataset`

   MVTec AD PyTorch Dataset.

   .. py:method:: __len__(self) -> int

      Get length of the dataset.


   .. py:method:: __getitem__(self, index: int) -> Dict[str, Union[str, torch.Tensor]]

      Get dataset item for the index ``index``.

      :param index: Index to get the item.
      :type index: int

      :returns: Dict containing the image tensor during training. Otherwise, a dict containing
                the image path, target path, image tensor, label and transformed bounding box.
      :rtype: Union[Dict[str, Tensor], Dict[str, Union[str, Tensor]]]
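
The split-dependent return value of ``MVTecDataset.__getitem__`` can be pictured with a small stand-in that does not need PyTorch. ``SampleTable`` and its dict keys are hypothetical and chosen only for illustration. The string placeholder stands in for the pre-processed image tensor that the real dataset would return.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, int]]


class SampleTable:
    """Hypothetical stand-in: image-only items for training, richer items otherwise."""

    def __init__(self, rows: List[Row], split: str) -> None:
        self.rows = rows
        self.split = split

    def __len__(self) -> int:
        return len(self.rows)

    def __getitem__(self, index: int) -> Row:
        row = self.rows[index]
        # Placeholder for reading the image and applying the pre-processing step.
        item: Row = {"image": f"<tensor for {row['image_path']}>"}
        if self.split != "train":
            # Evaluation needs the paths and the label next to the image itself.
            item["image_path"] = row["image_path"]
            item["mask_path"] = row["mask_path"]
            item["label"] = row["label_index"]
        return item


rows: List[Row] = [{"image_path": "000.png", "mask_path": "000_mask.png", "label_index": 1}]
print(sorted(SampleTable(rows, "train")[0]))
print(sorted(SampleTable(rows, "test")[0]))
```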


.. py:class:: MVTec(root: str, category: str, image_size: Optional[Union[int, Tuple[int, int]]] = None, train_batch_size: int = 32, test_batch_size: int = 32, num_workers: int = 8, task: str = 'segmentation', transform_config_train: Optional[Union[str, albumentations.Compose]] = None, transform_config_val: Optional[Union[str, albumentations.Compose]] = None, seed: int = 0, create_validation_set: bool = False)

   Bases: :py:obj:`pytorch_lightning.core.datamodule.LightningDataModule`

   MVTec AD Lightning Data Module.

   .. py:method:: prepare_data(self) -> None

      Download the dataset if not available.


   .. py:method:: setup(self, stage: Optional[str] = None) -> None

      Set up train, validation and test data.

      :param stage: Train/Val/Test stage. Defaults to None.
      :type stage: str, optional


   .. py:method:: train_dataloader(self) -> pytorch_lightning.utilities.types.TRAIN_DATALOADERS

      Get train dataloader.


   .. py:method:: val_dataloader(self) -> pytorch_lightning.utilities.types.EVAL_DATALOADERS

      Get validation dataloader.


   .. py:method:: test_dataloader(self) -> pytorch_lightning.utilities.types.EVAL_DATALOADERS

      Get test dataloader.


   .. py:method:: predict_dataloader(self) -> pytorch_lightning.utilities.types.EVAL_DATALOADERS

      Get predict dataloader.
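
A minimal usage sketch of the ``MVTec`` data module, based only on the constructor signature and methods documented above. The wrapper ``build_mvtec_datamodule``, the root directory and the ``bottle`` category are illustrative assumptions. The anomalib import sits inside the function so that the snippet can be loaded without the package installed.

```python
from typing import Any


def build_mvtec_datamodule(root: str = "./datasets/MVTec", category: str = "bottle") -> Any:
    """Create and prepare the MVTec data module for one category."""
    # Imported lazily so this snippet can be loaded without anomalib installed.
    from anomalib.data.mvtec import MVTec

    datamodule = MVTec(
        root=root,
        category=category,
        image_size=256,  # illustrative value; the signature accepts an int or a tuple
        train_batch_size=32,
        test_batch_size=32,
        num_workers=8,
        task="segmentation",
        seed=0,
        create_validation_set=False,
    )
    datamodule.prepare_data()  # downloads the dataset if it is not available
    datamodule.setup()  # sets up the train, validation and test data
    return datamodule
```

When the data module is passed to a PyTorch Lightning ``Trainer``, the trainer calls ``prepare_data`` and ``setup`` itself, so the explicit calls are only needed when the dataloaders are used directly, for example via ``build_mvtec_datamodule().train_dataloader()``.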