
datasets

trojanvision.datasets.add_argument(parser, dataset_name=None, dataset=None, config=config, class_dict=class_dict)[source]
Add image dataset arguments to argument parser.
For specific arguments implementation, see ImageSet.add_argument().
Parameters:
  • parser (argparse.ArgumentParser) – The parser to add arguments to.

  • dataset_name (str) – The dataset name.

  • dataset (str | Dataset) – Dataset instance or dataset name (as the alias of dataset_name).

  • config (Config) – The default parameter config, which contains the default dataset name if not provided.

  • class_dict (dict[str, type[Dataset]]) – Map from dataset name to dataset class. Defaults to trojanvision.datasets.class_dict.
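
As a rough illustration of the pattern described above, the sketch below shows how a module-level helper might look up a dataset class by name and let that class register its own options on an argparse group. It does not import trojanvision: DemoImageSet, demo_class_dict, add_dataset_arguments and the two command-line flags are hypothetical stand-ins, not the library's actual classes or arguments.

```python
import argparse


class DemoImageSet:
    """Hypothetical stand-in for a dataset class (not part of trojanvision)."""

    name = "demo"

    @classmethod
    def add_argument(cls, group):
        # Each dataset class contributes its own options to the shared group.
        # These two flags are illustrative assumptions only.
        group.add_argument("--batch_size", type=int, default=None)
        group.add_argument("--normalize", action="store_true")
        return group


# Hypothetical registry mapping dataset names to dataset classes.
demo_class_dict = {"demo": DemoImageSet}


def add_dataset_arguments(parser, dataset_name=None, class_dict=demo_class_dict):
    """Create a 'dataset' argument group and delegate to the chosen class."""
    group = parser.add_argument_group("dataset")
    group.add_argument("--dataset", dest="dataset_name", default=dataset_name)
    return class_dict[dataset_name].add_argument(group)


parser = argparse.ArgumentParser()
add_dataset_arguments(parser, dataset_name="demo")
args = parser.parse_args(["--batch_size", "64", "--normalize"])
```

The point of the split is that the module-level helper only needs the registry, while each class decides which options it exposes.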

trojanvision.datasets.create(dataset_name=None, dataset=None, config=config, class_dict=class_dict, **kwargs)[source]
Create an image dataset instance.
For arguments not included in kwargs, use the default values in config.
The default value of folder_path is '{data_dir}/{data_type}/{name}'.
For dataset implementation, see ImageSet.
Parameters:
  • dataset_name (str) – The dataset name.

  • dataset (str) – The alias of dataset_name.

  • config (Config) – The default parameter config.

  • class_dict (dict[str, type[ImageSet]]) – Map from dataset name to dataset class. Defaults to trojanvision.datasets.class_dict.

  • **kwargs – Keyword arguments passed to dataset init method.

Returns:

ImageSet – Image dataset instance.
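
The lookup and default-filling behaviour described above can be sketched without the library itself. In the example below, DemoImageSet, demo_class_dict, demo_config and create_dataset are hypothetical names, and the config is modelled as a plain dict; only the folder_path template '{data_dir}/{data_type}/{name}' and the alias role of dataset come from the text above.

```python
from dataclasses import dataclass


@dataclass
class DemoImageSet:
    """Hypothetical stand-in for a dataset class (not part of trojanvision)."""

    name: str = "demo"
    data_type: str = "image"
    folder_path: str = ""
    batch_size: int = 32


# Hypothetical registry and config; the real Config object is richer than a dict.
demo_class_dict = {"demo": DemoImageSet}
demo_config = {"dataset_name": "demo", "data_dir": "./data", "batch_size": 128}


def create_dataset(dataset_name=None, dataset=None, config=demo_config,
                   class_dict=demo_class_dict, **kwargs):
    """Resolve the dataset class by name and fill missing arguments from config."""
    # `dataset` acts as an alias of `dataset_name`; config supplies the fallback.
    name = dataset_name or dataset or config["dataset_name"]
    cls = class_dict[name]
    # Start from config defaults, then let explicit keyword arguments win.
    params = {"batch_size": config["batch_size"]}
    params.update(kwargs)
    if not params.get("folder_path"):
        params["folder_path"] = "{data_dir}/{data_type}/{name}".format(
            data_dir=config["data_dir"], data_type=cls.data_type, name=name)
    return cls(name=name, **params)


default_ds = create_dataset(dataset="demo")
custom_ds = create_dataset("demo", batch_size=16)
```

Here default_ds takes its batch size and folder path from the config, while custom_ds overrides the batch size through kwargs.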

class trojanvision.datasets.ImageSet(norm_par=None, normalize=False, transform=None, auto_augment=False, mixup=False, mixup_alpha=0.0, cutmix=False, cutmix_alpha=0.0, cutout=False, cutout_length=None, **kwargs)[source]
The basic class representing an image dataset.

Note

This class is the underlying dataset implementation. Most users should call trojanvision.datasets.create() instead, which is more user-friendly.

Parameters:
  • norm_par (dict[str, list[float]]) – Data normalization parameters of 'mean' and 'std' (e.g., {'mean': [0.5, 0.4, 0.6], 'std': [0.2, 0.3, 0.1]}). Defaults to None.

  • normalize (bool) – Whether to use torchvision.transforms.Normalize in dataset transform. Otherwise, use it as model preprocess layer.

  • transform (str) –

    The dataset transform type.

    Defaults to None.

    Note

    See get_transform() to get more details.

  • auto_augment (bool) – Whether to use torchvision.transforms.AutoAugment. Defaults to False.

  • mixup (bool) – Whether to use trojanvision.utils.transforms.RandomMixup. Defaults to False.

  • mixup_alpha (float) – alpha passed to trojanvision.utils.transforms.RandomMixup. Defaults to 0.0.

  • cutmix (bool) – Whether to use trojanvision.utils.transforms.RandomCutmix. Defaults to False.

  • cutmix_alpha (float) – alpha passed to trojanvision.utils.transforms.RandomCutmix. Defaults to 0.0.

  • cutout (bool) – Whether to use trojanvision.utils.transforms.Cutout. Defaults to False.

  • cutout_length (int) – Cutout length. Defaults to None.

  • **kwargs – Keyword arguments passed to trojanzoo.datasets.Dataset.

Variables:
  • data_type (str) – Defaults to 'image'.

  • num_classes (int) – Defaults to 1000.

  • data_shape (list[int]) – The shape of image data [C, H, W]. Defaults to [3, 224, 224].
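
To make the norm_par format concrete, the sketch below applies the usual per-channel normalization, (value - mean) / std, to a single RGB pixel using the example values from the parameter list. This is the same arithmetic that torchvision.transforms.Normalize performs per channel; the helper names normalize_pixel and denormalize_pixel are hypothetical and use plain Python lists rather than tensors.

```python
# Example values taken from the norm_par description above.
norm_par = {"mean": [0.5, 0.4, 0.6], "std": [0.2, 0.3, 0.1]}


def normalize_pixel(pixel, norm_par):
    """Apply (value - mean) / std channel by channel to one RGB pixel."""
    return [(v - m) / s
            for v, m, s in zip(pixel, norm_par["mean"], norm_par["std"])]


def denormalize_pixel(pixel, norm_par):
    """Invert the normalization: value * std + mean."""
    return [v * s + m
            for v, m, s in zip(pixel, norm_par["mean"], norm_par["std"])]


original = [0.5, 0.7, 0.6]
normalized = normalize_pixel(original, norm_par)
restored = denormalize_pixel(normalized, norm_par)
```

A channel equal to its mean maps to 0, and a channel one standard deviation above its mean maps to 1.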

classmethod add_argument(group)[source]

Add image dataset arguments to argument parser group. View source to see specific arguments.

Note

This method is the underlying implementation for adding arguments. Concrete dataset classes may override it to add more arguments. Most users should call trojanvision.datasets.add_argument() instead, which is more user-friendly.

static get_data(data, **kwargs)[source]

Process image data. By default, moves input and label to env['device'] with non_blocking and converts label to torch.LongTensor.

Parameters:
Returns:

(tuple[torch.Tensor, torch.Tensor]) – Tuple of batched input and label on env['device']. Label is transformed to torch.LongTensor.

get_transform(mode, normalize=None)[source]

Get dataset transform based on self.transform.

Parameters:
Returns:

torchvision.transforms.Compose – The transform sequence.

make_folder(img_type='.png', **kwargs)[source]

Save the dataset to self.folder_path as trojanvision.datasets.ImageFolder format.

'{self.folder_path}/{self.name}/{mode}/{class_name}/{img_idx}.png'

Parameters:

img_type (str) – The image file extension to save with. Defaults to '.png'.
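
The directory layout above can be reproduced with a small path helper. This is only a sketch of the naming scheme: image_path and its example arguments are hypothetical, and it assumes that img_type replaces the '.png' suffix shown in the template.

```python
from pathlib import PurePosixPath


def image_path(folder_path, name, mode, class_name, img_idx, img_type=".png"):
    """Build '{folder_path}/{name}/{mode}/{class_name}/{img_idx}{img_type}'."""
    return (PurePosixPath(folder_path) / name / mode / class_name
            / f"{img_idx}{img_type}")


# Hypothetical dataset named 'demo' stored under '/data/image/demo'.
train_path = image_path("/data/image/demo", "demo", "train", "cat", 0)
valid_path = image_path("/data/image/demo", "demo", "valid", "dog", 7, img_type=".jpg")
```

One directory per class under each mode is the layout that folder-based image loaders generally expect.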

class trojanvision.datasets.ImageFolder(data_format='folder', memory=False, **kwargs)[source]

Image folder class which inherits trojanvision.datasets.ImageSet.

Variables:
  • url (dict[str, str]) – Links to data files.

  • ext (Param[str, str]) – Map from mode to downloaded file extension.

  • md5 (dict[str, str]) – Map from mode to downloaded file md5.

  • org_folder_name (dict[str, str]) – Map from mode to extracted folder name of downloaded file.

  • data_format (str) –

    File format of dataset.

    • 'folder' (default)

    • 'tar'

    • 'zip'

  • memory (bool) – Whether to load the entire dataset into memory at initialization. Defaults to False.

classmethod add_argument(group)[source]

Add image dataset arguments to argument parser group. View source to see specific arguments.

Note

This method is the underlying implementation for adding arguments. Concrete dataset classes may override it to add more arguments. Most users should call trojanvision.datasets.add_argument() instead, which is more user-friendly.

initialize(*args, **kwargs)[source]

Use this method to convert the dataset between different data_format values.
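
As an illustration of what converting between data_format values can involve, the sketch below packs a class-per-directory image folder into a 'zip' or 'tar' archive using only the standard library. It is not the library's implementation of initialize(); folder_to_archive and the temporary folder contents are hypothetical.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def folder_to_archive(folder, data_format):
    """Pack `folder` into '<folder>.zip' or '<folder>.tar' and return its path."""
    if data_format not in ("zip", "tar"):
        raise ValueError(f"unsupported data_format: {data_format!r}")
    folder = Path(folder)
    # base_dir keeps the top-level folder name inside the archive.
    archive = shutil.make_archive(str(folder), data_format,
                                  root_dir=folder.parent, base_dir=folder.name)
    return Path(archive)


with tempfile.TemporaryDirectory() as tmp:
    # Build a tiny 'train/cat/0.png' folder with placeholder bytes.
    class_dir = Path(tmp) / "train" / "cat"
    class_dir.mkdir(parents=True)
    (class_dir / "0.png").write_bytes(b"placeholder")

    archive = folder_to_archive(Path(tmp) / "train", "zip")
    archive_name = archive.name
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()
```

Keeping the relative paths inside the archive identical to the folder layout means the class names can still be recovered from the member paths.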

sample(child_name=None, class_dict=None, sample_num=None, method='folder')[source]

Sample a subset image folder dataset.

Parameters:
  • child_name (str) – Name of child subset. Defaults to '{self.name}_sample{sample_num}'

  • class_dict (dict[str, list[str]] | None) – Map from new class name to list of old class names. If None, randomly sample sample_num classes, each mapped one-to-one to itself. Defaults to None.

  • sample_num (int | None) – The number of subset classes to sample if class_dict is None. Defaults to None.

  • method (str) – data_format of new subset to save. Defaults to 'folder'.
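
The two ways of specifying a subset can be sketched as follows. The helpers build_class_dict and remap_labels, the seed argument and the class names are hypothetical; the sketch only mirrors the documented semantics of class_dict (new class name to list of old class names) and sample_num (random one-to-one selection when class_dict is None).

```python
import random


def build_class_dict(old_classes, class_dict=None, sample_num=None, seed=None):
    """Return a map from new class name to the list of old class names."""
    if class_dict is not None:
        return class_dict
    if sample_num is None:
        raise ValueError("sample_num is required when class_dict is None")
    # Random one-to-one subset: each sampled class keeps its own name.
    rng = random.Random(seed)
    chosen = rng.sample(sorted(old_classes), sample_num)
    return {name: [name] for name in chosen}


def remap_labels(class_dict):
    """Invert new -> [old, ...] into an old -> new lookup table."""
    return {old: new for new, olds in class_dict.items() for old in olds}


old_classes = ["car", "cat", "dog", "truck"]
merged = build_class_dict(
    old_classes, class_dict={"animal": ["cat", "dog"], "vehicle": ["car"]})
lookup = remap_labels(merged)
sampled = build_class_dict(old_classes, sample_num=2, seed=0)
```

With an explicit class_dict several old classes can be merged into one new class, and any old class that is not listed (here 'truck') is left out of the subset.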
