datasets
- trojanvision.datasets.add_argument(parser, dataset_name=None, dataset=None, config=config, class_dict=class_dict)
 - Add image dataset arguments to the argument parser. For the specific arguments, see
ImageSet.add_argument().
- Parameters:
 parser (argparse.ArgumentParser) – The parser to add arguments to.
dataset_name (str) – The dataset name.
dataset (str | Dataset) – Dataset instance or dataset name (an alias of dataset_name).
config (Config) – The default parameter config, which supplies the default dataset name if none is provided.
class_dict (dict[str, type[Dataset]]) – Map from dataset name to dataset class. Defaults to
trojanvision.datasets.class_dict.
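The lookup-and-delegate pattern described above can be sketched with the standard library alone. This is a minimal illustration, not the trojanvision implementation: ToyDataset, toy_class_dict, add_dataset_argument, default_name, and the --batch_size option are all hypothetical names introduced here.

```python
import argparse


class ToyDataset:
    """Hypothetical stand-in for a dataset class (not the trojanvision API)."""
    name = 'toy'

    @classmethod
    def add_argument(cls, group):
        # Each class registers its own options on the argument group.
        group.add_argument('--batch_size', type=int, default=64)
        return group


# Hypothetical registry, playing the role of class_dict.
toy_class_dict = {'toy': ToyDataset}


def add_dataset_argument(parser, dataset_name=None, default_name='toy',
                         class_dict=toy_class_dict):
    # Fall back to the configured default name when none is given.
    dataset_name = dataset_name if dataset_name is not None else default_name
    group = parser.add_argument_group('dataset', description=dataset_name)
    # Delegate to the class that the name resolves to.
    class_dict[dataset_name].add_argument(group)
    return group


parser = argparse.ArgumentParser()
add_dataset_argument(parser, dataset_name='toy')
args = parser.parse_args(['--batch_size', '32'])
```

Keeping the name-to-class map as a parameter means a caller can swap in a custom registry without touching the parser code.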
- trojanvision.datasets.create(dataset_name=None, dataset=None, config=config, class_dict=class_dict, **kwargs)
 - Create an image dataset instance. For arguments not included in
kwargs, use the default values in config. The default value of folder_path is '{data_dir}/{data_type}/{name}'. For the dataset implementation, see ImageSet.
- Parameters:
 dataset_name (str) – The dataset name.
dataset (str) – The alias of dataset_name.
config (Config) – The default parameter config.
class_dict (dict[str, type[ImageSet]]) – Map from dataset name to dataset class. Defaults to
trojanvision.datasets.class_dict.
**kwargs – Keyword arguments passed to the dataset init method.
- Returns:
 ImageSet – Image dataset instance.
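As a rough illustration of how the default folder_path template resolves, the sketch below fills it with str.format. The helper resolve_folder_path and the example values for data_dir, data_type, and name are assumptions made for this example; the real values come from config and the dataset class.

```python
# Hypothetical values; in practice they come from config and the dataset class.
defaults = {'data_dir': './data', 'data_type': 'image', 'name': 'cifar10'}


def resolve_folder_path(folder_path=None, **values):
    # An explicit folder_path wins; otherwise fill the documented template.
    if folder_path is not None:
        return folder_path
    return '{data_dir}/{data_type}/{name}'.format(**values)


default_path = resolve_folder_path(**defaults)
custom_path = resolve_folder_path(folder_path='/tmp/my_data', **defaults)
```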
- class trojanvision.datasets.ImageSet(norm_par=None, normalize=False, transform=None, auto_augment=False, mixup=False, mixup_alpha=0.0, cutmix=False, cutmix_alpha=0.0, cutout=False, cutout_length=None, **kwargs)
 - The basic class representing an image dataset. It inherits
trojanzoo.datasets.Dataset.
Note
This is the dataset implementation. For users, please use
create() instead, which is more user-friendly.
- Parameters:
 norm_par (dict[str, list[float]]) – Data normalization parameters of
'mean' and 'std' (e.g., {'mean': [0.5, 0.4, 0.6], 'std': [0.2, 0.3, 0.1]}). Defaults to None.
normalize (bool) – Whether to use
torchvision.transforms.Normalize in dataset transform. Otherwise, use it as model preprocess layer.
transform (str) –
The dataset transform type.
 None | 'none' (torchvision.transforms.PILToTensor and torchvision.transforms.ConvertImageDtype)
 'bit' (transform used in BiT network)
 'pytorch' (PyTorch transform used in ImageNet training).
Defaults to
None.
Note
See
get_transform() to get more details.
auto_augment (bool) – Whether to use
torchvision.transforms.AutoAugment. Defaults to False.
mixup (bool) – Whether to use
trojanvision.utils.transforms.RandomMixup. Defaults to False.
mixup_alpha (float) –
alpha passed to trojanvision.utils.transforms.RandomMixup. Defaults to 0.0.
cutmix (bool) – Whether to use
trojanvision.utils.transforms.RandomCutmix. Defaults to False.
cutmix_alpha (float) –
alpha passed to trojanvision.utils.transforms.RandomCutmix. Defaults to 0.0.
cutout (bool) – Whether to use
trojanvision.utils.transforms.Cutout. Defaults to False.
cutout_length (int) – Cutout length. Defaults to
None.
**kwargs – Keyword arguments passed to
trojanzoo.datasets.Dataset.
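To make the role of norm_par concrete, the following sketch applies the per-channel arithmetic that torchvision.transforms.Normalize performs, (value - mean) / std, to a single pixel in plain Python. The helper normalize_pixel and the sample pixel are illustrative assumptions, not part of the library.

```python
norm_par = {'mean': [0.5, 0.4, 0.6], 'std': [0.2, 0.3, 0.1]}


def normalize_pixel(pixel, norm_par):
    # Per channel: (value - mean) / std.
    return [(v - m) / s
            for v, m, s in zip(pixel, norm_par['mean'], norm_par['std'])]


pixel = [0.7, 0.4, 0.5]  # one RGB pixel already scaled to [0, 1]
normalized = normalize_pixel(pixel, norm_par)
```

Whether this step runs inside the dataset transform or as a model preprocess layer is what the normalize flag decides; the arithmetic is the same either way.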
- classmethod add_argument(group)
 Add image dataset arguments to argument parser group. View source to see specific arguments.
Note
This is the implementation of adding arguments. The concrete dataset class may override this method to add more arguments. For users, please use
trojanvision.datasets.add_argument() instead, which is more user-friendly.
- static get_data(data, **kwargs)
 Process image data. By default, put input and label on
env['device'] with non_blocking and convert label to torch.LongTensor.
- Parameters:
 data (tuple[torch.Tensor, torch.Tensor]) – Tuple of batched input and label.
**kwargs – Any keyword argument (unused).
- Returns:
 (tuple[torch.Tensor, torch.Tensor]) – Tuple of batched input and label on
env['device']. Label is converted to torch.LongTensor.
- get_transform(mode, normalize=None)
 Get dataset transform based on
self.transform.
 None | 'none' (torchvision.transforms.PILToTensor and torchvision.transforms.ConvertImageDtype)
 'bit' (transform used in BiT network)
 'pytorch' (PyTorch transform used in ImageNet training).
- Parameters:
 mode (str) – The dataset mode (e.g.,
'train' | 'valid').
normalize (bool | None) – Whether to use
torchvision.transforms.Normalize in dataset transform. Defaults to self.normalize.
- Returns:
 torchvision.transforms.Compose – The transform sequence.
- make_folder(img_type='.png', **kwargs)
 Save the dataset to
self.folder_path in trojanvision.datasets.ImageFolder format.
'{self.folder_path}/{self.name}/{mode}/{class_name}/{img_idx}.png'
- Parameters:
 img_type (str) – The image type to save. Defaults to
'.png'.
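The directory layout above can be sketched with pathlib. The helper image_path and the example values are hypothetical, and the sketch assumes the file extension follows img_type, which the template (hard-coded to .png) does not state explicitly.

```python
from pathlib import PurePosixPath


def image_path(folder_path, name, mode, class_name, img_idx, img_type='.png'):
    # Documented layout: dataset name, then mode, then one directory per class.
    # Assumption: the extension is taken from img_type.
    return (PurePosixPath(folder_path) / name / mode / class_name
            / f'{img_idx}{img_type}')


path = image_path('/data/image/toy', 'toy', 'train', 'cat', 0)
```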
 
- class trojanvision.datasets.ImageFolder(data_format='folder', memory=False, **kwargs)
 Image folder class which inherits
trojanvision.datasets.ImageSet.
- Variables:
 ext (Param[str, str]) – Map from mode to downloaded file extension.
md5 (dict[str, str]) – Map from mode to downloaded file md5.
org_folder_name (dict[str, str]) – Map from mode to extracted folder name of downloaded file.
data_format (str) –
File format of dataset.
 'folder' (default)
 'tar'
 'zip'
memory (bool) – Whether to put all dataset into memory at initialization. Defaults to
False.
- classmethod add_argument(group)
 Add image dataset arguments to argument parser group. View source to see specific arguments.
Note
This is the implementation of adding arguments. The concrete dataset class may override this method to add more arguments. For users, please use
trojanvision.datasets.add_argument() instead, which is more user-friendly.
- initialize(*args, **kwargs)
 Use this method to convert the dataset between different
data_format values.
- sample(child_name=None, class_dict=None, sample_num=None, method='folder')
 Sample a subset image folder dataset.
- Parameters:
 child_name (str) – Name of child subset. Defaults to
'{self.name}_sample{sample_num}'.
class_dict (dict[str, list[str]] | None) – Map from new class name to list of old class names. If
None, use sample_num to randomly sample a subset (1 to 1). Defaults to None.
sample_num (int | None) – The number of subset classes to sample if
class_dict is None. Defaults to None.
method (str) –
data_format of the new subset to save. Defaults to 'folder'.
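The two ways of choosing subset classes can be illustrated without the library. In this sketch, build_class_map, invert, the seed argument, and the class names are all hypothetical; it only shows the shape of class_dict and the 1-to-1 fallback driven by sample_num.

```python
import random


def build_class_map(old_classes, class_dict=None, sample_num=None, seed=0):
    # Returns {new_class_name: [old_class_names]}.
    if class_dict is None:
        # 1-to-1 case: pick sample_num old classes at random.
        rng = random.Random(seed)
        chosen = rng.sample(old_classes, sample_num)
        class_dict = {name: [name] for name in chosen}
    return class_dict


def invert(class_dict):
    # Look-up from each old class name to its new class name.
    return {old: new for new, olds in class_dict.items() for old in olds}


old_classes = ['cat', 'dog', 'car', 'truck', 'ship']
merged = build_class_map(
    old_classes,
    class_dict={'animal': ['cat', 'dog'], 'vehicle': ['car', 'truck']},
)
sampled = build_class_map(old_classes, sample_num=3)
old_to_new = invert(merged)
```

Old classes that appear in no list (here 'ship') are simply left out of the subset in this sketch.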