attacks

trojanvision.attacks.add_argument(parser, attack_name=None, attack=None, class_dict=class_dict)[source]
Add attack arguments to argument parser.
For specific arguments implementation, see trojanzoo.attacks.Attack.add_argument().
Parameters:
  • parser (argparse.ArgumentParser) – The parser to add arguments to.

  • attack_name (str) – The attack name.

  • attack (str | Attack) – The attack instance or attack name (as the alias of attack_name).

  • class_dict (dict[str, type[Attack]]) – Map from attack name to attack class. Defaults to trojanvision.attacks.class_dict.

Returns:

argparse._ArgumentGroup – The argument group.
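
A minimal usage sketch, modeled on the example scripts shipped with the TrojanZoo repository; the surrounding environ/datasets/models/marks calls are taken from those scripts and are assumptions here, not part of this function's contract:

    import argparse
    import trojanvision

    parser = argparse.ArgumentParser()
    trojanvision.environ.add_argument(parser)
    trojanvision.datasets.add_argument(parser)
    trojanvision.models.add_argument(parser)
    trojanvision.marks.add_argument(parser)
    trojanvision.attacks.add_argument(parser)  # adds the attack argument group
    kwargs = parser.parse_args().__dict__      # e.g. invoked as: python script.py --attack badnet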

trojanvision.attacks.create(attack_name=None, attack=None, dataset_name=None, dataset=None, model_name=None, model=None, config=config, class_dict=class_dict, **kwargs)[source]
Create an attack instance.
For arguments not included in kwargs, use the default values in config.
The default value of folder_path is '{attack_dir}/{dataset.data_type}/{dataset.name}/{model.name}/{attack.name}'.
For attack implementation, see trojanzoo.attacks.Attack.
Parameters:
  • attack_name (str) – The attack name.

  • attack (str | Attack) – The attack instance or attack name (as the alias of attack_name).

  • dataset_name (str) – The dataset name.

  • dataset (str | ImageSet) – Dataset instance or dataset name (as the alias of dataset_name).

  • model_name (str) – The model name.

  • model (str | ImageModel) – The model instance or model name (as the alias of model_name).

  • config (Config) – The default parameter config.

  • class_dict (dict[str, type[Attack]]) – Map from attack name to attack class. Defaults to trojanvision.attacks.class_dict.

  • **kwargs – Keyword arguments passed to attack init method.

Returns:

Attack – The attack instance.
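
A usage sketch continuing the parser example above. It mirrors the backdoor example script in the TrojanZoo repository; the exact call chain (environ/datasets/models/trainer/marks) is an assumption drawn from that script rather than a requirement of create() itself:

    import trojanvision

    env = trojanvision.environ.create(**kwargs)
    dataset = trojanvision.datasets.create(**kwargs)
    model = trojanvision.models.create(dataset=dataset, **kwargs)
    trainer = trojanvision.trainer.create(dataset=dataset, model=model, **kwargs)
    mark = trojanvision.marks.create(dataset=dataset, **kwargs)
    attack = trojanvision.attacks.create(dataset=dataset, model=model,
                                         mark=mark, **kwargs)
    attack.attack(**trainer)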

class trojanvision.attacks.BackdoorAttack(mark=None, source_class=None, target_class=0, poison_percent=0.01, train_mode='batch', **kwargs)[source]

Backdoor attack abstract class. It inherits trojanzoo.attacks.Attack.

Note

This class is actually equivalent to trojanvision.attacks.BadNet.

BackdoorAttack attaches a provided watermark to some training images and injects them into the training set with the target label. After retraining, the model will classify watermarked images of certain/all classes into the target class.

Parameters:
  • mark (trojanvision.marks.Watermark) – The watermark instance.

  • target_class (int) – The target class that images with watermark will be misclassified as. Defaults to 0.

  • poison_percent (float) – Percentage of poisoning inputs in the whole training set. Defaults to 0.01.

  • train_mode (str) –

    Training mode to inject backdoor. Choose from ['batch', 'dataset', 'loss']. Defaults to 'batch'.

    • 'batch': For a clean batch, randomly pick poison_num inputs, attach the trigger to them, modify their labels, and append them to the original batch.

    • 'dataset': Create a poisoned dataset and use the mixed dataset.

    • 'loss': For a clean batch, calculate the loss on the clean data and the loss on the poisoned data (the whole batch poisoned), then mix them using poison_percent as the weight.

Variables:
  • poison_ratio (float) – The ratio of poison data to clean data: poison_percent / (1 - poison_percent) (a worked example follows this variable list).

  • poison_num (float | int) –

    The number of poison data in each batch / dataset.

    • train_mode == 'batch'  : poison_ratio * batch_size

    • train_mode == 'dataset': int(poison_ratio * len(train_set))

    • train_mode == 'loss'   : N/A

  • poison_set (torch.utils.data.Dataset) – Poison dataset (no clean data) if train_mode == 'dataset'.
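
A worked example of the arithmetic above (the batch size and training-set size are illustrative numbers only):

    poison_percent = 0.01
    poison_ratio = poison_percent / (1 - poison_percent)     # ~0.0101
    # train_mode == 'batch': poison inputs appended to a batch of 100
    poison_num_batch = poison_ratio * 100                     # ~1.01 (kept as float)
    # train_mode == 'dataset': total poison samples for a 50,000-image train set
    poison_num_dataset = int(poison_ratio * 50_000)           # 505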

add_mark(x, **kwargs)[source]

Add watermark to input tensor. Defaults to trojanvision.marks.Watermark.add_mark().

get_data(data, org=False, keep_org=True, poison_label=True, **kwargs)[source]

Get data.

Parameters:
  • data (tuple[torch.Tensor, torch.Tensor]) – Tuple of input and label tensors.

  • org (bool) – Whether to return original clean data directly. Defaults to False.

  • keep_org (bool) – Whether to keep original clean data in final results. If False, the results are all infected. Defaults to True.

  • poison_label (bool) – Whether to use target class label for poison data. Defaults to True.

  • **kwargs – Any keyword argument (unused).

Returns:

(torch.Tensor, torch.Tensor) – Result tuple of input and label tensors.
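
A hypothetical training-step snippet showing how the flags interact; attack and data are placeholder names for a BackdoorAttack instance and a clean (input, label) batch:

    mix_input, mix_label = attack.get_data(data)                        # clean + poison, poison relabeled
    clean_input, clean_label = attack.get_data(data, org=True)          # untouched clean batch
    poison_input, poison_label = attack.get_data(data, keep_org=False)  # fully poisoned batch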

get_filename(mark_alpha=None, target_class=None, **kwargs)[source]

Get filenames for current attack settings.

get_neuron_jaccard(k=None, ratio=0.5)[source]

Get Jaccard Index of neuron activations for feature maps between normal inputs and poison inputs.

Find the average top-k neuron indices of the 2 kinds of feature maps, clean_idx and poison_idx, and return len(clean_idx & poison_idx) / len(clean_idx | poison_idx).

Parameters:
  • k (int) – Top-k neurons to calculate the Jaccard index. Defaults to None.

  • ratio (float) – Percentage of neurons if k is not provided. Defaults to 0.5.

Returns:

float – Jaccard Index.
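
A minimal sketch of the quantity being computed; this illustrates the formula with plain PyTorch and is not the library's internal implementation (jaccard_topk and both feature tensors are hypothetical):

    import torch

    def jaccard_topk(clean_feats: torch.Tensor, poison_feats: torch.Tensor, k: int) -> float:
        # Average each feature map over the batch, then take the indices of the top-k neurons.
        clean_idx = set(clean_feats.mean(dim=0).flatten().topk(k).indices.tolist())
        poison_idx = set(poison_feats.mean(dim=0).flatten().topk(k).indices.tolist())
        return len(clean_idx & poison_idx) / len(clean_idx | poison_idx)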

get_poison_dataset(poison_label=True, poison_num=None, seed=None)[source]

Get poison dataset (no clean data).

Parameters:
  • poison_label (bool) – Whether to use target poison label for poison data. Defaults to True.

  • poison_num (int) – Number of poison data. Defaults to round(self.poison_ratio * len(train_set))

  • seed (int) – Random seed to sample poison input indices. Defaults to env['data_seed'].

Returns:

torch.utils.data.Dataset – Poison dataset (no clean data).
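
A hypothetical snippet wrapping the returned dataset in a standard DataLoader:

    import torch

    poison_set = attack.get_poison_dataset(poison_label=True)
    poison_loader = torch.utils.data.DataLoader(poison_set, batch_size=64, shuffle=True)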

load(filename=None, **kwargs)[source]

Load attack results from previously saved files.

save(filename=None, **kwargs)[source]

Save attack results to files.
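
A hypothetical round trip; files are written to the folder_path described under create() above:

    attack.save()   # persist the trained mark and attack results
    attack.load()   # restore them in a later session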

validate_confidence(mode='valid', success_only=True)[source]

Get the confidence of self.target_class on the dataset split specified by mode.

Parameters:
  • mode (str) – Dataset mode. Defaults to 'valid'.

  • success_only (bool) – Whether to only measure confidence on attack-successful inputs. Defaults to True.

Returns:

float – Average confidence of self.target_class.

class trojanvision.attacks.CleanLabelBackdoor(*args, train_mode='dataset', **kwargs)[source]

Backdoor attack abstract class of clean label. It inherits trojanvision.attacks.BackdoorAttack.

Under the clean-label setting, only clean inputs from the target class are infected, while the distortion is negligible for humans to detect.

get_poison_dataset(poison_num=None, load_mark=True, seed=None)[source]

Get poison dataset from target class (no clean data).

Parameters:
  • poison_num (int) – Number of poison data. Defaults to self.poison_num

  • load_mark (bool) – Whether to load previously saved watermark. This should be False during attack. Defaults to True.

  • seed (int) – Random seed to sample poison input indices. Defaults to env['data_seed'].

Returns:

torch.utils.data.Dataset – Poison dataset from target class (no clean data).

class trojanvision.attacks.DynamicBackdoor(mark=None, source_class=None, target_class=0, poison_percent=0.01, train_mode='batch', **kwargs)[source]
