attacks
- trojanvision.attacks.add_argument(parser, attack_name=None, attack=None, class_dict=class_dict)[source]
- Add attack arguments to an argument parser. For implementation of specific arguments, see trojanzoo.attacks.Attack.add_argument().
- Parameters:
parser (argparse.ArgumentParser) – The parser to add arguments to.
attack_name (str) – The attack name.
attack (str | Attack) – The attack instance or attack name (as the alias of attack_name).
class_dict (dict[str, type[Attack]]) – Map from attack name to attack class. Defaults to trojanvision.attacks.class_dict.
- Returns:
argparse._ArgumentGroup – The argument group.
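A minimal usage sketch, assuming 'badnet' is a key in trojanvision.attacks.class_dict (it is the canonical attack referenced below):

    import argparse
    import trojanvision.attacks

    # Register BadNet's command-line arguments on a fresh parser.
    parser = argparse.ArgumentParser()
    group = trojanvision.attacks.add_argument(parser, attack_name='badnet')
    args = parser.parse_args()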
- trojanvision.attacks.create(attack_name=None, attack=None, dataset_name=None, dataset=None, model_name=None, model=None, config=config, class_dict=class_dict, **kwargs)[source]
- Create an attack instance. For arguments not included in kwargs, use the default values in config. The default value of folder_path is '{attack_dir}/{dataset.data_type}/{dataset.name}/{model.name}/{attack.name}'. For attack implementation, see trojanzoo.attacks.Attack.
- Parameters:
attack_name (str) – The attack name.
attack (str | Attack) – The attack instance or attack name (as the alias of attack_name).
dataset_name (str) – The dataset name.
dataset (str | ImageSet) – Dataset instance or dataset name (as the alias of dataset_name).
model_name (str) – The model name.
model (str | ImageModel) – The model instance or model name (as the alias of model_name).
config (Config) – The default parameter config.
class_dict (dict[str, type[Attack]]) – Map from attack name to attack class. Defaults to trojanvision.attacks.class_dict.
**kwargs – Keyword arguments passed to attack init method.
- Returns:
Attack – The attack instance.
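A minimal end-to-end sketch following the library's factory pattern; the dataset and model names here are illustrative choices:

    import trojanvision

    env = trojanvision.environ.create()
    dataset = trojanvision.datasets.create(dataset_name='cifar10')
    model = trojanvision.models.create(dataset=dataset, model_name='resnet18_comp')
    mark = trojanvision.marks.create(dataset=dataset)
    attack = trojanvision.attacks.create(dataset=dataset, model=model, mark=mark,
                                         attack_name='badnet')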
- class trojanvision.attacks.BackdoorAttack(mark=None, source_class=None, target_class=0, poison_percent=0.01, train_mode='batch', **kwargs)[source]
Backdoor attack abstract class. It inherits trojanzoo.attacks.Attack.
Note
This class is actually equivalent to trojanvision.attacks.BadNet.
BackdoorAttack attaches a provided watermark to some training images, assigns them the target label, and injects them into the training set. After retraining, the model will classify watermarked images from certain (or all) classes into the target class.
- Parameters:
mark (trojanvision.marks.Watermark) – The watermark instance.
target_class (int) – The target class that images with watermark will be misclassified as. Defaults to 0.
poison_percent (float) – Percentage of poisoned inputs in the whole training set. Defaults to 0.01.
train_mode (str) – Training mode to inject the backdoor. Choose from ['batch', 'dataset', 'loss']. Defaults to 'batch'.
'batch': For each clean batch, randomly pick poison_num inputs, attach the trigger to them, modify their labels, and append them to the original batch.
'dataset': Create a poisoned dataset and use the mixed dataset.
'loss': For each clean batch, calculate the loss on clean data and the loss on poisoned data (the whole batch), and mix the two losses using poison_percent as the weight.
- Variables:
poison_ratio (float) – The ratio of poison data to clean data: poison_percent / (1 - poison_percent).
poison_num (float | int) – The number of poison data in each batch / dataset:
train_mode == 'batch'  : poison_ratio * batch_size
train_mode == 'dataset': int(poison_ratio * len(train_set))
train_mode == 'loss'   : N/A
poison_set (torch.utils.data.Dataset) – Poison dataset (no clean data) if train_mode == 'dataset'.
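As a worked example of these quantities (poison_percent uses its default; batch and dataset sizes are hypothetical):

    poison_percent = 0.01
    poison_ratio = poison_percent / (1 - poison_percent)   # ~0.0101
    batch_size = 100                                        # hypothetical
    poison_num = poison_ratio * batch_size                  # ~1.01 poisoned inputs appended per batch
    train_set_len = 50000                                   # hypothetical, e.g. CIFAR-10
    dataset_poison_num = int(poison_ratio * train_set_len)  # 505 poisoned samples in 'dataset' mode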
- add_mark(x, **kwargs)[source]
Add watermark to input tensor. Defaults to trojanvision.marks.Watermark.add_mark().
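A usage sketch, assuming attack is a created BackdoorAttack instance and (x, y) is a clean batch of images and labels:

    import torch

    poison_x = attack.add_mark(x)                         # x: (N, C, H, W) image batch
    poison_y = attack.target_class * torch.ones_like(y)   # labels switched to the target class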
- get_data(data, org=False, keep_org=True, poison_label=True, **kwargs)[source]
Get data.
- Parameters:
data (tuple[torch.Tensor, torch.Tensor]) – Tuple of input and label tensors.
org (bool) – Whether to return original clean data directly. Defaults to False.
keep_org (bool) – Whether to keep original clean data in final results. If False, the results are all infected. Defaults to True.
poison_label (bool) – Whether to use target class label for poison data. Defaults to True.
**kwargs – Any keyword argument (unused).
- Returns:
(torch.Tensor, torch.Tensor) – Result tuple of input and label tensors.
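For example, a fully infected batch (keep_org=False) is what one would use to estimate attack success rate. A sketch, assuming attack, model, and a clean valid_loader already exist:

    correct = total = 0
    for data in valid_loader:
        # keep_org=False: every input is stamped; poison_label=True: labels become target_class.
        x, y = attack.get_data(data, keep_org=False, poison_label=True)
        pred = model(x).argmax(dim=1)
        correct += (pred == y).sum().item()
        total += y.numel()
    attack_success_rate = correct / total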
- get_filename(mark_alpha=None, target_class=None, **kwargs)[source]
Get filenames for current attack settings.
- get_neuron_jaccard(k=None, ratio=0.5)[source]
Get the Jaccard index of neuron activations on feature maps between clean inputs and poison inputs. Find the average top-k neuron indices of the two kinds of feature maps, clean_idx and poison_idx, and return len(clean_idx & poison_idx) / len(clean_idx | poison_idx).
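A sketch of that computation, assuming clean_acts and poison_acts are per-neuron activations already averaged over the respective inputs:

    import torch

    def neuron_jaccard(clean_acts: torch.Tensor, poison_acts: torch.Tensor, k: int) -> float:
        clean_idx = set(clean_acts.topk(k).indices.tolist())
        poison_idx = set(poison_acts.topk(k).indices.tolist())
        # Jaccard index: |intersection| / |union| of the two top-k neuron index sets.
        return len(clean_idx & poison_idx) / len(clean_idx | poison_idx)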
- get_poison_dataset(poison_label=True, poison_num=None, seed=None)[source]
Get poison dataset (no clean data).
- Parameters:
poison_label (bool) – Whether to use target class label for poison data. Defaults to True.
poison_num (int) – Number of poison data. Defaults to int(poison_ratio * len(train_set)).
seed (int) – Random seed to sample poison input indices.
- Returns:
torch.utils.data.Dataset – Poison dataset (no clean data).
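Usage sketch; the explicit poison_num and seed are hypothetical choices, and clean_train_set is an assumed clean training dataset:

    from torch.utils.data import ConcatDataset

    poison_set = attack.get_poison_dataset(poison_label=True, poison_num=500, seed=0)
    mixed_set = ConcatDataset([clean_train_set, poison_set])  # mix clean and poison data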
- class trojanvision.attacks.CleanLabelBackdoor(*args, train_mode='dataset', **kwargs)[source]
Abstract class for clean-label backdoor attacks. It inherits trojanvision.attacks.BackdoorAttack.
Under the clean-label setting, only clean inputs from the target class are infected, and the distortion is kept negligible so that it is hard for humans to detect.