defenses
- trojanvision.defenses.add_argument(parser, defense_name=None, defense=None, class_dict=class_dict)
 
- trojanvision.defenses.create(defense_name=None, defense=None, dataset_name=None, dataset=None, config=config, class_dict=class_dict, **kwargs)
 
- class trojanvision.defenses.BackdoorDefense(attack, original=False, **kwargs)
 Backdoor defense abstract class. It inherits trojanzoo.defenses.Defense.
- Parameters:
 original (bool) – Whether to load the original clean model. If False, load the attack's poisoned model by calling self.attack.load().
- Variables:
 real_mark (torch.Tensor) – Watermark that the attacker uses, with shape (C+1, H, W).
 real_mask (torch.Tensor) – Mask of the watermark, obtained by calling trojanvision.marks.Watermark.get_mask().
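To illustrate the (C+1, H, W) layout, the sketch below assumes that the first C channels hold the trigger pattern and the last channel holds the blending mask. This page does not spell out the channel order, so treat that as an assumption. Nested lists stand in for tensors, and apply_mark is a hypothetical helper, not a trojanvision function.

```python
def apply_mark(image, mark):
    """Blend a (C+1, H, W) mark onto a (C, H, W) image.

    Assumption: mark[:-1] is the pattern and mark[-1] is the mask in [0, 1],
    so each pixel becomes (1 - mask) * image + mask * pattern.
    """
    pattern, mask = mark[:-1], mark[-1]
    return [
        [
            [(1 - m) * x + m * p for x, p, m in zip(img_row, pat_row, mask_row)]
            for img_row, pat_row, mask_row in zip(img_ch, pat_ch, mask)
        ]
        for img_ch, pat_ch in zip(image, pattern)
    ]


image = [[[0.2, 0.4]]]                 # C=1, H=1, W=2
mark = [[[1.0, 1.0]], [[0.0, 1.0]]]    # pattern channel, then mask channel
stamped = apply_mark(image, mark)      # only the second pixel is overwritten
```

Where the mask is 0 the image is left untouched, and where it is 1 the pattern fully replaces the pixel.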
- class trojanvision.defenses.InputFiltering(defense_input_num=100, **kwargs)
 Backdoor defense abstract class of input filtering. It inherits trojanvision.defenses.BackdoorDefense.
 It detects whether a test input is poisoned.
 The defense tests defense_input_num clean test inputs and their corresponding poisoned versions (2 * defense_input_num in total).
- Parameters:
 defense_input_num (int) – Number of test inputs. Defaults to 100.
- Variables:
 test_set (torch.utils.data.Dataset) – Test dataset with length defense_input_num.
- get_pred_labels()
 Get predicted labels for test inputs (must be overridden by subclasses).
- Returns:
 torch.Tensor – torch.BoolTensor with shape (2 * defense_input_num).
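To make the output shape concrete, the sketch below scores a boolean prediction vector of length 2 * defense_input_num against ground truth. It assumes, which this page does not state, that the first defense_input_num entries correspond to clean inputs and the rest to their poisoned versions, and that True means "flagged as poisoned". The helper name detection_rates is hypothetical, and a plain Python list stands in for the torch.BoolTensor.

```python
def detection_rates(pred_labels, defense_input_num):
    """Return (true_positive_rate, false_positive_rate) for a filtering defense.

    Assumption (not confirmed by the docs): pred_labels lists the clean inputs
    first, then their poisoned versions; True means "flagged as poisoned".
    """
    if len(pred_labels) != 2 * defense_input_num:
        raise ValueError("expected 2 * defense_input_num predictions")
    clean_preds = pred_labels[:defense_input_num]
    poison_preds = pred_labels[defense_input_num:]
    false_positive_rate = sum(clean_preds) / defense_input_num
    true_positive_rate = sum(poison_preds) / defense_input_num
    return true_positive_rate, false_positive_rate


# Toy example with defense_input_num = 4.
preds = [False, False, True, False,   # clean inputs: one false alarm
         True, True, True, False]     # poisoned inputs: one miss
tpr, fpr = detection_rates(preds, defense_input_num=4)
```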
- class trojanvision.defenses.TrainingFiltering(defense_input_num=None, **kwargs)
 Backdoor defense abstract class of training data filtering. It inherits trojanvision.defenses.BackdoorDefense.
 Given defense_input_num training samples, it detects which of them are poisoned.
 The defense evaluates clean and poisoned training inputs.
 If defense_input_num is None, use the full training data.
 Otherwise, sample defense_input_num * poison_percent poisoned training samples and defense_input_num * (1 - poison_percent) clean training samples.
 If the dataset is not using train_mode == 'dataset', construct the poison dataset from all clean data with the watermark attached. (If defense_input_num is None as well, the defense evaluates the whole clean training set and its poisoned version.)
- Parameters:
 defense_input_num (int) – Number of training inputs to evaluate. Defaults to None (the whole training set).
- Variables:
 clean_set (torch.utils.data.Dataset) – Clean training data to evaluate.
 poison_set (torch.utils.data.Dataset) – Poison training data to evaluate.
- get_datasets()
 Get clean and poison datasets.
- Returns:
 (torch.utils.data.Dataset, torch.utils.data.Dataset) – Clean training dataset and poison training dataset.
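The sampling rule described for TrainingFiltering comes down to simple arithmetic. The sketch below is a hypothetical helper (split_counts is not part of trojanvision). Rounding to the nearest integer is an assumption, since this page does not say how fractional counts are handled.

```python
def split_counts(defense_input_num, poison_percent):
    """Split a sample budget into (poison_num, clean_num).

    Assumption: the poison count is rounded to the nearest integer and the
    clean count takes the remainder, so the two always sum to the budget.
    """
    if not 0.0 <= poison_percent <= 1.0:
        raise ValueError("poison_percent must lie in [0, 1]")
    poison_num = round(defense_input_num * poison_percent)
    clean_num = defense_input_num - poison_num
    return poison_num, clean_num


poison_num, clean_num = split_counts(defense_input_num=1000, poison_percent=0.01)
```

With a budget of 1000 samples and a 1% poison rate, this yields 10 poisoned and 990 clean samples.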
- class trojanvision.defenses.ModelInspection(defense_remask_epoch=10, defense_remask_lr=0.1, cost=1e-3, **kwargs)
 Backdoor defense abstract class of model inspection. It inherits trojanvision.defenses.BackdoorDefense.
 Given a model, it tries to search for a trigger. If a trigger exists, the model is considered poisoned.
- Parameters:
- Variables:
 cost (float) – Cost of mask norm loss.
 clean_set (torch.utils.data.Dataset) – Clean training data to evaluate.
 poison_set (torch.utils.data.Dataset) – Poison training data to evaluate.
- check_early_stop(*args, **kwargs)
 Check whether to stop early at the end of each remask epoch.
- Returns:
 bool – Whether to stop early. Defaults to False.
- get_mark_loss_list(verbose=True, **kwargs)
 Get the mark, loss and attack success rate (asr) of the recovered trigger for each class.
- Parameters:
 verbose (bool) – Whether to output the Jaccard index for each trigger. It is also passed to optimize_mark().
 **kwargs – Keyword arguments passed to optimize_mark().
- Returns:
 (torch.Tensor, list[float], list[float]) – Marks, losses and asr values, each with length num_classes.
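The Jaccard index mentioned for verbose output measures the overlap between two binary masks: the size of their intersection divided by the size of their union. A minimal sketch, assuming the recovered and real masks are first binarized with a threshold of 0.5. The threshold and the helper name jaccard_index are illustrative assumptions, and nested lists stand in for (H, W) tensors.

```python
def jaccard_index(mask_a, mask_b, threshold=0.5):
    """Intersection over union of two (H, W) masks after binarizing them."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            in_a, in_b = a > threshold, b > threshold
            intersection += in_a and in_b
            union += in_a or in_b
    # Two empty masks are treated as identical by convention here.
    return intersection / union if union else 1.0


real_mask = [[0.0, 0.0, 0.0],
             [0.0, 1.0, 1.0],
             [0.0, 1.0, 1.0]]
recovered_mask = [[0.0, 0.0, 0.0],
                  [0.0, 0.9, 0.8],
                  [0.0, 0.2, 0.7]]
score = jaccard_index(recovered_mask, real_mask)  # 3 shared pixels out of 4
```

A score near 1 means the recovered trigger covers almost the same pixels as the real one; a score near 0 means the two barely overlap.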
- load(path=None)
 Load the recovered mark from path.
- Parameters:
 path (str) – npz path of the recovered mark. Defaults to '{folder_path}/{self.get_filename()}.npz'.
- loss(_input, _label, target, trigger_output=None, **kwargs)
 Loss function to optimize the recovered trigger.
- Parameters:
 _input (torch.Tensor) – Clean input tensor with shape (N, C, H, W).
 _label (torch.Tensor) – Clean label tensor with shape (N).
 target (int) – Target class.
 trigger_output (torch.Tensor) – Model output for the input tensor with the trigger applied. Defaults to None.
- Returns:
 torch.Tensor – Scalar loss tensor.
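The cost variable is documented as the weight of a mask norm loss, which suggests an objective of the form "classification loss toward the target, plus cost times the mask norm". The framework-free sketch below illustrates that form under assumptions: cross-entropy as the classification term and the L1 norm for the mask are guesses that this page does not confirm, and remask_loss is a hypothetical name. Subclasses may define loss() differently.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy of one logit vector against an integer target class."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[target]


def remask_loss(trigger_logits, target, mask, cost=1e-3):
    """Mean cross-entropy toward `target` plus `cost` times the mask L1 norm.

    trigger_logits: per-sample logits for inputs with the candidate trigger.
    mask: (H, W) nested list with values in [0, 1].
    """
    ce = sum(cross_entropy(z, target) for z in trigger_logits) / len(trigger_logits)
    mask_norm = sum(abs(v) for row in mask for v in row)
    return ce + cost * mask_norm


logits = [[0.0, 0.0], [0.0, 0.0]]   # uniform predictions over 2 classes
mask = [[1.0, 0.0], [0.0, 1.0]]     # L1 norm = 2
loss_value = remask_loss(logits, target=1, mask=mask, cost=0.5)
```

In this form, a larger cost pushes the optimization toward smaller masks, at the price of a weaker pull toward the target class.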
- optimize_mark(label, loader=None, logger_header='', verbose=True, **kwargs)
- Parameters:
 label (int) – The class label to optimize.
 loader (collections.abc.Iterable) – Data loader used to optimize the trigger. Defaults to self.dataset.loader['train'].
 logger_header (str) – Header string of the logger. Defaults to ''.
 verbose (bool) – Whether to use the logger for output. Defaults to True.
 **kwargs – Keyword arguments passed to loss().
- Returns:
 (torch.Tensor, torch.Tensor) – Optimized mark tensor with shape (C + 1, H, W) and loss tensor.