defenses

trojanvision.defenses.add_argument(parser, defense_name=None, defense=None, class_dict=class_dict)[source]
trojanvision.defenses.create(defense_name=None, defense=None, dataset_name=None, dataset=None, config=config, class_dict=class_dict, **kwargs)[source]
class trojanvision.defenses.BackdoorDefense(attack, original=False, **kwargs)[source]

Backdoor defense abstract class. It inherits trojanzoo.defenses.Defense.

Parameters:

original (bool) – Whether to load the original clean model. If False, load the attack's poisoned model by calling self.attack.load().

Variables:
  • real_mark (torch.Tensor) – Watermark that the attacker uses with shape (C+1, H, W).

  • real_mask (torch.Tensor) – Mask of the watermark by calling trojanvision.marks.Watermark.get_mask().

get_filename(**kwargs)[source]

Get filenames for current defense settings.

class trojanvision.defenses.InputFiltering(defense_input_num=100, **kwargs)[source]

Backdoor defense abstract class of input filtering. It inherits trojanvision.defenses.BackdoorDefense.

It detects whether a test input is poisoned.

The defense tests defense_input_num clean test inputs and their corresponding poisoned versions (2 * defense_input_num inputs in total).

Parameters:

defense_input_num (int) – Number of test inputs. Defaults to 100.

Variables:

test_set (torch.utils.data.Dataset) – Test dataset with length defense_input_num.

get_pred_labels()[source]

Get predicted labels for test inputs (need overriding).

Returns:

torch.Tensor – torch.BoolTensor with shape (2 * defense_input_num).

get_test_data()[source]

Get test data.

Returns:

(torch.Tensor, torch.Tensor) – Input and label tensors with length defense_input_num.

get_true_labels()[source]

Get ground-truth labels for test inputs.

By default, returns [False] * defense_input_num + [True] * defense_input_num.

Returns:

torch.Tensor – torch.BoolTensor with shape (2 * defense_input_num).
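As a rough illustration of how the true and predicted label vectors line up, the sketch below uses plain Python lists in place of torch.BoolTensor. The helper names (true_labels, detection_rates) and the example predictions are hypothetical and are not part of trojanvision.

```python
def true_labels(defense_input_num: int) -> list[bool]:
    # Clean inputs come first (False), their poisoned versions second (True).
    return [False] * defense_input_num + [True] * defense_input_num


def detection_rates(y_true: list[bool], y_pred: list[bool]) -> tuple[float, float]:
    """Return (true positive rate, false positive rate) of a detector."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    positives = sum(y_true)
    negatives = len(y_true) - positives
    return tp / positives, fp / negatives


y_true = true_labels(3)
# Hypothetical detector output: one false alarm, one missed poisoned input.
y_pred = [False, True, False, True, True, False]
tpr, fpr = detection_rates(y_true, y_pred)
```

Because both vectors share the same ordering, any standard binary classification metric can be computed on them directly.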

class trojanvision.defenses.TrainingFiltering(defense_input_num=None, **kwargs)[source]

Backdoor defense abstract class of training data filtering. It inherits trojanvision.defenses.BackdoorDefense.

Given defense_input_num training samples, it detects which of them are poisoned.

The defense evaluates both clean and poisoned training inputs.

  • If defense_input_num is None, use full training data.

  • Else, sample defense_input_num * poison_percent poison training data and defense_input_num * (1 - poison_percent) clean training data.

If the dataset is not using train_mode == 'dataset', the poison dataset is constructed from all clean data with the watermark attached. (If defense_input_num is also None, the defense evaluates the whole clean training set and its poisoned version.)
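The sampling rule above can be sketched with simple arithmetic. The function name split_counts is hypothetical, and rounding the poison count to the nearest integer is an assumption, since the text only gives the two products.

```python
def split_counts(defense_input_num: int, poison_percent: float) -> tuple[int, int]:
    """Return (clean_num, poison_num) for a sampled evaluation set."""
    # Assumption: round the poison share, then give the remainder to clean data
    # so that the two counts always sum to defense_input_num.
    poison_num = round(defense_input_num * poison_percent)
    clean_num = defense_input_num - poison_num
    return clean_num, poison_num


clean_num, poison_num = split_counts(1000, 0.01)

# Ground-truth labels in the documented order: clean first, then poison.
labels = [False] * clean_num + [True] * poison_num
```

With defense_input_num=1000 and poison_percent=0.01, this yields 990 clean and 10 poisoned samples.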

Parameters:

defense_input_num (int) – Number of training inputs to evaluate. Defaults to None (all training set).

Variables:
get_datasets()[source]

Get clean and poison datasets.

Returns:

(torch.utils.data.Dataset, torch.utils.data.Dataset) – Clean training dataset and poison training dataset.

abstract get_pred_labels()[source]

Get predicted labels for training inputs (need overriding).

Returns:

torch.Tensor – torch.BoolTensor with shape (defense_input_num).

get_true_labels()[source]

Get ground-truth labels for training inputs.

By default, returns [False] * len(self.clean_set) + [True] * len(self.poison_set).

Returns:

torch.Tensor – torch.BoolTensor with shape (defense_input_num).

class trojanvision.defenses.ModelInspection(defense_remask_epoch=10, defense_remask_lr=0.1, cost=1e-3, **kwargs)[source]

Backdoor defense abstract class of model inspection. It inherits trojanvision.defenses.BackdoorDefense.

Given a model, it searches for a trigger. If a trigger is found, the model is considered poisoned.

Parameters:
  • defense_remask_epoch (int) – Defense watermark optimizing epochs. Defaults to 10.

  • defense_remask_lr (float) – Defense watermark optimizing learning rate. Defaults to 0.1.

  • cost (float) – Cost of mask norm loss. Defaults to 1e-3.

Variables:
check_early_stop(*args, **kwargs)[source]

Check whether to early stop at the end of each remask epoch.

Returns:

bool – Whether to early stop. Defaults to False.

get_mark_loss_list(verbose=True, **kwargs)[source]

Get the mark, loss, and ASR (attack success rate) of the recovered trigger for each class.

Parameters:
Returns:

(torch.Tensor, list[float], list[float]) – Marks, losses, and ASRs, each with length num_classes.

load(path=None)[source]

Load recovered mark from path.

Parameters:

path (str) – npz path of recovered mark. Defaults to '{folder_path}/{self.get_filename()}.npz'.

loss(_input, _label, target, trigger_output=None, **kwargs)[source]

Loss function to optimize recovered trigger.

Parameters:
  • _input (torch.Tensor) – Clean input tensor with shape (N, C, H, W).

  • _label (torch.Tensor) – Clean label tensor with shape (N).

  • target (int) – Target class.

  • trigger_output (torch.Tensor) – Output tensor of input tensor with trigger. Defaults to None.

Returns:

torch.Tensor – Scalar loss tensor.
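One plausible shape for such a loss, given that cost is described as the weight of a mask norm term, is a classification loss toward the target class plus cost times the mask norm. The sketch below is an assumption rather than the library's implementation: it uses cross entropy and an L1 norm, works on plain lists instead of tensors, and the name remask_loss is hypothetical.

```python
import math


def cross_entropy(logits: list[float], target: int) -> float:
    # Numerically stable negative log-softmax for a single sample.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def remask_loss(trigger_logits: list[list[float]], target: int,
                mask: list[list[float]], cost: float = 1e-3) -> float:
    # Mean classification loss pushing triggered inputs toward `target`.
    ce = sum(cross_entropy(row, target) for row in trigger_logits) / len(trigger_logits)
    # Assumed regularizer: L1 norm of the (H, W) mask, weighted by `cost`.
    mask_norm = sum(abs(v) for row in mask for v in row)
    return ce + cost * mask_norm


# One sample with uniform logits over two classes, and a 2x2 mask with two active pixels.
loss_value = remask_loss([[0.0, 0.0]], target=0, mask=[[1.0, 0.0], [0.0, 1.0]])
```

A larger cost favors smaller recovered masks at the expense of the classification term.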

optimize_mark(label, loader=None, logger_header='', verbose=True, **kwargs)[source]
Parameters:
  • label (int) – The class label to optimize.

  • loader (collections.abc.Iterable) – Data loader to optimize trigger. Defaults to self.dataset.loader['train'].

  • logger_header (str) – Header string of logger. Defaults to ''.

  • verbose (bool) – Whether to use logger for output. Defaults to True.

  • **kwargs – Keyword arguments passed to loss().

Returns:

(torch.Tensor, torch.Tensor) – Optimized mark tensor with shape (C + 1, H, W) and loss tensor.
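To make the (C + 1, H, W) shape concrete, the sketch below assumes the first C channels hold the trigger pattern and the last channel holds a blending mask in [0, 1]. This layout and the helper name apply_mark are assumptions for illustration, and nested lists stand in for tensors.

```python
def apply_mark(image, mark):
    """Blend a (C + 1, H, W) mark into a (C, H, W) image, both as nested lists."""
    # Assumed layout: C pattern channels followed by one mask channel.
    *pattern, mask = mark
    return [
        [
            # Per pixel: keep the image where mask is 0, use the pattern where it is 1.
            [(1 - m) * x + m * p for x, p, m in zip(x_row, p_row, m_row)]
            for x_row, p_row, m_row in zip(x_channel, p_channel, mask)
        ]
        for x_channel, p_channel in zip(image, pattern)
    ]


# A 1-channel, 1x2 image; the mask leaves the first pixel alone and overwrites the second.
image = [[[0.2, 0.4]]]
mark = [[[1.0, 1.0]], [[0.0, 1.0]]]
stamped = apply_mark(image, mark)
```

Under this assumption, optimizing the mark adjusts both what the trigger looks like and where it is applied.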
