clean_label
- class trojanvision.attacks.InvisiblePoison(generator_mode='default', noise_coeff=0.35, train_generator_epochs=800, **kwargs)
Invisible Poison Backdoor Attack proposed by Rui Ning from Old Dominion University in INFOCOM 2021.
Based on trojanvision.attacks.CleanLabelBackdoor, InvisiblePoison preprocesses the trigger with a generator (auto-encoder) to amplify its feature activation and make it invisible.
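The docstring doesn't spell out the generator architecture here, so below is a minimal sketch, assuming a plain convolutional encoder–decoder; the layer sizes, the TriggerGenerator name, and the use of noise_coeff as an output scale are illustrative assumptions, not TrojanZoo's actual implementation.

```python
import torch
import torch.nn as nn

class TriggerGenerator(nn.Module):
    """Hypothetical auto-encoder that maps a raw trigger patch to a
    low-magnitude (visually invisible) perturbation. Layer sizes are
    illustrative only."""

    def __init__(self, channels: int = 3, noise_coeff: float = 0.35):
        super().__init__()
        self.noise_coeff = noise_coeff
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1), nn.Tanh())

    def forward(self, trigger: torch.Tensor) -> torch.Tensor:
        # Tanh bounds the output; scaling by a small coefficient
        # (cf. noise_coeff=0.35) keeps the trigger imperceptible.
        return self.noise_coeff * self.decoder(self.encoder(trigger))
```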
- class trojanvision.attacks.Refool(candidate_num=100, rank_iter=16, refool_epochs=5, refool_lr=1e-3, refool_sample_percent=0.1, voc_root=None, efficient=False, **kwargs)
Reflection Backdoor Attack (Refool) proposed by Yunfei Liu from Beihang University in ECCV 2020.
It inherits trojanvision.attacks.CleanLabelBackdoor.
Note
Trigger size must be the same as image size.
Currently, mark_alpha is forced to be -1.0, which means the image and the mark are blended by taking their mean. It should be possible to set a manual mark_alpha instead.
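As a rough sketch of what that note means (assuming images and marks are float tensors in [0, 1]; the function names are hypothetical, not TrojanZoo API):

```python
import torch

def blend_mean(img: torch.Tensor, mark: torch.Tensor) -> torch.Tensor:
    # mark_alpha = -1.0: blend image and trigger mark with equal
    # weight, i.e. their pixel-wise mean
    return (img + mark) / 2

def blend_alpha(img: torch.Tensor, mark: torch.Tensor,
                alpha: float) -> torch.Tensor:
    # the manual mark_alpha extension the note suggests
    return (1 - alpha) * img + alpha * mark
```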
The attack has 3 procedures:
- Generate candidate_num reflect images from another public dataset (e.g., Pascal VOC) as trigger candidates (see the candidate-filter sketch after this list).
  - Select a reflect class (e.g., 'cat') and a background class (e.g., 'person').
  - Find all images of those 2 classes that don't contain an object of the other class.
  - For image pairs from the 2 classes, process and blend them using 'ghost effect' or 'focal blur'.
  - Calculate the difference between the blended image and the reflect image.
  - Calculate the structural similarity (SSIM) between the blended image and the background image by calling skimage.metrics.structural_similarity.
  - If the difference is large enough, the blended image is not too dark, and the SSIM lies within (0.7, 0.85), add the current reflect image to the candidates.
- Rank candidate triggers by conducting a tentative attack with multiple triggers injected together (see the ranking-loop sketch after this list).
  - (Initialization, not repeated) Assign all candidate triggers the same sampling weight.
  - Sample a certain amount (e.g., 40% in the original code) of clean data from the training set of the target class.
  - Randomly attach a candidate trigger to each clean input according to the sampling weights.
  - Use the infected data as a poison dataset to retrain a pretrained model with refool_epochs and refool_lr.
  - Evaluate the attack success rate of each used trigger as its new sampling weight.
  - Set the sampling weights of all unused triggers to the median of the used ones.
  - Reset the model to its pretrained state.
  - Repeat the ranking process rank_iter times.
- Use the trigger with the largest sampling weight for the final attack (with 'dataset' train_mode).
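A minimal sketch of the candidate-filter criteria from procedure 1, assuming float images in [0, 1] with shape (H, W, C); the difference and darkness thresholds are illustrative assumptions, and only the (0.7, 0.85) SSIM window and the skimage call come from the description above:

```python
import numpy as np
from skimage.metrics import structural_similarity

def is_valid_candidate(blend: np.ndarray, background: np.ndarray,
                       reflect: np.ndarray) -> bool:
    diff = np.abs(blend - reflect).mean()
    if diff < 0.1:            # reflection barely visible (assumed threshold)
        return False
    if blend.mean() < 0.1:    # blended image too dark (assumed threshold)
        return False
    ssim = structural_similarity(blend, background,
                                 channel_axis=-1, data_range=1.0)
    return 0.7 < ssim < 0.85  # SSIM window from the description above
```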
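And a sketch of the weight bookkeeping in the ranking loop of procedure 2; evaluate_asr is a placeholder standing in for the retrain-and-measure step (retraining with refool_epochs / refool_lr, then resetting the model to its pretrained state), so only the sampling and weight updates are shown:

```python
import numpy as np

def rank_triggers(num_candidates: int, num_samples: int,
                  evaluate_asr, rank_iter: int = 16) -> int:
    """Return the index of the best candidate trigger.

    evaluate_asr(used, assignment) is a placeholder: it should retrain
    the pretrained model on the poisoned samples, measure each used
    trigger's attack success rate, reset the model, and return
    a dict {trigger_index: ASR}.
    """
    weights = np.ones(num_candidates)             # (init, not repeated)
    for _ in range(rank_iter):
        probs = weights / weights.sum()
        # one candidate trigger per sampled clean input, drawn by weight
        assignment = np.random.choice(num_candidates, size=num_samples,
                                      p=probs)
        used = np.unique(assignment)
        asr = evaluate_asr(used, assignment)
        for i in used:
            weights[i] = asr[i]                   # new weight = its ASR
        unused = np.setdiff1d(np.arange(num_candidates), used)
        weights[unused] = np.median(weights[used])
    return int(np.argmax(weights))                # trigger for final attack
```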
Note
There are differences between our implementation and the original code. I've consulted the first author, who confirmed that the current TrojanZoo implementation should work.
- Author's code allows repeats when generating candidate reflect images. Our code has NO repeats.
- Author's code generates 160 candidate reflect images (usually not even reaching this number) but requires 200 during the attack, which causes further repeats. Our code generates candidate_num (100 by default) unique candidates.
- Author's code uses a very large refool_epochs (600), which causes too much clean accuracy drop and is very slow. Our code uses 5 as default.
- Author's code uses a very large refool_sample_percent (0.4), which causes too much clean accuracy drop. Our code uses 0.1 as default.
- There should be a pretrained model that is reset at every ranking loop. However, the paper and original code don't mention that. The author told me that they load a model pretrained on ImageNet.
- There is no attack code provided by the original author after ranking candidate reflect images.
There is also a conflict between the original author's code and paper:
- The paper claims to use top-candidate_num selection at every ranking loop in Algorithm 1. Author's code uses random sampling according to W as sampling weights. Our code follows the author's code.
- Parameters:
  - candidate_num (int) – Number of candidate reflect images. Defaults to 100.
  - rank_iter (int) – Number of iterations to update the sampling weights of candidate reflect images. Defaults to 16.
  - refool_epochs (int) – Retraining epochs during trigger ranking. Defaults to 5.
  - refool_lr (float) – Retraining learning rate during trigger ranking. Defaults to 1e-3.
  - refool_sample_percent (float) – Fraction of the target-class training set used as retraining samples during trigger ranking. Defaults to 0.1.
  - voc_root (str) – Path to Pascal VOC dataset. Defaults to '{data_dir}/image/voc'.
  - efficient (bool) – Whether to use only a subset (20%) to evaluate ASR during trigger ranking. Defaults to False.
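For orientation, a hypothetical construction call following the documented signature; the keyword values shown are the documented defaults, while everything that would go through **kwargs (dataset, model, mark, and so on) is an assumption about surrounding setup and is omitted:

```python
from trojanvision.attacks import Refool

# Sketch only: **kwargs needed in practice (dataset=..., model=...,
# mark=...) are assumptions about surrounding setup, not shown here.
attack = Refool(candidate_num=100,          # candidate reflect images
                rank_iter=16,               # weight-update iterations
                refool_epochs=5,            # retraining epochs per loop
                refool_lr=1e-3,             # retraining learning rate
                refool_sample_percent=0.1,  # target-class sample fraction
                voc_root=None,              # -> '{data_dir}/image/voc'
                efficient=False)            # full-set ASR evaluation
```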
- Variables:
  - reflect_imgs (torch.Tensor) – Candidate reflect images with shape (candidate_num, C, H, W).
  - train_mode (str) – Training mode to inject backdoor. Forced to be 'dataset'. See detailed description in trojanvision.attacks.BadNet.
  - poison_set (torch.utils.data.Dataset) – Poison dataset (no clean data). It is None at initialization because the best trigger is still unknown.
  - refool_sample_num (int) – Number of retraining samples from the target-class training set during trigger ranking: refool_sample_percent * len(target_set).
  - target_set (torch.utils.data.Dataset) – Training set of the target class.
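As a one-line illustration of the refool_sample_num formula above (the int() truncation is an assumption):

```python
# e.g. refool_sample_percent = 0.1 with 5000 target-class images
refool_sample_num = int(0.1 * 5000)   # -> 500 retraining samples
```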