input_filtering

class trojanvision.defenses.Neo(neo_asr_threshold=0.8, neo_kmeans_num=3, neo_sample_num=100, **kwargs)

Neo was proposed by Sakshi Udeshi from Singapore University of Technology and Design in 2019.

It is an input filtering backdoor defense that inherits from trojanvision.defenses.InputFiltering.

The defense procedure is:

For a test input, Neo generates variants of it, each with a random region masked by the input’s dominant color, which is computed with sklearn.cluster.KMeans.
For each variant whose classification differs from that of the original input, Neo checks whether the pixels from the masked region form a trigger by evaluating their ASR (attack success rate).
If the ASR of any variant exceeds neo_asr_threshold, the test input is regarded as poisoned.
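
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: it uses NumPy arrays instead of torch tensors, and neo_is_poisoned, classify, clean_imgs and toy_classify are hypothetical names introduced only for this example. The ASR is approximated here as the fraction of held-out clean images whose prediction changes once the candidate patch is stamped onto them, which is an assumption based on the description of get_cls_diff().

```python
import numpy as np


def neo_is_poisoned(img, classify, clean_imgs, mark_size, dominant_color,
                    asr_threshold=0.8, sample_num=100, rng=None):
    """Hypothetical sketch of the Neo procedure.

    img:            (C, H, W) array, the test input.
    classify:       callable mapping a batch (N, C, H, W) to integer labels (N,).
    clean_imgs:     (N, C, H, W) array of held-out clean images.
    mark_size:      (h, w) of the assumed trigger.
    dominant_color: (C,) array used to mask a region.
    """
    rng = np.random.default_rng() if rng is None else rng
    _, height, width = img.shape
    mark_h, mark_w = mark_size
    orig_label = classify(img[None])[0]
    clean_preds = classify(clean_imgs)

    for _ in range(sample_num):
        # Step 1: mask a random region with the dominant color.
        top = rng.integers(0, height - mark_h + 1)
        left = rng.integers(0, width - mark_w + 1)
        variant = img.copy()
        variant[:, top:top + mark_h, left:left + mark_w] = dominant_color[:, None, None]

        # Step 2: only regions whose masking changes the prediction are candidates.
        if classify(variant[None])[0] == orig_label:
            continue
        patch = img[:, top:top + mark_h, left:left + mark_w]
        stamped = clean_imgs.copy()
        stamped[:, :, top:top + mark_h, left:left + mark_w] = patch
        asr = np.mean(classify(stamped) != clean_preds)

        # Step 3: flag the input if any candidate patch is effective enough.
        if asr > asr_threshold:
            return True
    return False


def toy_classify(batch):
    """Toy model: predicts class 1 iff the top-left 2x2 corner is (almost) white."""
    return (batch[:, :, :2, :2].mean(axis=(1, 2, 3)) > 0.9).astype(int)


clean_imgs = np.zeros((5, 3, 3, 3))
black = np.zeros(3)

poisoned_img = np.zeros((3, 3, 3))
poisoned_img[:, :2, :2] = 1.0  # white 2x2 trigger in the corner
poisoned_flag = neo_is_poisoned(poisoned_img, toy_classify, clean_imgs, (2, 2),
                                black, rng=np.random.default_rng(0))

clean_flag = neo_is_poisoned(np.zeros((3, 3, 3)), toy_classify, clean_imgs, (2, 2),
                             black, rng=np.random.default_rng(0))
```

In the toy setup, only the mask that covers the whole white corner yields a patch that flips every clean image, so the stamped input is flagged while the all-black input is not.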

Note

Neo assumes the defender knows the trigger size.

Parameters:

neo_asr_threshold (float) – ASR threshold. Defaults to 0.8.
neo_kmeans_num (int) – Number of KMeans clusters. Defaults to 3.
neo_sample_num (int) – Number of sampled masked regions. Defaults to 100.

Variables:

mark_size (tuple[int, int]) – Watermark size (h, w) of self.attack.mark.

get_cls_diff()

Get classification difference between original inputs and trigger inputs.

get_dominant_color(img)

Get dominant color for one image tensor using sklearn.cluster.KMeans.

Parameters:

img (torch.Tensor) – Image tensor with shape (C, H, W).

Returns:

torch.Tensor – Dominant color tensor with shape (C).
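
One plausible way to compute a dominant color with sklearn.cluster.KMeans is to cluster the pixels and return the center of the most populated cluster. The sketch below assumes that approach and works on NumPy arrays rather than torch tensors. The function name dominant_color and the seed argument are hypothetical, and n_clusters plays the role of neo_kmeans_num.

```python
import numpy as np
from sklearn.cluster import KMeans


def dominant_color(img, n_clusters=3, seed=0):
    """Return the (C,) center of the largest pixel cluster of a (C, H, W) image."""
    channels = img.shape[0]
    # Treat every pixel as one sample with C features: shape (H*W, C).
    pixels = img.reshape(channels, -1).T
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(pixels)
    # Count how many pixels fall into each cluster and pick the biggest one.
    counts = np.bincount(kmeans.labels_, minlength=n_clusters)
    return kmeans.cluster_centers_[counts.argmax()]


# Toy image: mostly red, with a few green and blue pixels.
img = np.zeros((3, 8, 8))
img[0] = 1.0                          # red everywhere
img[:, 0, :4] = [[0.0], [1.0], [0.0]]  # four green pixels
img[:, 1, :2] = [[0.0], [0.0], [1.0]]  # two blue pixels
color = dominant_color(img)
```

For the toy image, the largest cluster is the red one, so color is close to (1, 0, 0).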

get_pred_label(img, logger=None)

Get the predicted label of a single image, i.e. whether it is poisoned or not.

Parameters:

img (torch.Tensor) – Image tensor (on GPU) with shape (C, H, W).
logger (trojanzoo.utils.logger.MetricLogger) – Output logger. Defaults to None.

Returns:

bool – Whether the image tensor img is poisoned.

class trojanvision.defenses.Strip(strip_fpr=0.05, strip_alpha=0.5, strip_sample_num=64, **kwargs)

get_pred_labels()

Get predicted labels for test inputs.

Returns:

torch.Tensor – torch.BoolTensor with shape (2 * defense_input_num).
