

class trojanvision.defenses.ActivationClustering(nb_clusters=2, nb_dims=10, reduce_method='FastICA', cluster_analysis='silhouette_score', **kwargs)

Activation Clustering was proposed by Bryant Chen et al. from IBM Research at SafeAI@AAAI 2019.

It is a training-filtering backdoor defense that inherits from trojanvision.defenses.TrainingFiltering.

Activation Clustering assumes that, within the target class, poisoned samples form a separate cluster that is either small or far from its own class center.

The defense procedure is:

  • Get feature maps for samples.

  • For the samples of each class:

    • Get dim-reduced feature maps for samples using sklearn.decomposition.FastICA or sklearn.decomposition.PCA.

    • Conduct clustering w.r.t. dim-reduced feature maps and get cluster classes for samples.

    • Detect poisoned cluster classes. All samples in a detected cluster are treated as poisoned, on the assumption that poisoned samples compose a small separate cluster.
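The per-class steps above can be sketched with scikit-learn alone. This is a minimal illustration on synthetic data, not the library's implementation: the feature maps, the 180/20 clean/poisoned split and the variable names are all assumptions made for the example, and the final step uses the simplest 'size' rule.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)

# Synthetic stand-in for the flattened feature maps of ONE class (assumption):
# 180 clean samples plus 20 "poisoned" samples with shifted activations.
clean = rng.normal(loc=0.0, scale=1.0, size=(180, 64))
poison = rng.normal(loc=6.0, scale=1.0, size=(20, 64))
feature_maps = np.concatenate([clean, poison])  # shape (N, D)

# Reduce to nb_dims=10 components (the documented default).
reducer = FastICA(n_components=10, random_state=0, max_iter=1000)
reduced_fm = reducer.fit_transform(feature_maps)  # shape (N, 10)

# Cluster the reduced features into nb_clusters=2 groups.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_class = kmeans.fit_predict(reduced_fm)  # shape (N,)

# 'size' analysis: treat the smallest cluster as poisoned.
poison_cluster = int(np.bincount(cluster_class, minlength=2).argmin())
is_poisoned = cluster_class == poison_cluster
print(int(is_poisoned.sum()), "samples flagged as poisoned")
```

Swapping FastICA for sklearn.decomposition.PCA only changes the reducer line; the clustering and analysis steps stay the same.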

There are four methods to detect poisoned cluster classes:

  • 'size': The smallest cluster class.

  • 'relative_size': The small cluster classes whose proportion is smaller than size_threshold.

  • 'silhouette_score': Only detect poison clusters using 'relative_size' when the clustering fits the data well.

  • 'distance': Poison clusters are far from their own class center.

Parameters:

  • nb_clusters (int) – Number of clusters. Defaults to 2.

  • nb_dims (int) – The reduced dimension of feature maps. Defaults to 10.

  • reduce_method (str) – The method to reduce dimension of feature maps. Defaults to 'FastICA'.

  • cluster_analysis (str) – The method chosen to detect poisoned cluster classes. Choose from ['size', 'relative_size', 'distance', 'silhouette_score']. Defaults to 'silhouette_score'.


The clustering method is sklearn.cluster.KMeans if self.defense_input_num is None (the full training set is used); otherwise it is sklearn.cluster.MiniBatchKMeans.
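The selection rule can be illustrated as follows. The helper name make_clusterer and the n_init values are hypothetical and chosen only for this sketch; the actual class may construct its estimator differently.

```python
from sklearn.cluster import KMeans, MiniBatchKMeans


def make_clusterer(defense_input_num, nb_clusters=2):
    """Hypothetical helper mirroring the documented choice of estimator."""
    if defense_input_num is None:
        # Full training set: exact (batch) k-means.
        return KMeans(n_clusters=nb_clusters, n_init=10)
    # Subset of the training set: mini-batch variant.
    return MiniBatchKMeans(n_clusters=nb_clusters, n_init=3)


print(type(make_clusterer(None)).__name__)
print(type(make_clusterer(1000)).__name__)
```

Both estimators expose the same fit_predict interface, so the rest of the procedure does not need to know which one was picked.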

analyze_by_distance(cluster_class, reduced_fm, reduced_fm_centers, _class, **kwargs)

Poison clusters are far from their own class center.

Parameters:

  • cluster_class (torch.Tensor) – Clustering result tensor with shape (N).

  • reduced_fm (torch.Tensor) – Dim-reduced feature map tensor with shape (N, self.nb_dims).

  • reduced_fm_centers (torch.Tensor) – The centers of dim-reduced feature map tensors in each class with shape (C, self.nb_dims).


Returns:

list[int] – Predicted poison cluster classes list with shape (K)
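One plausible reading of "far from their own class center" is that a cluster is suspicious when its center lies nearer to another class's center than to the center of its own class. The sketch below implements that reading with NumPy arrays in place of torch tensors; the function name, the nearest-center rule and the toy data are assumptions and may differ from what the library actually computes.

```python
import numpy as np


def clusters_far_from_own_center(cluster_class, reduced_fm, reduced_fm_centers, _class):
    """Illustrative rule (assumption): flag clusters whose center is closest
    to the center of a class other than `_class`."""
    poisoned = []
    for k in np.unique(cluster_class):
        center = reduced_fm[cluster_class == k].mean(axis=0)        # (nb_dims,)
        dist = np.linalg.norm(reduced_fm_centers - center, axis=1)  # (C,)
        if int(dist.argmin()) != _class:
            poisoned.append(int(k))
    return poisoned


# Toy data: two class centers, and five samples labelled as class 0.
reduced_fm_centers = np.array([[0.0, 0.0], [5.0, 5.0]])
reduced_fm = np.array([[0.1, 0.0], [0.0, 0.2], [-0.1, 0.1], [4.9, 5.0], [5.1, 5.2]])
cluster_class = np.array([0, 0, 0, 1, 1])

print(clusters_far_from_own_center(cluster_class, reduced_fm, reduced_fm_centers, _class=0))
# prints [1]
```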

analyze_by_relative_size(cluster_class, size_threshold=0.35, **kwargs)

Detects small clusters whose proportion is smaller than size_threshold.

Parameters:

  • cluster_class (torch.Tensor) – Clustering result tensor with shape (N).

  • size_threshold (float) – The proportion threshold. Defaults to 0.35.


Returns:

list[int] – Predicted poison cluster classes list with shape (K)
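The relative-size rule amounts to comparing each cluster's share of the samples with size_threshold. A minimal NumPy sketch follows; the function name small_clusters is hypothetical, and arrays stand in for torch tensors.

```python
import numpy as np


def small_clusters(cluster_class, size_threshold=0.35):
    """Return the cluster ids whose share of samples is below size_threshold."""
    counts = np.bincount(cluster_class)
    proportion = counts / counts.sum()
    return np.flatnonzero(proportion < size_threshold).tolist()


print(small_clusters(np.array([0] * 8 + [1] * 2)))  # shares 0.8 / 0.2 -> [1]
print(small_clusters(np.array([0] * 5 + [1] * 5)))  # shares 0.5 / 0.5 -> []
```

With a balanced split no cluster falls below the default threshold, so nothing is flagged.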

analyze_by_silhouette_score(cluster_class, reduced_fm, silhouette_threshold=0.1, **kwargs)

Returns the result of analyze_by_relative_size() if sklearn.metrics.silhouette_score is high, which indicates that the clustering fits the data well.


Returns:

list[int] – Predicted poison cluster classes list with shape (K)
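The gating logic can be sketched as below. Two points are assumptions rather than documented behavior: "high" is taken to mean a score above silhouette_threshold, and an empty list is returned when the score is not high. The function name is hypothetical.

```python
import numpy as np
from sklearn.metrics import silhouette_score


def gated_relative_size(cluster_class, reduced_fm,
                        silhouette_threshold=0.1, size_threshold=0.35):
    """Apply the relative-size rule only when the clustering fits well."""
    # Assumption: a poor fit means no cluster is flagged.
    if silhouette_score(reduced_fm, cluster_class) <= silhouette_threshold:
        return []
    counts = np.bincount(cluster_class)
    return np.flatnonzero(counts / counts.sum() < size_threshold).tolist()


# Well-separated clusters: the small one is flagged.
separated = np.array([0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 10.0, 10.1]).reshape(-1, 1)
labels = np.array([0] * 8 + [1] * 2)
print(gated_relative_size(labels, separated))  # prints [1]

# Evenly spaced points with an arbitrary labelling: poor fit, nothing flagged.
line = np.arange(10, dtype=float).reshape(-1, 1)
arbitrary = np.array([0, 0, 1, 0, 0, 0, 0, 1, 0, 0])
print(gated_relative_size(arbitrary, line))  # prints []
```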

analyze_by_size(cluster_class, **kwargs)

The smallest cluster.

Parameters:

  • cluster_class (torch.Tensor) – Clustering result tensor with shape (N).


Returns:

list[int] – Predicted poison cluster classes list with shape (1)
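As a rough illustration, picking the smallest cluster reduces to an argmin over cluster counts. The function name smallest_cluster is hypothetical, and NumPy arrays stand in for torch tensors.

```python
import numpy as np


def smallest_cluster(cluster_class):
    """Return a one-element list holding the id of the smallest cluster."""
    counts = np.bincount(cluster_class)
    return [int(counts.argmin())]


print(smallest_cluster(np.array([0] * 8 + [1] * 2)))    # prints [1]
print(smallest_cluster(np.array([0, 0, 1, 1, 1, 2])))   # prints [2]
```

Unlike the relative-size rule, this always flags exactly one cluster, even when the clusters are nearly balanced.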

