dynamic

class trojanvision.attacks.InputAwareDynamic(train_mask_epochs=25, lambda_div=1.0, lambda_norm=100.0, mask_density=0.032, cross_percent=0.1, natural=False, poison_percent=0.1, **kwargs)

Input-Aware Dynamic Backdoor Attack, proposed by Anh Nguyen and Anh Tran of VinAI Research at NeurIPS 2020.

Based on trojanvision.attacks.BadNet, InputAwareDynamic trains a mark generator and a mask generator to synthesize a unique watermark for each input.

In the classification loss, besides attacking poisoned inputs and correctly classifying clean inputs, InputAwareDynamic also requires that inputs stamped with triggers generated from other inputs are still classified correctly (cross-trigger mode).

$\begin{aligned}
&\textbf{\# train mask generator} \\
&{opt}_{mask} = \text{Adam}(G_{mask}.parameters(), \text{lr}=0.01, \text{betas}=(0.5, 0.9)) \\
&\textbf{for} \: e=1 \: \textbf{to} \: \text{train\_mask\_epochs} \\
&\hspace{5mm}\textbf{for} \: x_1 \: \textbf{in} \: \text{train\_set} \\
&\hspace{10mm}x_2 = \text{sample\_another\_batch}(\text{train\_set}) \\
&\hspace{10mm}\mathcal{L}_{div} = \frac{\lVert x_1 - x_2 \rVert}{\lVert G_{mask}(x_1) - G_{mask}(x_2) \rVert} \\
&\hspace{10mm}\mathcal{L}_{norm} = ReLU(G_{mask}(x_1) - \text{mask\_density}).mean() \\
&\hspace{10mm}\mathcal{L}_{mask} = \lambda_{div} \mathcal{L}_{div} + \lambda_{norm} \mathcal{L}_{norm} \\
&\hspace{10mm}{opt}_{mask}.step() \\
&\rule{110mm}{0.4pt} \\
&\textbf{\# train mark generator and model} \\
&{opt}_{mark} = \text{Adam}(G_{mark}.parameters(), \text{lr}=0.01, \text{betas}=(0.5, 0.9)) \\
&\textbf{for} \: e=1 \: \textbf{to} \: \text{epochs} \\
&\hspace{5mm}\textbf{for} \: (x_1, y_1) \: \textbf{in} \: \text{train\_set} \\
&\hspace{10mm}x_2 = \text{sample\_another\_batch}(\text{train\_set}) \\
&\hspace{10mm}{mark}_{poison}, {mask}_{poison} = G_{mark}, G_{mask} (x_1[:n_{poison}]) \\
&\hspace{10mm}{mark}_{cross}, {mask}_{cross} = G_{mark}, G_{mask} (x_2[n_{poison}: n_{poison} + n_{cross}]) \\
&\hspace{10mm}x_{poison} = {mask}_{poison} \cdot {mark}_{poison} + (1 - {mask}_{poison}) \cdot x_1[:n_{poison}] \\
&\hspace{10mm}x_{cross} = {mask}_{cross} \cdot {mark}_{cross} + (1 - {mask}_{cross}) \cdot x_1[n_{poison}: n_{poison} + n_{cross}] \\
&\hspace{10mm}x = cat([x_{poison}, x_{cross}, x_1[n_{poison}+n_{cross}:]]) \\
&\hspace{10mm}y = cat([y_{poison}, y_1[n_{poison}:]]) \\
&\hspace{10mm}\mathcal{L}_{div} = \frac{\lVert x_{poison} - x_{cross} \rVert}{\lVert {mark}_{poison} - {mark}_{cross} \rVert} \\
&\hspace{10mm}\mathcal{L}_{ce} = cross\_entropy(x, y) \\
&\hspace{10mm}\mathcal{L} = \mathcal{L}_{ce} + \lambda_{div}\mathcal{L}_{div} \\
&\hspace{10mm}{opt}_{mark}.step() \\
&\hspace{10mm}{opt}_{model}.step() \\
\end{aligned}$
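
To make the mask-generator objective in the first half of the algorithm concrete, here is a minimal sketch in plain Python that evaluates the loss on flattened lists of floats. The function names and the small `eps` term are assumptions made for this illustration; this is not the library's implementation, which operates on batched tensors.

```python
import math


def l2_norm(values):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))


def mask_generator_loss(x1, x2, mask1, mask2, lambda_div=1.0,
                        lambda_norm=100.0, mask_density=0.032, eps=1e-12):
    """Return lambda_div * L_div + lambda_norm * L_norm for one pair of samples.

    mask1 and mask2 stand for G_mask(x1) and G_mask(x2).
    eps is an assumption added here to avoid division by zero.
    """
    # Diversity loss: distance between inputs over distance between their masks.
    input_dist = l2_norm([a - b for a, b in zip(x1, x2)])
    mask_dist = l2_norm([a - b for a, b in zip(mask1, mask2)])
    loss_div = input_dist / (mask_dist + eps)
    # Norm loss: mean of ReLU(mask - mask_density), so only mask values
    # above the density threshold are penalized.
    loss_norm = sum(max(m - mask_density, 0.0) for m in mask1) / len(mask1)
    return lambda_div * loss_div + lambda_norm * loss_norm
```

The diversity term shrinks when different inputs receive clearly different masks, while the norm term pushes mask values down toward `mask_density`, which keeps the trigger region small.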

Parameters:

train_mask_epochs (int) – Number of epochs used to train the mask generator. Defaults to 25.
lambda_div (float) – Weight of diversity loss during both optimization processes. Defaults to 1.0.
lambda_norm (float) – Weight of norm loss when optimizing mask generator. Defaults to 100.0.
mask_density (float) – Threshold of mask values when optimizing norm loss. Defaults to 0.032.
cross_percent (float) – Percentage of cross inputs in the whole training set. Defaults to 0.1.
poison_percent (float) – Percentage of poison inputs in the whole training set. Defaults to 0.1.
natural (bool) – Whether to use natural backdoors. If True, model parameters will be frozen. Defaults to False.
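
As an illustration of how poison_percent and cross_percent partition a batch in the algorithm above, the sketch below computes the three index ranges. The helper name `split_batch` and the use of rounding are assumptions for this example; the library may compute the counts differently.

```python
def split_batch(batch_size, poison_percent=0.1, cross_percent=0.1):
    """Return (poison, cross, clean) slices over a batch of the given size."""
    # Rounding to the nearest integer is an assumption of this sketch.
    n_poison = round(batch_size * poison_percent)
    n_cross = round(batch_size * cross_percent)
    poison = slice(0, n_poison)                    # x_1[:n_poison]
    cross = slice(n_poison, n_poison + n_cross)    # x_1[n_poison:n_poison + n_cross]
    clean = slice(n_poison + n_cross, batch_size)  # x_1[n_poison + n_cross:]
    return poison, cross, clean
```

With the default values, roughly a tenth of each batch is poisoned, a tenth receives cross triggers, and the rest stays clean.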

Variables:

mark_generator (torch.nn.Sequential) – Mark generator instance constructed by define_generator(). Output shape (N, C, H, W).
mask_generator (torch.nn.Sequential) – Mask generator instance constructed by define_generator(). Output shape (N, 1, H, W).

Note

Do NOT directly call self.mark_generator or self.mask_generator. Their raw outputs are not normalized into range [0, 1]. Please call get_mark() and get_mask() instead.

add_mark(x, **kwargs): Add a watermark to the input tensor by calling get_mark() and get_mask().
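
The blending rule from the training algorithm, mask · mark + (1 − mask) · x, can be sketched element-wise in plain Python. The function `blend` is a hypothetical helper for illustration and works on flat lists; the real method operates on tensors, where a mask of shape (N, 1, H, W) would broadcast over the channel dimension.

```python
def blend(x, mark, mask):
    """Element-wise mask * mark + (1 - mask) * x on flat lists of floats."""
    return [m * k + (1.0 - m) * v for v, k, m in zip(x, mark, mask)]
```

A mask value of 0 leaves the original pixel untouched, a value of 1 replaces it with the mark, and intermediate values mix the two.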

static define_generator(num_channels=[32, 64, 128], in_channels=3, out_channels=None)

Define a generator used in self.mark_generator and self.mask_generator.

Similar to auto-encoders, the generator is composed of ['down', 'middle', 'up'].

down: $[\text{conv-bn-relu}(c_{i}, c_{i+1}), \text{conv-bn-relu}(c_{i+1}, c_{i+1}), \text{maxpool}(2)]$
middle: $[\text{conv-bn-relu}(c_{-1}, c_{-1})]$
up: $[\text{upsample}(2), \text{conv-bn-relu}(c_{i+1}, c_{i+1}), \text{conv-bn-relu}(c_{i+1}, c_{i})]$

Parameters:

num_channels (list[int]) –
List of intermediate channel counts. Each element serves as the out_channels of the preceding layer and the in_channels of the current layer. Defaults to [32, 64, 128].
- MNIST: [16, 32]
- CIFAR: [32, 64, 128]
in_channels (int) – in_channels of first conv layer in down. It should be image channels. Defaults to 3.
out_channels (int) – out_channels of last conv layer in up. Defaults to None (in_channels).

Returns:

torch.nn.Sequential –

Generator instance with input shape (N, in_channels, H, W) and output shape (N, out_channels, H, W).
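
To show how the down, middle and up formulas expand for a given num_channels, the following sketch lists the layers as plain tuples instead of building torch modules. It assumes the channel sequence is c = [in_channels] + num_channels and that the last up layer maps to out_channels; the name `generator_layout` is hypothetical and the actual layer details may differ.

```python
def generator_layout(num_channels=(32, 64, 128), in_channels=3, out_channels=None):
    """Describe the generator as lists of (layer_kind, arg1, arg2) tuples."""
    if out_channels is None:
        out_channels = in_channels  # documented default
    # Assumed channel sequence: c_0 is the image channel count.
    c = [in_channels, *num_channels]
    down, up = [], []
    for i in range(len(num_channels)):
        down += [("conv-bn-relu", c[i], c[i + 1]),
                 ("conv-bn-relu", c[i + 1], c[i + 1]),
                 ("maxpool", 2)]
    middle = [("conv-bn-relu", c[-1], c[-1])]
    for i in reversed(range(len(num_channels))):
        # The final up layer is assumed to emit out_channels instead of c_0.
        last = out_channels if i == 0 else c[i]
        up += [("upsample", 2),
               ("conv-bn-relu", c[i + 1], c[i + 1]),
               ("conv-bn-relu", c[i + 1], last)]
    return {"down": down, "middle": middle, "up": up}
```

For example, `generator_layout([16, 32], in_channels=1)` gives the MNIST layout, and passing `out_channels=1` with the defaults gives a single-channel output such as the one a mask generator needs.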

get_data(data, org=False, keep_org=True, poison_label=True, **kwargs): Get data.

Note

This method differs from trojanvision.attacks.BadNet.get_data() as follows:

This method replaces some clean data with their poisoned versions, while BadNet’s version keeps the clean data and appends the poisoned versions.
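
The two strategies contrasted above can be sketched on a plain list of samples. Both helper names and the choice of poisoning the first n_poison samples are assumptions for illustration only, not the behaviour of either library method.

```python
def poison_by_replacing(batch, n_poison, poison_fn):
    """Replace the first n_poison samples with poisoned versions (size unchanged)."""
    return [poison_fn(s) for s in batch[:n_poison]] + list(batch[n_poison:])


def poison_by_appending(batch, n_poison, poison_fn):
    """Keep every clean sample and append poisoned copies (batch grows)."""
    return list(batch) + [poison_fn(s) for s in batch[:n_poison]]
```

Replacing keeps the batch size constant, whereas appending enlarges the batch by the number of poisoned samples.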

get_filename(target_class=None, **kwargs): Get filenames for current attack settings.

get_mark(_input): Get mark with shape (N, C, H, W).

$\begin{aligned} &raw = \text{self.mark\_generator(input)} \\ &\textbf{return} \frac{\tanh{(raw)} + 1}{2} \end{aligned}$

get_mask(_input): Get mask with shape (N, 1, H, W).

$\begin{aligned} &raw = \text{self.mask\_generator(input)} \\ &\textbf{return} \frac{\tanh{[10 \cdot \tanh{(raw)}]} + 1}{2} \end{aligned}$
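
The two normalizations used by get_mark() and get_mask() can be compared on single scalar values with the standard library. The function names below are hypothetical; the real methods apply the same formulas to generator outputs as tensors.

```python
import math


def normalize_mark(raw):
    """(tanh(raw) + 1) / 2, mapping any real value into [0, 1]."""
    return (math.tanh(raw) + 1.0) / 2.0


def normalize_mask(raw):
    """(tanh(10 * tanh(raw)) + 1) / 2, a much steeper mapping into [0, 1]."""
    return (math.tanh(10.0 * math.tanh(raw)) + 1.0) / 2.0
```

Both functions map a raw value of 0 to 0.5, but the nested tanh in the mask version saturates far faster, so mask values end up close to 0 or 1 while mark values vary more smoothly.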

load(filename=None, **kwargs): Load attack results from previously saved files.

save(filename=None, **kwargs): Save attack results to files.

train_mask_generator(verbose=True): Train self.mask_generator.
