The proposed method builds a universal adversarial noise generator: the goal is to produce noise that, when added to an image, fools most classifiers.

Methodology

  • Select a classifier \(f\) (ResNet, VGG, etc.).
  • Get the prediction for an image \(x\).
  • Generate a noise matrix \(\delta\).
  • Get the prediction for the perturbed image \(x + \delta\) (see the sketch after this list).
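
The pipeline could look like the following minimal PyTorch/torchvision sketch. The random \(\delta\) is only a stand-in for the method's noise generator, and ResNet-18 is an arbitrary choice among the models mentioned above.

```python
import torch
import torchvision.models as models

# Pretrained classifier f (ResNet-18 here; VGG etc. would work the same way).
f = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
f.eval()

x = torch.rand(1, 3, 224, 224)              # placeholder image batch
with torch.no_grad():
    clean_logits = f(x)                     # prediction on the clean image x
c = clean_logits.argmax(dim=1)              # original class c = argmax_l f(x)_l

# The method learns delta with a generator; plain random noise stands in here.
delta = 0.1 * torch.randn_like(x)
with torch.no_grad():
    adv_logits = f(x + delta)               # prediction on the perturbed image
```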

Losses

  • Fooling loss: \(L_{f} = -\log(1 - f(x + \delta)_{c})\), where \(c = \arg\max_{l} f(x)_{l}\). Minimizing \(L_f\) drives down the probability the classifier assigns to the original class \(c\) once the noise is added (see the first sketch below).
  • Diversity loss, computed within a batch of size \(B\) at a layer \(i\): \(L_{d} = -\sum_{n=1}^{B} d\big(f^i(x_n + \delta_n),\, f^i(x_n + \delta_{n'})\big)\), where \(n' \neq n\) is chosen randomly. This loss encourages diversity among the generated perturbations by comparing feature maps inside the classifier (see the second sketch below).
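
A possible PyTorch rendering of the fooling loss; the function name, the batch-mean reduction, and the softmax over logits are assumptions, since the text does not specify them.

```python
import torch
import torch.nn.functional as F

def fooling_loss(adv_logits: torch.Tensor, c: torch.Tensor) -> torch.Tensor:
    """L_f = -log(1 - f(x + delta)_c), averaged over the batch (an assumption).

    adv_logits: classifier outputs on x + delta, shape (B, num_classes).
    c:          clean-image predictions argmax_l f(x)_l, shape (B,).
    """
    probs = F.softmax(adv_logits, dim=1)
    p_c = probs.gather(1, c.unsqueeze(1)).squeeze(1)  # probability left on class c
    return -torch.log(1.0 - p_c + 1e-12).mean()       # small eps guards log(0)
```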

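One way to realize the diversity loss, assuming \(d\) is squared Euclidean distance and that the random pairing \(n' \neq n\) is drawn as a fixed-point-free permutation; both choices are assumptions, as the text leaves them open.

```python
import torch

def random_pairing(B: int) -> torch.Tensor:
    """Permutation with no fixed points, pairing each n with some n' != n.
    Rejection sampling (an assumption); requires B > 1 to terminate."""
    perm = torch.randperm(B)
    while bool((perm == torch.arange(B)).any()):
        perm = torch.randperm(B)
    return perm

def diversity_loss(feats: torch.Tensor, feats_paired: torch.Tensor) -> torch.Tensor:
    """L_d = -sum_n d(f^i(x_n + delta_n), f^i(x_n + delta_{n'})).

    feats:        layer-i feature maps of x_n + delta_n,    shape (B, ...).
    feats_paired: layer-i feature maps of x_n + delta_{n'}, shape (B, ...).
    """
    diff = (feats - feats_paired).flatten(start_dim=1)
    return -diff.pow(2).sum()                # d = squared L2 (an assumption)
```

In training, feats and feats_paired would come from the same batch of images passed through \(f\) up to layer \(i\), once with \(\delta_n\) and once with the shuffled \(\delta_{n'}\), e.g. via a forward hook on layer \(i\).
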
Results