deep-learning, CNN, GAN

Synthesizing the preferred inputs for neurons in neural networks via deep generator networks

Reviewed on Feb 1, 2018 by Carl Lemaire • https://arxiv.org/abs/1605.09304

Reference: A. Nguyen, A. Dosovitskiy, J. Yosinski, T. Brox, J. Clune. Synthesizing the preferred inputs for neurons in neural networks via deep generator networks. NIPS 2016.

This paper introduces DGN-AM: Deep Generator Network for Activation Maximization.

Training process

The training process involves 4 convolutional networks:

\(E\), a fixed encoder network (the network being visualized)
\(G\), a generator network that should be able to recover the original image from the output of \(E\)
\(C\), a fixed “comparator” network
\(D\), a discriminator

\(G\) is trained to invert a feature representation extracted by \(E\), and has to satisfy 3 objectives:

For a feature vector \(y = E(x)\), the synthesized image \(G(y)\) has to be close to the original image \(x\)
The features of the synthesized image, \(C(G(y))\), have to be close to those of the real image, \(C(x)\)
\(D\) should be unable to distinguish \(G(y)\) from real images (like a GAN)
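
As a rough illustration, the three objectives can be combined into one weighted loss for \(G\). The sketch below is a toy in plain Python operating on flat lists of numbers. The squared-error and \(-\log D\) forms, the function names, and the weights `w_img`, `w_feat` and `w_adv` are assumptions made for illustration, not values taken from this review.

```python
import math


def squared_l2(a, b):
    """Sum of squared differences between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def generator_loss(x, g_y, c_x, c_g_y, d_g_y,
                   w_img=1.0, w_feat=1.0, w_adv=1.0):
    """Weighted sum of the three objectives for G (hypothetical weights).

    x      : original image, flattened
    g_y    : synthesized image G(y), flattened
    c_x    : comparator features C(x)
    c_g_y  : comparator features C(G(y))
    d_g_y  : D's estimated probability, in (0, 1], that G(y) is real
    """
    loss_img = squared_l2(g_y, x)        # objective 1: G(y) close to x
    loss_feat = squared_l2(c_g_y, c_x)   # objective 2: C(G(y)) close to C(x)
    loss_adv = -math.log(d_g_y)          # objective 3: fool the discriminator
    return w_img * loss_img + w_feat * loss_feat + w_adv * loss_adv
```

Each term drops to zero only when its objective is fully met, so a perfect reconstruction that \(D\) accepts as real gives a total loss of zero under these assumptions.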

Architectures of the networks:

\(E\) is CaffeNet (a minor variant of AlexNet) truncated at the layer whose representation is being inverted
\(C\) is CaffeNet up to layer pool5 (the last pooling layer before the first FC)
\(D\) is a convolutional network with 5 conv + 2 FC
\(G\) is an “upconvolutional” architecture with 9 upconv + 3 FC

Choice of layer for representation

The best layer was determined empirically to be fc6.

Comparison with previous work

Applications

Generate images that maximally activate a class neuron
Generate images that maximally activate a hidden neuron
Watch how features evolve during training
“Produce creative, original art by synthesizing images that activate two neurons at the same time.” (See images in section S8.)
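
The first two applications rely on activation maximization: search for a code \(y\) whose generated image \(G(y)\) drives a chosen neuron as high as possible. Below is a minimal sketch of that idea, assuming a toy linear generator, a toy linear neuron, and plain gradient ascent with a small L2 penalty on the code. The function names, matrices and hyperparameters are all hypothetical and only show the shape of the loop.

```python
def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def activation(G, w, y):
    """Toy neuron activation w . G(y), with G a linear stand-in generator."""
    return sum(wi * xi for wi, xi in zip(w, matvec(G, y)))


def maximize_activation(G, w, y0, steps=200, lr=0.1, decay=0.1):
    """Gradient ascent on activation(y) - decay * ||y||^2 over the code y."""
    # For this linear toy the activation gradient is constant: G^T w.
    grad_act = matvec(transpose(G), w)
    y = list(y0)
    for _ in range(steps):
        # Ascent step; the -2 * decay * y term keeps the code from growing unboundedly.
        y = [yi + lr * (gi - 2 * decay * yi) for yi, gi in zip(y, grad_act)]
    return y


# Example: a 2-dimensional code mapped to a 3-pixel "image".
G = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w = [1.0, 0.0, 1.0]
y_best = maximize_activation(G, w, [0.0, 0.0])
```

With real networks the gradient would come from backpropagation through the generator and the network being visualized rather than from a closed form, but the update on \(y\) keeps the same structure.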