(NiN) Network In Network
Summary
A conventional convolutional layer scans the input feature maps with linear filters followed by a nonlinear activation function, as shown in Figure 1(a). The authors propose a new type of convolutional layer that replaces the linear filter and nonlinearity with a micro multilayer perceptron, as shown in Figure 1(b). The feature maps are obtained by sliding the micro network over the input in the same way a CNN slides its filters; they are then fed into the next layer.
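To make the sliding micro network concrete, here is a minimal NumPy sketch of one such layer (the paper calls it an mlpconv layer). The function name `mlpconv`, the ReLU activation, stride 1, no padding, and the layer sizes are assumptions made for illustration; this is not the authors' implementation.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def mlpconv(x, weights, biases, k):
    """Slide a small MLP over every k x k patch of x (stride 1, no padding).

    x:       input feature maps, shape (C_in, H, W)
    weights: list of MLP weight matrices; the first has shape (C_in*k*k, n_1)
    biases:  list of bias vectors matching the weight matrices
    returns: output feature maps, shape (n_last, H-k+1, W-k+1)
    """
    c, h, w = x.shape
    out_h, out_w = h - k + 1, w - k + 1
    out = np.empty((weights[-1].shape[1], out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Flatten the local patch and pass it through the shared micro MLP.
            a = x[:, i:i + k, j:j + k].reshape(-1)
            for W, b in zip(weights, biases):
                a = relu(a @ W + b)
            out[:, i, j] = a
    return out

# Hypothetical sizes: 3 input channels, 3x3 patches, MLP widths 16 -> 16 -> 10.
rng = np.random.default_rng(0)
sizes = [3 * 3 * 3, 16, 16, 10]
weights = [rng.normal(size=(m, n)) * 0.1 for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]
x = rng.normal(size=(3, 8, 8))
y = mlpconv(x, weights, biases, k=3)
print(y.shape)  # (10, 6, 6)
```

Because the same MLP is applied at every location, the first MLP layer acts like an ordinary k x k convolution and the later layers act like 1 x 1 convolutions across channels, which is how such a layer is usually expressed in deep learning frameworks.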
A 3-layer NiN is shown in Figure 2. The last layer is a global average pooling layer.
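Global average pooling simply averages each of the last feature maps down to a single number. The sketch below assumes the last feature maps contain one map per class and that the pooled vector goes straight into a softmax; the function names and the toy values are illustrative only.

```python
import numpy as np

def global_average_pooling(feature_maps):
    """Average each map over its spatial dimensions: (C, H, W) -> (C,)."""
    return feature_maps.mean(axis=(1, 2))

def softmax(z):
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Toy example: three constant 4x4 maps, one per hypothetical class.
maps = np.stack([np.full((4, 4), v) for v in (1.0, 2.0, 3.0)])
pooled = global_average_pooling(maps)  # one score per map
probs = softmax(pooled)                # class probabilities
```

Since each class score is just the mean of one map, the map itself shows where in the image the evidence for that class comes from.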
The advantages of NiN are twofold:
- NiN gets better accuracy with fewer layers because each layer implements a more complex nonlinear function.
- The last feature maps are easier to visualize and interpret as confidence maps (cf. Figure 4).
Experiments and Results
They report state-of-the-art results on CIFAR-10 and CIFAR-100, and competitive results on SVHN and MNIST.