This paper has two main contributions: a pyramidal architecture that gradually increases the feature-map dimension throughout the network, and a slightly modified ResNet block that uses a zero-padded shortcut; the authors also try different layer orderings inside the block.


This model is a regular ResNet, except that the number of feature maps in each layer is computed by a formula: instead of doubling the width at a few stages, the width grows by a small fixed step at every residual unit (in the additive version, roughly D_k = D_{k-1} + α/N, where α is the total widening and N the number of units).
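As a sketch of that rule, the widths can be computed as below (assuming the additive variant with a base width of 16 and floor rounding; the function name and defaults are my own, not from the paper):

```python
def pyramid_widths(num_units, alpha, base=16):
    """Channel widths under the additive pyramidal rule.

    Each residual unit widens the previous width by alpha / num_units,
    so the total widening over the whole network is alpha.
    Widths are floored to integers, as fractional channels make no sense.
    """
    widths = []
    d = float(base)  # D_0, the width of the stem (assumed 16 here)
    for _ in range(num_units):
        d += alpha / num_units  # constant additive step per unit
        widths.append(int(d))   # floor to an integer channel count
    return widths

# Example: 3 units, total widening alpha = 48, starting from 16
print(pyramid_widths(3, 48))  # → [32, 48, 64]
```

With a regular ResNet the width would stay constant inside a stage and jump at stage boundaries; here every unit is slightly wider than the last.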

With this rule, the number of feature maps grows smoothly across the network rather than in a few abrupt jumps, which is quite different from a regular ResNet, as shown in Figure 2.

The new shortcut in the residual block zero-pads the identity features to match the increased width; the padded channels can be seen as a new residual path.
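A minimal NumPy sketch of such a shortcut (assuming NCHW layout; the function name is illustrative, not from the paper): the first C channels pass through unchanged as a plain identity, while the extra channels are filled with zeros, so the block's output on those channels comes entirely from the convolutional path.

```python
import numpy as np

def zero_padded_shortcut(x, out_channels):
    """Identity shortcut that zero-pads extra channels.

    x: input of shape (N, C, H, W).
    If out_channels > C, append (out_channels - C) all-zero channels;
    the original channels are passed through untouched.
    """
    n, c, h, w = x.shape
    if out_channels <= c:
        return x  # widths match; ordinary identity shortcut
    pad = np.zeros((n, out_channels - c, h, w), dtype=x.dtype)
    return np.concatenate([x, pad], axis=1)

x = np.ones((1, 2, 4, 4))
y = zero_padded_shortcut(x, 4)
print(y.shape)  # → (1, 4, 4, 4)
```

Because the padded channels contribute nothing to the sum, no extra parameters are introduced, unlike a 1×1 projection shortcut.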

Finally, they empirically study how the placement of ReLU activations and batch normalization within the residual block affects performance.
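The layer orderings compared in that ablation can be sketched as ordered lists of layer names (shorthand labels, not actual module identifiers; the second ordering reflects the variant reported to work best, which drops the block's first ReLU and adds a BN after the last convolution):

```python
# Standard pre-activation ResNet block ordering
preact_block = ["BN", "ReLU", "conv", "BN", "ReLU", "conv"]

# Variant explored here: no ReLU before the first conv,
# and an extra BN after the last conv
pyramid_block = ["BN", "conv", "BN", "ReLU", "conv", "BN"]
```

Both keep BN before every convolution; the difference is only where the nonlinearities sit.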


They report results on three datasets: CIFAR-10, CIFAR-100, and ImageNet.

CIFAR-10 & CIFAR-100