This paper presents a new way of using convolutions. They test their models for three different tasks namely, classification, semantic segmentation, and spatial transformation.

Rather than using a convolution on only one scale, they use multiple convolutions at each scale of the input. Each of these scales also contains information from the scales close to it.



For the classification, they use CIFAR-100 dataset, for semantic segmentation they build a synthetic dataset based on MNIST, and for the spatial transformation they use the same synthetic dataset.