FCN: Fully Convolutional Networks for Semantic Segmentation
One of the first and most successful CNN-based segmentation papers. The proposed model is a simple sequence of convolution and max-pooling layers, followed by one or more up-sampling layers at the end. The reason for these up-sampling layers is to bring the output back to the same spatial size as the input, so that the segmentation map has one prediction per input pixel.
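To make the size argument concrete, here is a minimal sketch that tracks the spatial size of a feature map through such a network. The layer list `toy_layers` is a hypothetical toy configuration chosen for illustration, not the architecture from the paper, and the up-sampling layer is assumed to be a transposed convolution; the size formulas are the standard ones for convolution, pooling, and transposed convolution without dilation.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output size of a convolution or max-pooling layer along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def transposed_conv_out(size, kernel, stride, padding=0):
    """Output size of a transposed convolution (up-sampling) along one axis."""
    return (size - 1) * stride - 2 * padding + kernel


def trace_sizes(input_size, layers):
    """Return the spatial size after each layer, starting with the input size.

    Each layer is a tuple (kind, kernel, stride, padding), where kind is
    "conv", "pool", or "up".
    """
    sizes = [input_size]
    for kind, kernel, stride, padding in layers:
        if kind == "up":
            sizes.append(transposed_conv_out(sizes[-1], kernel, stride, padding))
        else:
            sizes.append(conv_out(sizes[-1], kernel, stride, padding))
    return sizes


# Hypothetical toy network (an assumption for illustration): three conv + pool
# stages, each halving the resolution, then a single 8x up-sampling layer.
toy_layers = [
    ("conv", 3, 1, 1), ("pool", 2, 2, 0),
    ("conv", 3, 1, 1), ("pool", 2, 2, 0),
    ("conv", 3, 1, 1), ("pool", 2, 2, 0),
    ("up", 16, 8, 4),
]

sizes = trace_sizes(224, toy_layers)
print(sizes)  # [224, 224, 112, 112, 56, 56, 28, 224]
```

The three pooling stages shrink a 224-pixel side to 28, and the final up-sampling layer (stride 8) restores it to 224, so the output map lines up with the input pixel for pixel.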
They also propose skip connections (similar to those in U-Net), which combine predictions from shallower, higher-resolution layers with the coarse predictions of the deepest layer and give a more gradual transition from coarse features to the final segmentation map.
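The idea behind a skip connection can be sketched in a few lines: up-sample the coarse score map by a factor of 2, then add it element-wise to a score map computed from a shallower, higher-resolution layer. This is only an illustration under simplifying assumptions: the helper names `upsample2x` and `fuse` are hypothetical, the maps are single-channel nested lists, and nearest-neighbour up-sampling stands in for the learned up-sampling a real network would use.

```python
def upsample2x(score_map):
    """Nearest-neighbour 2x up-sampling: each value becomes a 2x2 block."""
    out = []
    for row in score_map:
        wide = [v for v in row for _ in range(2)]  # duplicate each column
        out.append(wide)
        out.append(list(wide))                     # duplicate the row
    return out


def fuse(coarse, fine):
    """Up-sample the coarse map and add it element-wise to the fine map."""
    up = upsample2x(coarse)
    if len(up) != len(fine) or len(up[0]) != len(fine[0]):
        raise ValueError("fine map must be exactly twice the size of the coarse map")
    return [[u + f for u, f in zip(up_row, fine_row)]
            for up_row, fine_row in zip(up, fine)]


# Toy single-channel score maps (made-up values, for illustration only).
coarse = [[1.0, 2.0],
          [3.0, 4.0]]
fine = [[0.5, 0.0, 0.0, 0.5],
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0],
        [0.5, 0.0, 0.0, 0.5]]

fused = fuse(coarse, fine)
for row in fused:
    print(row)
```

The fused map keeps the overall layout of the coarse prediction while the fine map adjusts individual pixels, which is what makes the transition to the full-resolution output more gradual than a single large up-sampling step.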