In this paper, they propose to replace the per-pixel cross-entropy loss (CEL) for segmentation to impose a structural regularity. Their method is called adversarial structure matching loss (ASML) which is somehow similar in spirit to ACNN. As shown in fig.2, they encode the segmentation map into a latent vector which they force to be as close as that of the groungtruth.

The loss is thus a combination between an adversarial loss and a regularization loss obtained with a convolutional autoencoder:


Results are better than plain CEL and segmentation GANs.