The new method adds a reconstruction decoder to the classical encoder-decoder segmentation in order to align source and target encoder features.


The method wants to compensate the domain shift between source and target domain by introducing feature (distribution) similarity metrics. They are introducing a second decoder to the standard encoder-decoder setup which serves to reconstruct the input data which originates from both source and target domain. The complete architecture is trained end-to-end with the following loss function:

Where \(L_r\) is the reconstruction loss function, in their case is a mean-squared error, \(\hat{x}^{s/t}\) are reconstructions of the source/target inputs obtained by the auto-encoding sub-network.

The network is initially trained in an unsupervised fashion, after which the reconstruction decoder is discarded.


  • FT: Finetuning baseline.
  • MMD: Maximum mean discrepancy.
  • Coral: Correlation difference.
  • DANN: Distributions in an adversarial setup.