Optimal Transport for Deep Joint Transfer Learning

Intro

In this paper, the authors propose a novel method to jointly fine-tune a deep neural network on source data and target data. By adding an Optimal Transport loss (OT loss) between the source and target classifier predictions as a constraint on the source classifier, the proposed Joint Transfer Learning Network (JTLN) can effectively learn knowledge from the source data that is useful for target classification.

The main idea

Given a small target dataset and a large source dataset, they propose to minimize a combination of three losses: a cross-entropy loss \(l_{ce}\) on the source data, a cross-entropy loss \(l_{ce}\) on the target data, and an optimal transport loss \(l_{OT}\) between the source and target classifier predictions. This is illustrated in Figure 1.
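As a rough sketch only, an objective of this shape could be written as below. The symbols \(f_s\) and \(f_t\) (source and target classifiers), \(n_s\) and \(n_t\) (dataset sizes), the weight \(\alpha\), the averaging, and the choice to evaluate the OT term on target samples are all assumptions made for illustration; they are not taken from the paper.

\[
\mathcal{L} = \frac{1}{n_s} \sum_{i=1}^{n_s} l_{ce}\big(f_s(x_i^s), y_i^s\big) + \frac{1}{n_t} \sum_{j=1}^{n_t} l_{ce}\big(f_t(x_j^t), y_j^t\big) + \alpha \, \frac{1}{n_t} \sum_{j=1}^{n_t} l_{OT}\big(f_s(x_j^t), f_t(x_j^t)\big)
\]

The first two terms fit each classifier to its own labels, while the third penalizes disagreement between what the two classifiers predict for the same input.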

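They implemented the usual entropy-regularized optimal transport loss, which in its standard form reads

\[
l_{OT}(\mu, \nu) = \min_{\gamma \in \Pi(\mu, \nu)} \langle \gamma, C \rangle - \frac{1}{\lambda} H(\gamma)
\]

where \(\Pi(\mu, \nu)\) is the set of transport plans with marginals \(\mu\) and \(\nu\), \(C\) is the ground cost matrix, \(H(\gamma) = -\sum_{i,j} \gamma_{ij} \log \gamma_{ij}\) is the entropy of \(\gamma\), and \(\frac{1}{\lambda}\) is the regularization weight.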
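To make the regularized loss more concrete, here is a minimal NumPy sketch of the Sinkhorn iterations commonly used to solve entropy-regularized optimal transport. It is not the authors' implementation: the function name `sinkhorn_ot_loss`, the defaults for `lam` and `n_iters`, and the toy cost matrix are assumptions chosen for illustration. The function returns the transport cost \(\langle \gamma, C \rangle\) of the regularized plan, together with the plan itself.

```python
import numpy as np


def sinkhorn_ot_loss(mu, nu, C, lam=2.0, n_iters=500):
    """Entropy-regularized OT between two histograms (illustrative sketch).

    mu, nu : non-negative vectors that each sum to 1 (e.g. softmax outputs).
    C      : cost matrix of shape (len(mu), len(nu)).
    lam    : inverse regularization weight, so 1/lam weights the entropy term.
    """
    mu = np.asarray(mu, dtype=float)
    nu = np.asarray(nu, dtype=float)
    C = np.asarray(C, dtype=float)

    K = np.exp(-lam * C)              # Gibbs kernel, strictly positive
    u = np.ones_like(mu)
    for _ in range(n_iters):
        v = nu / (K.T @ u)            # rescale to match the column marginal
        u = mu / (K @ v)              # rescale to match the row marginal

    gamma = u[:, None] * K * v[None, :]   # regularized transport plan
    return float(np.sum(gamma * C)), gamma


# Toy example: two hypothetical classifier predictions over three classes.
source_pred = np.array([0.7, 0.2, 0.1])
target_pred = np.array([0.1, 0.2, 0.7])

# Hypothetical class-to-class cost: normalized distance between class indices.
idx = np.arange(3)
cost = np.abs(idx[:, None] - idx[None, :]) / 2.0

loss, plan = sinkhorn_ot_loss(source_pred, target_pred, cost)
print("OT loss:", loss)
print("row sums:", plan.sum(axis=1), "column sums:", plan.sum(axis=0))
```

A larger `lam` means weaker regularization and a plan closer to the unregularized optimum, but the kernel `K` then becomes nearly singular and the iterations converge more slowly. In a deep learning setting the same updates would typically be written in an autodiff framework so that gradients can flow back to the classifier outputs.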

Results

They report good results for the transfer between two aircraft image datasets.

Optimal transportation

For a relatively gentle introduction to optimal transportation, please refer to the following tutorial.