Episodic CAMN: Contextual Attention-based Memory Networks With Iterative Feedback For Scene Labeling

Summary

The proposed model is a Fully Convolutional Network (FCN) with soft attention on the patch representations (Contextual Attention-based Memory Network, or CAMN). The attention network iteratively refines its output using an RNN, which makes it an Episodic-CAMN.

In short, the model is VGG with a recurrent soft-attention module inserted between FC6 and FC7.
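
To make the idea concrete, here is a minimal NumPy sketch of iterative soft attention over patch features. It is not the authors' implementation: the scaled dot-product scoring, the plain tanh recurrent update, the weight matrices `w_h` and `w_c`, and the `num_episodes` parameter are all assumptions chosen for illustration, and the random `patches` array merely stands in for FC6 outputs.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def episodic_soft_attention(patches, w_h, w_c, num_episodes=3):
    """Refine patch features with repeated soft attention (illustrative only).

    patches: (N, D) array standing in for FC6 patch representations.
    w_h, w_c: (D, D) hypothetical recurrent and context weight matrices.
    Returns the refined (N, D) features and the (N, N) attention weights
    computed in the last episode.
    """
    hidden = patches.copy()
    scale = np.sqrt(patches.shape[1])
    for _ in range(num_episodes):
        # Each patch attends over all patches (the "memory").
        scores = hidden @ patches.T / scale      # (N, N)
        weights = softmax(scores, axis=1)        # each row sums to 1
        context = weights @ patches              # (N, D) attended context
        # Assumed simple RNN step: mix previous state with the new context.
        hidden = np.tanh(hidden @ w_h + context @ w_c)
    return hidden, weights


rng = np.random.default_rng(0)
n_patches, dim = 6, 8
patches = rng.normal(size=(n_patches, dim))
w_h = rng.normal(scale=0.1, size=(dim, dim))
w_c = rng.normal(scale=0.1, size=(dim, dim))
refined, weights = episodic_soft_attention(patches, w_h, w_c)
```

In the architecture described above, `refined` would play the role of the input passed on to FC7; a gated unit such as a GRU could replace the tanh update without changing the overall structure.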

Experiments and Results

Datasets:

PASCAL-Context
SIFT Flow
PASCAL VOC 2011

The authors compare only against VGG-based networks with similar settings.