In this work, an end-to-end trainable convolutional neural network (CNN), referred to as SegFlow, is designed with one branch for object segmentation and another for optical flow. Each branch learns the feature representations for its own task: the segmentation branch concentrates on the objects, and the optical flow branch on the motion information.
The features from one branch can assist the other branch through gradient information during backpropagation, which lets the two closely related objectives communicate with each other.
A pixel-wise cross-entropy loss is used for segmentation, and an endpoint error (EPE) loss is used for the motion at each pixel.
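The two losses named above can be sketched in plain Python. This is only an illustrative sketch, not the authors' implementation: it assumes binary foreground/background labels and an unweighted mean over pixels, and the function names `pixel_cross_entropy` and `endpoint_error` are hypothetical.

```python
import math


def pixel_cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross-entropy over all pixels.

    probs  -- 2D list of predicted foreground probabilities in [0, 1]
    labels -- 2D list of ground-truth labels (1 = object, 0 = background)
    Assumption: unweighted mean; the source does not state any class weighting.
    """
    total, count = 0.0, 0
    for prob_row, label_row in zip(probs, labels):
        for p, y in zip(prob_row, label_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            count += 1
    return total / count


def endpoint_error(flow_pred, flow_gt):
    """Mean Euclidean distance between predicted and true flow vectors.

    Each flow is a 2D list of (u, v) displacement pairs, one per pixel.
    """
    total, count = 0.0, 0
    for pred_row, gt_row in zip(flow_pred, flow_gt):
        for (u, v), (u_gt, v_gt) in zip(pred_row, gt_row):
            total += math.hypot(u - u_gt, v - v_gt)
            count += 1
    return total / count
```

For example, a predicted flow of (3, 4) against a true flow of (0, 0) contributes an endpoint error of 5 for that pixel, since the error is the length of the difference vector.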
The lack of a large dataset with ground truth for both segmentation and optical flow limits the training strategy to one that requires only one of the ground truths at a time: the weights of one branch are frozen while the other branch is trained until it converges, and training then switches to the other branch.
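The alternating schedule can be illustrated with a toy example. This is a deliberately simplified sketch under strong assumptions: each "branch" is a single scalar parameter with a made-up quadratic loss, plain gradient descent stands in for network training, and all names (`train_branch`, `seg_grad`, `flow_grad`, `alternate_training`) are hypothetical. It shows only the freeze-and-switch logic, not the actual network.

```python
def seg_grad(params):
    """Gradient of a toy segmentation loss (seg - 2)^2 w.r.t. 'seg'."""
    return 2.0 * (params["seg"] - 2.0)


def flow_grad(params):
    """Gradient of a toy flow loss (flow + 1)^2 w.r.t. 'flow'."""
    return 2.0 * (params["flow"] + 1.0)


def train_branch(params, active, grad_fn, lr=0.1, tol=1e-6, max_steps=10_000):
    """Update only params[active]; every other branch stays frozen.

    Returns the number of update steps taken before convergence.
    """
    for step in range(max_steps):
        grad = grad_fn(params)
        if abs(grad) < tol:  # crude convergence check on the gradient
            return step
        params[active] -= lr * grad
    return max_steps


def alternate_training(rounds=2):
    """Train one branch to convergence, then switch to the other."""
    params = {"seg": 0.0, "flow": 0.0}
    for _ in range(rounds):
        train_branch(params, "seg", seg_grad)    # flow branch frozen
        train_branch(params, "flow", flow_grad)  # segmentation branch frozen
    return params
```

In a real network the frozen branch would still take part in the forward pass; only its weights would be excluded from the update, which is what leaving `params` untouched for the inactive key mimics here.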
For the evaluation of optical flow, the average endpoint error over all pixels is computed, while three measures are applied for segmentation: region similarity J, contour accuracy F, and temporal stability T.
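As an illustration of the first segmentation measure, region similarity J is commonly computed as the intersection-over-union (Jaccard index) between the predicted mask and the ground-truth mask. The sketch below assumes binary masks stored as 2D lists; the function names and the convention for two empty masks are assumptions made here. Contour accuracy F and temporal stability T need contour matching across pixels and frames and are not sketched.

```python
def region_similarity(pred_mask, gt_mask):
    """Jaccard index (IoU) between two binary masks given as 2D lists."""
    intersection, union = 0, 0
    for pred_row, gt_row in zip(pred_mask, gt_mask):
        for p, g in zip(pred_row, gt_row):
            p, g = bool(p), bool(g)
            intersection += p and g  # pixel is foreground in both masks
            union += p or g          # pixel is foreground in at least one
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


def mean_region_similarity(pred_masks, gt_masks):
    """Average J over the frames of a sequence."""
    scores = [region_similarity(p, g) for p, g in zip(pred_masks, gt_masks)]
    return sum(scores) / len(scores)
```

A prediction that overlaps the ground truth in one pixel out of three covered by either mask scores J = 1/3, and an exact match scores 1.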
Note: the clean version contains images without motion blur and atmospheric effects, while the final version is more complicated, with complex environmental variables.