Motion-Based Occlusion-Aware Pixel Graph Network for Video Object Segmentation
Date Issued
01-01-2019
Author(s)
Adak, Saptakatha
Indian Institute of Technology, Madras
Abstract
This paper proposes a dual-channel Graph Convolutional Network (GCN) for the Video Object Segmentation (VOS) task. The main contribution lies in formulating two pixel graphs based on the raw RGB and optical-flow features. Both spatial and temporal features are learned independently, making the network robust to various challenging scenarios in real-world videos. Additionally, a motion-orientation-based aggregator scheme efficiently captures long-range dependencies among objects. This not only deals with the complex issue of modelling velocity differences among multiple objects moving in various directions, but also adapts to changes in the appearance of objects due to pose and scale deformations. Further, an occlusion-aware attention mechanism is employed to facilitate accurate segmentation in scenarios where multiple objects exhibit temporal discontinuity in their appearance due to occlusion. Performance analysis on the DAVIS-2016 and DAVIS-2017 datasets shows the effectiveness of our proposed method in foreground segmentation of objects in videos over existing state-of-the-art techniques. Control experiments on the CamVid dataset show the generalising capability of the model for scene segmentation.
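The dual-channel idea described in the abstract can be illustrated with a minimal sketch (not the authors' implementation): each video frame is treated as a pixel graph, one graph convolution runs over per-pixel RGB features and another over per-pixel optical-flow features, and the two channel outputs are fused. All names, the 4-neighbour grid adjacency, the standard symmetric-normalised propagation rule, and the averaging fusion are illustrative assumptions, not details from the paper.

```python
import numpy as np

def grid_adjacency(h, w):
    """4-neighbour pixel-graph adjacency with self-loops for an h x w frame."""
    n = h * w
    a = np.eye(n)  # self-loops
    for r in range(h):
        for c in range(w):
            i = r * w + c
            if r + 1 < h:                       # edge to pixel below
                a[i, i + w] = a[i + w, i] = 1.0
            if c + 1 < w:                       # edge to pixel on the right
                a[i, i + 1] = a[i + 1, i] = 1.0
    return a

def gcn_layer(a, x, weight):
    """One graph convolution: ReLU(D^-1/2 A D^-1/2 X W), standard GCN form."""
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a.sum(axis=1)))
    return np.maximum(d_inv_sqrt @ a @ d_inv_sqrt @ x @ weight, 0.0)

rng = np.random.default_rng(0)
h, w = 4, 4                                  # toy 4x4 frame -> 16 pixel nodes
a = grid_adjacency(h, w)
rgb_feats = rng.normal(size=(h * w, 3))      # per-pixel RGB features
flow_feats = rng.normal(size=(h * w, 2))     # per-pixel optical flow (u, v)
w_rgb = rng.normal(size=(3, 8))              # channel-specific weights
w_flow = rng.normal(size=(2, 8))

# Two independently learned pixel-graph channels, fused by averaging here.
out = 0.5 * (gcn_layer(a, rgb_feats, w_rgb) + gcn_layer(a, flow_feats, w_flow))
print(out.shape)  # one fused embedding per pixel: (16, 8)
```

In the paper itself the spatial (RGB) and temporal (flow) channels are learned independently and combined through the motion-orientation aggregator and occlusion-aware attention; the averaging step above merely stands in for that fusion.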
Volume
11954 LNCS