arXiv:1511.04510 [cs.CV]

Semantic Object Parsing with Local-Global Long Short-Term Memory

Xiaodan Liang, Xiaohui Shen, Donglai Xiang, Jiashi Feng, Liang Lin, Shuicheng Yan

Published 2015-11-14, Version 1

Semantic object parsing is a fundamental computer vision task for understanding objects in detail, and incorporating multi-level contextual information is critical for achieving such fine-grained pixel-level recognition. Prior methods often exploit contextual information by post-processing the predicted confidence maps. In this work, we propose a novel deep Local-Global Long Short-Term Memory (LG-LSTM) architecture that seamlessly incorporates short-distance and long-distance spatial dependencies into feature learning over all pixel positions. In each LG-LSTM layer, local guidance from neighboring positions and global guidance from the whole image are imposed on each position to better exploit complex local and global contextual information. Individual LSTMs for distinct spatial dimensions are also used to capture the various spatial layouts of semantic parts in the images, yielding distinct hidden and memory cells at each position for each dimension. In our parsing approach, several LG-LSTM layers are stacked and appended to the intermediate convolutional layers to directly enhance visual features, allowing the network parameters to be learned end-to-end. The long chains of sequential computation formed by the stacked LG-LSTM layers also enable each pixel to sense a much larger region during inference, benefiting from the memorization of previous dependencies in all positions along all dimensions. Comprehensive evaluations on three public datasets demonstrate that LG-LSTM significantly outperforms other state-of-the-art methods.
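
The abstract gives no equations, so the following is only a minimal NumPy sketch of the general idea of combining local and global guidance in an LSTM update at every pixel. It is not the authors' implementation. The function name `lg_lstm_layer`, the 3x3 neighborhood, the single max-pooled global summary, the weight shapes, and the use of one shared LSTM (rather than a separate LSTM per spatial dimension, as the abstract describes) are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lg_lstm_layer(x, h, c, W, b):
    """One simplified local-global LSTM update over a feature map (illustrative only).

    x: (H, Wd, D) input features; h, c: (H, Wd, K) hidden and memory cells.
    W: (D + 10*K, 4*K) weights; b: (4*K,) bias. All shapes are assumptions.
    """
    H, Wd, K = h.shape
    # Local guidance: hidden states of the 3x3 neighborhood, zero-padded at borders.
    hp = np.pad(h, ((1, 1), (1, 1), (0, 0)))
    local = np.concatenate(
        [hp[dy:dy + H, dx:dx + Wd] for dy in range(3) for dx in range(3)],
        axis=-1,
    )  # (H, Wd, 9*K)
    # Global guidance: one image-level summary (max over all positions),
    # shared by every pixel. This pooling choice is an assumption.
    glob = np.broadcast_to(h.max(axis=(0, 1)), (H, Wd, K))
    # Standard LSTM gating applied to the concatenated input at each position.
    z = np.concatenate([x, local, glob], axis=-1) @ W + b
    i, f, o, g = np.split(z, 4, axis=-1)
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return h_new, c_new

# Stack three layers on a toy 6x5 feature map with random, untrained weights.
rng = np.random.default_rng(0)
H, Wd, D, K = 6, 5, 4, 3
x = rng.normal(size=(H, Wd, D))
h = np.zeros((H, Wd, K))
c = np.zeros((H, Wd, K))
for _ in range(3):
    W = rng.normal(scale=0.1, size=(D + 10 * K, 4 * K))
    b = np.zeros(4 * K)
    h, c = lg_lstm_layer(x, h, c, W, b)
print(h.shape)
```

Each pass lets a position read its neighbors' hidden states, so stacking layers widens the region that can influence a pixel, while the pooled summary gives every position access to image-level context in a single step. For simplicity the sketch feeds the same `x` to every layer, which is a further assumption and may differ from how the stacked layers are connected in the paper.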
