arXiv:1709.09490 [cs.CV]

Scene Parsing by Weakly Supervised Learning with Image Descriptions

Ruimao Zhang, Liang Lin, Guangrun Wang, Meng Wang, Wangmeng Zuo

Published 2017-09-27, Version 1

This paper investigates a fundamental problem of scene understanding: how to parse a scene image into a structured configuration (i.e., a semantic object hierarchy with object interaction relations). We propose a deep architecture consisting of two networks: i) a convolutional neural network (CNN) that extracts the image representation for pixel-wise object labeling and ii) a recursive neural network (RsNN) that discovers the hierarchical object structure and the inter-object relations. Rather than relying on elaborate annotations (e.g., manually labeled semantic maps and relations), we train our deep model in a weakly supervised manner by leveraging the descriptive sentences of the training images. Specifically, we decompose each sentence into a semantic tree consisting of nouns and verb phrases, and apply these tree structures to discover the configurations of the training images. Once these scene configurations are determined, the parameters of both the CNN and the RsNN are updated accordingly by backpropagation. The entire model training is accomplished through an Expectation-Maximization method. Extensive experiments show that our model is capable of producing meaningful and structured scene configurations, and that it achieves more favorable scene labeling results on the PASCAL VOC 2012 and SYSU-Scenes datasets than other state-of-the-art weakly supervised deep learning methods. SYSU-Scenes is a dedicated dataset that we release to facilitate further research on scene parsing; it contains more than 5000 scene images with their sentence-based semantic descriptions.
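The semantic tree mentioned in the abstract can be pictured with a small data-structure sketch. The Python below is only an illustration under stated assumptions, not the authors' parser: it assumes the sentence has already been reduced to a list of nouns and the relation phrases between them, and it merges them left to right, which is a simplification. The names `Leaf`, `Node`, `build_semantic_tree`, `leaves`, and `render` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Leaf:
    """A noun from the sentence, standing for one object in the scene."""
    noun: str


@dataclass
class Node:
    """An internal node joining two sub-structures through a relation phrase."""
    relation: str
    left: Union["Leaf", "Node"]
    right: Union["Leaf", "Node"]


Tree = Union[Leaf, Node]


def build_semantic_tree(nouns: List[str], relations: List[str]) -> Tree:
    """Fold nouns left to right into a binary tree.

    Assumption (illustrative only): relations[i] links everything merged
    so far to nouns[i + 1]. A real sentence parser may order merges differently.
    """
    if not nouns:
        raise ValueError("need at least one noun")
    if len(relations) != len(nouns) - 1:
        raise ValueError("expected one relation between consecutive nouns")
    tree: Tree = Leaf(nouns[0])
    for relation, noun in zip(relations, nouns[1:]):
        tree = Node(relation, tree, Leaf(noun))
    return tree


def leaves(tree: Tree) -> List[str]:
    """Collect the nouns in left-to-right order (the image-level object labels)."""
    if isinstance(tree, Leaf):
        return [tree.noun]
    return leaves(tree.left) + leaves(tree.right)


def render(tree: Tree) -> str:
    """Render the tree as a nested bracket string for inspection."""
    if isinstance(tree, Leaf):
        return tree.noun
    return f"({render(tree.left)} [{tree.relation}] {render(tree.right)})"


# Hypothetical pre-parsed sentence: "a man rides a horse standing on the grass".
tree = build_semantic_tree(["man", "horse", "grass"], ["ride", "stand on"])
print(render(tree))
```

In this sketch the leaves supply the object categories present in the image, while the internal nodes carry the relation phrases and the merge order, which is the kind of structural signal the abstract describes using in place of pixel-level annotations.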

Comments: Submitted to TPAMI 2017. arXiv admin note: text overlap with arXiv:1604.02271
Categories: cs.CV
Related articles:
arXiv:1605.02964 [cs.CV] (Published 2016-05-10)
Weakly Supervised Learning of Affordances
arXiv:2311.11772 [cs.CV] (Published 2023-11-20, updated 2023-11-22)
A Good Feature Extractor Is All You Need for Weakly Supervised Learning in Histopathology
arXiv:2310.12677 [cs.CV] (Published 2023-10-19)
Weakly Supervised Learning for Breast Cancer Prediction on Mammograms in Realistic Settings