arXiv Analytics

arXiv:1902.03227 [cs.CV]

Minimal Images in Deep Neural Networks: Fragile Object Recognition in Natural Images

Sanjana Srivastava, Guy Ben-Yosef, Xavier Boix

Published 2019-02-08, Version 1

The human ability to recognize objects is impaired when the object is not shown in full. "Minimal images" are the smallest regions of an image that remain recognizable to humans. Ullman et al. (2016) show that a slight modification of the location and size of the visible region of a minimal image produces a sharp drop in human recognition accuracy. In this paper, we demonstrate that such drops in accuracy due to changes in the visible region are common to both humans and existing state-of-the-art deep neural networks (DNNs), and are much more prominent in DNNs. We found many cases where DNNs classified one region correctly and another incorrectly even though the two regions differed by only one row or column of pixels and were often larger than the average human minimal image. We show that this phenomenon is independent of the lack of invariance to minor modifications in object location that previous work has reported in DNNs. Our results thus reveal a new failure mode of DNNs, one that affects humans to a much lesser degree. They expose how fragile DNN recognition is on natural images even when no adversarial patterns are introduced. Bringing the robustness of DNNs on natural images up to the human level remains an open challenge for the community.
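
The comparison described in the abstract, two crops that differ by a single row or column of pixels but receive different verdicts, can be illustrated with a small sketch. This is not the authors' code: the function names, the `classify` callable, and the toy classifier are all hypothetical, and the sketch covers only one-pixel shifts of a fixed-size crop, not the changes in crop size that the paper also considers.

```python
import numpy as np

def correctness_map(image, crop_size, classify, true_label):
    """Boolean grid: entry (i, j) is True when the square crop whose
    top-left corner is (i, j) is classified as true_label.
    `classify` is a hypothetical callable standing in for a DNN."""
    h, w = image.shape[:2]
    rows, cols = h - crop_size + 1, w - crop_size + 1
    grid = np.zeros((rows, cols), dtype=bool)
    for i in range(rows):
        for j in range(cols):
            crop = image[i:i + crop_size, j:j + crop_size]
            grid[i, j] = classify(crop) == true_label
    return grid

def count_fragile_pairs(grid):
    """Count pairs of crops shifted by one pixel (down or right)
    for which one crop is classified correctly and the other is not."""
    vertical = np.sum(grid[1:, :] != grid[:-1, :])
    horizontal = np.sum(grid[:, 1:] != grid[:, :-1])
    return int(vertical + horizontal)

# Toy stand-in for a network (assumption): label 1 if the crop is mostly bright.
def toy_classify(crop):
    return 1 if crop.mean() > 0.5 else 0

# Toy 6x6 image whose right half is bright.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

grid = correctness_map(image, crop_size=4, classify=toy_classify, true_label=1)
fragile = count_fragile_pairs(grid)
```

With a real network, `classify` would wrap a forward pass on the resized crop, and a high count of fragile pairs relative to the number of neighbouring crop pairs would indicate the kind of sensitivity the abstract reports.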

Comments: International Conference on Learning Representations (ICLR) 2019
Categories: cs.CV, eess.IV
Related articles:
arXiv:1805.08174 [cs.CV] (Published 2018-05-21)
Reproducibility Report for "Learning To Count Objects In Natural Images For Visual Question Answering"
arXiv:1904.12690 [cs.CV] (Published 2019-04-26)
Capturing human categorization of natural images at scale by combining deep networks and cognitive models
arXiv:1812.07059 [cs.CV] (Published 2018-12-06)
Simultaneous Recognition of Horizontal and Vertical Text in Natural Images