arXiv Analytics


arXiv:1604.06646 [cs.CV]

Synthetic Data for Text Localisation in Natural Images

Ankush Gupta, Andrea Vedaldi, Andrew Zisserman

Published 2016-04-22 (Version 1)

In this paper we introduce a new method for text detection in natural images. The method comprises two contributions. First, a fast and scalable engine generates synthetic images of text in clutter: it overlays synthetic text onto existing background images in a natural way, accounting for the local 3D scene geometry. Second, we use the synthetic images to train a Fully-Convolutional Regression Network (FCRN), which efficiently performs text detection and bounding-box regression at all locations and multiple scales in an image. We discuss the relation of FCRN to the recently introduced YOLO detector, as well as to other end-to-end object detection systems based on deep learning. The resulting detection network significantly outperforms current methods for text detection in natural images, achieving an F-measure of 84.2% on the standard ICDAR 2013 benchmark. Furthermore, it can process 15 images per second on a GPU.
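The abstract's key idea is that the network densely predicts a confidence score and box parameters at every grid location, and detections are read off by thresholding. A minimal NumPy sketch of that decoding step is below; the 5-channel per-cell layout (confidence, centre offsets, width, height) and the `cell_size` stride are illustrative assumptions, not the paper's exact parameterisation, which also regresses box orientation.

```python
import numpy as np

def decode_dense_predictions(pred, conf_thresh=0.5, cell_size=16):
    """Decode an FCRN-style dense prediction grid into candidate boxes.

    pred: array of shape (H, W, 5), where each cell holds
    (confidence, dx, dy, w, h) -- assumed layout for illustration.
    (dx, dy) offset the box centre from the cell centre; (w, h) are
    the box width and height in pixels.
    Returns a list of (x1, y1, x2, y2, confidence) tuples.
    """
    H, W, _ = pred.shape
    boxes = []
    for i in range(H):
        for j in range(W):
            conf, dx, dy, w, h = pred[i, j]
            if conf < conf_thresh:
                continue  # suppress low-confidence cells
            # cell centre in image coordinates, shifted by the regressed offset
            cx = (j + 0.5) * cell_size + dx
            cy = (i + 0.5) * cell_size + dy
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2, conf))
    return boxes
```

In practice the surviving boxes would still be merged with non-maximum suppression; the sketch only shows how a fully-convolutional output maps back to image-space detections.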

Related articles:
arXiv:1705.00821 [cs.CV] (Published 2017-05-02)
Statistical learning of rational wavelet transform for natural images
arXiv:1812.07059 [cs.CV] (Published 2018-12-06)
Simultaneous Recognition of Horizontal and Vertical Text in Natural Images
arXiv:1412.6626 [cs.CV] (Published 2014-12-20)
The local low-dimensionality of natural images