{ "id": "1802.03936", "version": "v1", "published": "2018-02-12T08:49:18.000Z", "updated": "2018-02-12T08:49:18.000Z", "title": "On the Needs for Rotations in Hypercubic Quantization Hashing", "authors": [ "Anne Morvan", "Antoine Souloumiac", "Krzysztof Choromanski", "Cédric Gouy-Pailler", "Jamal Atif" ], "categories": [ "cs.LG" ], "abstract": "The aim of this paper is to endow the well-known family of hypercubic quantization hashing methods with theoretical guarantees. In hypercubic quantization, applying a suitable (random or learned) rotation after dimensionality reduction has been experimentally shown to improve the accuracy of results in the nearest neighbors search problem. We prove in this paper that the use of these rotations is optimal under some mild assumptions: getting optimal binary sketches is equivalent to applying a rotation that uniformizes the diagonal of the covariance matrix between data points. Moreover, for two close points, the probability of having dissimilar binary sketches is upper bounded by a factor of the initial distance between the data points. Relaxing these assumptions, we obtain a general concentration result for random matrices. We also provide some experiments illustrating these theoretical points and compare a set of algorithms in both the batch and online settings.", "revisions": [ { "version": "v1", "updated": "2018-02-12T08:49:18.000Z" } ], "analyses": { "keywords": [ "data points", "nearest neighbors search problem", "hypercubic quantization hashing methods", "general concentration result", "getting optimal binary sketches" ], "note": { "typesetting": "TeX", "pages": 0, "language": "en", "license": "arXiv", "status": "editable" } } }