{ "id": "2003.03462", "version": "v1", "published": "2020-03-06T23:10:52.000Z", "updated": "2020-03-06T23:10:52.000Z", "title": "BasisVAE: Translation-invariant feature-level clustering with Variational Autoencoders", "authors": [ "Kaspar Märtens", "Christopher Yau" ], "comment": "Accepted to AISTATS 2020", "categories": [ "stat.ML", "cs.LG", "q-bio.GN" ], "abstract": "Variational Autoencoders (VAEs) provide a flexible and scalable framework for non-linear dimensionality reduction. However, in application domains such as genomics, where data sets are typically tabular and high-dimensional, a black-box approach to dimensionality reduction does not provide sufficient insight. Common data analysis workflows additionally use clustering techniques to identify groups of similar features. This usually leads to a two-stage process; however, it would be desirable to construct a joint modelling framework for simultaneous dimensionality reduction and clustering of features. In this paper, we propose to achieve this through the BasisVAE: a combination of the VAE and a probabilistic clustering prior, which lets us learn a one-hot basis function representation as part of the decoder network. Furthermore, for scenarios where not all features are aligned, we develop an extension to handle translation-invariant basis functions. We show how a collapsed variational inference scheme leads to scalable and efficient inference for BasisVAE, demonstrated on various toy examples as well as on single-cell gene expression data.", "revisions": [ { "version": "v1", "updated": "2020-03-06T23:10:52.000Z" } ], "analyses": { "keywords": [ "variational autoencoders", "translation-invariant feature-level clustering", "dimensionality reduction", "handle translation-invariant basis functions", "single-cell gene expression data" ], "note": { "typesetting": "TeX", "pages": 0, "language": "en", "license": "arXiv", "status": "editable" } } }