arXiv:2103.01828 Abstract | arXiv Analytics

arXiv:2103.01828 [cs.LG]

Factoring out prior knowledge from low-dimensional embeddings

Edith Heiter, Jonas Fischer, Jilles Vreeken

Published 2021-03-02, Version 1

Low-dimensional embedding techniques such as tSNE and UMAP allow visualizing high-dimensional data and thereby facilitate the discovery of interesting structure. Although they are widely used, they visualize data as is, rather than in light of the background knowledge we have about the data. What we already know, however, strongly determines what is novel and hence interesting. In this paper we propose two methods for factoring out prior knowledge, given in the form of distance matrices, from low-dimensional embeddings. To factor out prior knowledge from tSNE embeddings, we propose JEDI, which adapts the tSNE objective in a principled way using Jensen-Shannon divergence. To factor out prior knowledge from any downstream embedding approach, we propose CONFETTI, which operates directly on the input distance matrices. Extensive experiments on both synthetic and real-world data show that both methods work well, providing embeddings that exhibit meaningful structure that would otherwise remain hidden.

Comments: 27 pages, 17 figures

Categories: cs.LG, stat.ML

Keywords: prior knowledge, input distance matrices, real world data, visualizing high-dimensional data, remain hidden

Related articles:

arXiv:1901.07469 [cs.LG] (Published 2019-01-09)

Estimating Buildings' Parameters over Time Including Prior Knowledge

Nilavra Pathak, James Foulds, Nirmalya Roy, Nilanjan Banerjee, Ryan Robucci

arXiv:2305.00987 [cs.LG] (Published 2023-05-01)

A novel algorithm can generate data to train machine learning models in conditions of extreme scarcity of real world data

Olivier Niel

arXiv:2211.07549 [cs.LG] (Published 2022-11-14)

Phenotype Detection in Real World Data via Online MixEHR Algorithm

Ying Xu, Anna Decker, Jacob Oppenheim, Romane Gauriau
