arXiv:2212.07699 [cs.CL]

Retrieval-based Disentanglement with Distant Supervision

Jiawei Zhou, Xiaoguang Li, Lifeng Shang, Xin Jiang, Qun Liu, Lei Chen

Published 2022-12-15Version 1

Disentangled representation learning remains challenging because ground-truth factors of variation do not naturally exist. To address this, we present Vocabulary Disentanglement Retrieval (VDR), a simple yet effective retrieval-based disentanglement framework that leverages natural language as distant supervision. Our approach is built upon the widely used bi-encoder architecture with disentanglement heads and is trained on data-text pairs that are readily available on the web or in existing datasets. This makes our approach task- and modality-agnostic, with potential for a wide range of downstream applications. We conduct experiments on 16 datasets in both text-to-text and cross-modal scenarios and evaluate VDR in a zero-shot setting. With the incorporation of disentanglement heads and a minor increase in parameters, VDR achieves significant improvements over the base retriever it is built upon: 9% higher NDCG@10 in zero-shot text-to-text retrieval and an average of 13% higher recall in cross-modal retrieval. Compared to other baselines, VDR outperforms them on most tasks while also improving explainability and efficiency.
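To make the architecture concrete, here is a minimal toy sketch of a bi-encoder with a vocabulary-space disentanglement head, in the spirit the abstract describes. All names, dimensions, and the mean-pooling encoder are hypothetical illustrations, not the paper's actual implementation: a dense encoding is projected onto a vocabulary-sized space and rectified, so each surviving dimension can be read as a term weight, and retrieval scores become dot products between these interpretable sparse vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB_SIZE, HIDDEN = 1000, 64  # toy sizes, chosen for illustration

# Hypothetical parameters: a token-embedding table standing in for a dense
# encoder, and a linear head projecting hidden states onto vocabulary space.
embed_table = rng.normal(scale=0.1, size=(VOCAB_SIZE, HIDDEN))
head_weight = rng.normal(scale=0.1, size=(HIDDEN, VOCAB_SIZE))

def dense_encode(token_ids):
    # Toy dense encoder: mean-pool the token embeddings.
    return embed_table[token_ids].mean(axis=0)

def disentangle(dense_vec):
    # Disentanglement head: project onto the vocabulary space, then keep
    # only non-negative activations so each dimension reads as the weight
    # of one vocabulary entry (a sparse, explainable representation).
    logits = dense_vec @ head_weight
    return np.maximum(logits, 0.0)

def score(query_ids, doc_ids):
    # Bi-encoder retrieval score: both sides are encoded independently,
    # and relevance is the inner product of their vocabulary-space vectors.
    q = disentangle(dense_encode(query_ids))
    d = disentangle(dense_encode(doc_ids))
    return float(q @ d)

query = [12, 7, 99]
doc = [12, 7, 99, 404]
print(score(query, doc))
print(disentangle(dense_encode(query)).shape)
```

Because both representations are non-negative and indexed by vocabulary entries, scores decompose over individual terms, which is one way such a head can improve explainability and allow efficient inverted-index-style retrieval.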

Related articles:
arXiv:2109.04912 [cs.CL] (Published 2021-09-10)
ReasonBERT: Pre-trained to Reason with Distant Supervision
arXiv:2205.08770 [cs.CL] (Published 2022-05-18)
Relation Extraction with Weighted Contrastive Pre-training on Distant Supervision
arXiv:1505.03823 [cs.CL] (Published 2015-05-14)
Distant Supervision for Entity Linking