arXiv:2202.02145 [cs.LG]

Generative Modeling of Complex Data

Luca Canale, Nicolas Grislain, Grégoire Lothe, Johan Leduc

Published 2022-02-04, Version 1

In recent years, several models have improved the capacity to generate synthetic tabular datasets. However, such models focus on synthesizing simple columnar tables and are not usable on real-life data with complex structures. This paper puts forward a generic framework to synthesize more complex data structures with composite and nested types. It then proposes one practical implementation, built with causal transformers, for structs (mappings of types) and lists (repeated instances of a type). The results on standard benchmark datasets show that this implementation consistently outperforms current state-of-the-art models in terms of both machine learning utility and statistical similarity. Moreover, it shows very strong results on two complex hierarchical datasets with multiple levels of nesting and sparse data that were previously out of reach.
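
To make the notions of struct and list types concrete, here is a minimal sketch of how a nested record could be described by composite types and flattened into a sequence that an autoregressive model might consume. The class names (`Scalar`, `Struct`, `ListOf`), the `linearize` function, and the `<list_start>` / `<list_end>` markers are all hypothetical and chosen for illustration; the abstract does not specify the paper's actual encoding, so this should not be read as the authors' method.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple, Union


@dataclass
class Scalar:
    """Leaf type (hypothetical), e.g. a number or a category."""
    name: str


@dataclass
class Struct:
    """Composite type (hypothetical): maps field names to types."""
    fields: Dict[str, "DataType"]


@dataclass
class ListOf:
    """Composite type (hypothetical): repeated instances of one type."""
    item: "DataType"


DataType = Union[Scalar, Struct, ListOf]


def linearize(value: Any, dtype: DataType, path: str = "") -> List[Tuple[str, Any]]:
    """Flatten a nested value into (path, value) pairs, depth first.

    The start/end markers around lists are an assumption made here so that
    variable-length lists remain recoverable from the flat sequence.
    """
    if isinstance(dtype, Scalar):
        return [(path, value)]
    if isinstance(dtype, Struct):
        tokens: List[Tuple[str, Any]] = []
        for field_name, field_type in dtype.fields.items():
            child_path = f"{path}.{field_name}" if path else field_name
            tokens.extend(linearize(value[field_name], field_type, child_path))
        return tokens
    if isinstance(dtype, ListOf):
        tokens = [(path, "<list_start>")]
        for item in value:
            tokens.extend(linearize(item, dtype.item, path + "[]"))
        tokens.append((path, "<list_end>"))
        return tokens
    raise TypeError(f"unsupported type description: {dtype!r}")


# A made-up schema and record: one scalar field and one list of structs.
schema = Struct({
    "age": Scalar("int"),
    "visits": ListOf(Struct({"cost": Scalar("float")})),
})
record = {"age": 41, "visits": [{"cost": 12.5}, {"cost": 3.0}]}

tokens = linearize(record, schema)
for token in tokens:
    print(token)
```

A flat sequence of this kind is one plausible input for a causal (left-to-right) model, since each element can be predicted from those before it; how the paper actually orders and encodes elements is not stated in the abstract.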

Related articles:
arXiv:2210.02747 [cs.LG] (Published 2022-10-06)
Flow Matching for Generative Modeling
arXiv:1804.03782 [cs.LG] (Published 2018-04-11)
CoT: Cooperative Training for Generative Modeling
arXiv:2305.11567 [cs.LG] (Published 2023-05-19)
TSGM: A Flexible Framework for Generative Modeling of Synthetic Time Series