arXiv:2211.03035 [cs.LG]

Synthetic Data for Feature Selection

Firuz Kamalov, Hana Sulieman, Aswani Kumar Cherukuri

Published 2022-11-06, Version 1

Feature selection is an important and active field of research in machine learning and data science. In this paper, we propose a collection of synthetic datasets that can serve as a common reference point for evaluating feature selection algorithms. Because the relevant features of a synthetic dataset are known by construction, the selected features can be evaluated precisely, and the data parameters can be controlled for a comprehensive assessment. The proposed datasets are based on applications from electronics in order to mimic real-life scenarios. To illustrate their utility, we use one of the datasets to test several popular feature selection algorithms. The datasets are publicly available on GitHub and can be used by researchers to evaluate feature selection algorithms.
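The idea of evaluating selected features against a known ground truth can be illustrated with a minimal sketch. It does not reproduce the authors' datasets or the algorithms they tested: the Boolean target rule, the function names (`make_dataset`, `mutual_information`, `select_top_k`), and the mutual-information ranking are all assumptions chosen for illustration only.

```python
import math
import random
from collections import Counter


def make_dataset(n_samples=2000, n_features=10, seed=0):
    """Build a toy binary dataset (hypothetical, not one of the paper's datasets).

    The target depends only on features 0, 1 and 2; all other columns are noise.
    """
    rng = random.Random(seed)
    X = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
    # Assumed target rule: y = (x0 AND x1) OR x2
    y = [(row[0] & row[1]) | row[2] for row in X]
    relevant = {0, 1, 2}  # ground truth, known by construction
    return X, y, relevant


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) )
        mi += (count / n) * math.log2(count * n / (px[a] * py[b]))
    return mi


def select_top_k(X, y, k):
    """Rank features by mutual information with y and keep the k best."""
    n_features = len(X[0])
    scores = [
        mutual_information([row[j] for row in X], y) for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k]), scores


X, y, relevant = make_dataset()
selected, scores = select_top_k(X, y, k=len(relevant))

# Because the relevant set is known, precision and recall are exact.
hits = len(selected & relevant)
precision = hits / len(selected)
recall = hits / len(relevant)
print("selected:", sorted(selected))
print("precision:", precision, "recall:", recall)
```

Changing `n_samples`, `n_features`, or the target rule in this sketch is one way to see how control over the data parameters lets a selector be assessed under different conditions.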

Related articles:
arXiv:2203.17250 [cs.LG] (Published 2022-03-30)
Generation and Simulation of Synthetic Datasets with Copulas
arXiv:2502.04140 [cs.LG] (Published 2025-02-06)
Synthetic Datasets for Machine Learning on Spatio-Temporal Graphs using PDEs
arXiv:2406.06977 [cs.LG] (Published 2024-06-11)
Cross-domain-aware Worker Selection with Training for Crowdsourced Annotation