arXiv:1808.10406 [cs.LG]

Towards Reproducible Empirical Research in Meta-Learning

Adriano Rivolli, Luís P. F. Garcia, Carlos Soares, Joaquin Vanschoren, André C. P. L. F. de Carvalho

Published 2018-08-30, Version 1

Meta-learning is increasingly used to support the recommendation of machine learning algorithms and their configurations. Such recommendations are made based on meta-data, consisting of performance evaluations of algorithms on prior datasets, as well as characterizations of these datasets. These characterizations, also called meta-features, describe properties of the data which are predictive of the performance of machine learning algorithms trained on them. Unfortunately, despite being used in a large number of studies, meta-features are not uniformly described and computed, making many empirical studies irreproducible and hard to compare. This paper aims to remedy this by systematizing and standardizing data characterization measures used in meta-learning, and performing an in-depth analysis of their utility. Moreover, it presents MFE, a new tool for extracting meta-features from datasets and identifying more subtle reproducibility issues in the literature, proposing guidelines for data characterization that strengthen reproducible empirical research in meta-learning.
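To make the notion of a meta-feature concrete, the following is a minimal sketch that computes a few simple dataset characterizations in plain Python. It is not the MFE tool and does not reproduce its interface. The function name `simple_meta_features`, the choice of measures, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def simple_meta_features(X, y):
    """Compute a few illustrative meta-features of a labeled dataset.

    X: list of rows, each a list of attribute values (hypothetical layout)
    y: list of class labels, one per row
    """
    n_instances = len(X)
    n_attributes = len(X[0]) if X else 0
    class_counts = Counter(y)
    n_classes = len(class_counts)

    # Shannon entropy of the class distribution, in bits (log base 2).
    # The log base is exactly the kind of detail that must be stated
    # for a meta-feature to be reproducible.
    class_entropy = -sum(
        (count / n_instances) * math.log2(count / n_instances)
        for count in class_counts.values()
    )

    return {
        "n_instances": n_instances,
        "n_attributes": n_attributes,
        "n_classes": n_classes,
        "class_entropy": class_entropy,
    }


# Toy dataset: four instances, two attributes, two balanced classes.
X = [[5.1, 3.5], [4.9, 3.0], [6.2, 2.9], [5.9, 3.2]]
y = ["a", "a", "b", "b"]
features = simple_meta_features(X, y)
```

Even in this small sketch, unstated choices such as the logarithm base would change the reported value, which illustrates why uniform descriptions of meta-features matter when comparing studies.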

Categories: cs.LG, stat.ML

Keywords: meta-learning, machine learning algorithms, meta-features, standardizing data characterization measures, subtle reproducibility issues

Related articles:

arXiv:2007.12475 [cs.LG] (Published 2020-07-12)

Predicting and Mapping of Soil Organic Carbon Using Machine Learning Algorithms in Northern Iran

Mostafa Emadi, Ruhollah Taghizadeh-Mehrjardi, Ali Cherati, Majid Danesh, Amir Mosavi, Thomas Scholten

arXiv:1506.00852 [cs.LG] (Published 2015-06-02)

Peer Grading in a Course on Algorithms and Data Structures: Machine Learning Algorithms do not Improve over Simple Baselines

Mehdi S. M. Sajjadi, Morteza Alamgir, Ulrike von Luxburg

arXiv:2008.13690 [cs.LG] (Published 2020-08-31)

Evaluation of machine learning algorithms for Health and Wellness applications: a tutorial

Jussi Tohka, Mark van Gils