
arXiv:1312.4986 [cs.LG]

A Comparative Evaluation of Curriculum Learning with Filtering and Boosting

Published 2013-12-17, Version 1

Not all instances in a data set are equally beneficial for inferring a model of the data. Some instances (such as outliers) are detrimental to inferring a model of the data. Several machine learning techniques, such as curriculum learning, filtering, and boosting, treat instances in a data set differently during training. However, an automated method for determining how beneficial an instance is for inferring a model of the data does not exist. In this paper, we present an automated method that orders the instances in a data set by complexity based on their likelihood of being misclassified (instance hardness). The underlying assumption of this method is that instances with a high likelihood of being misclassified represent more complex concepts in a data set. Ordering the instances in a data set allows a learning algorithm to focus on the most beneficial instances and ignore the detrimental ones. We compare the use of this ordering in curriculum learning, filtering, and boosting. We find that ordering the instances significantly increases classification accuracy and that filtering has the largest impact on classification accuracy. On a set of 52 data sets, ordering the instances increases the average accuracy from 81% to 84%.
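The idea of ordering instances by their likelihood of being misclassified can be illustrated with a minimal Python sketch. This is not the paper's procedure: the leave-one-out k-NN "learners", the choice of `ks`, the toy one-dimensional data, and the 0.5 filtering threshold are all assumptions made for illustration. Hardness is approximated here as the fraction of learners that misclassify an instance when it is held out.

```python
from collections import Counter


def knn_predict(train, query, k):
    """Majority vote among the k training points nearest to `query`."""
    nearest = sorted(
        train,
        key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def instance_hardness(data, ks=(1, 3, 5)):
    """Fraction of leave-one-out k-NN learners (one per k) that
    misclassify each instance. A stand-in for instance hardness."""
    scores = []
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]  # hold the instance out
        wrong = sum(knn_predict(rest, x, k) != y for k in ks)
        scores.append(wrong / len(ks))
    return scores


# Toy data (hypothetical): two well-separated classes plus one
# mislabeled point sitting inside class 0.
data = [
    ((0.0,), 0), ((0.1,), 0), ((0.2,), 0), ((0.3,), 0),
    ((1.0,), 1), ((1.1,), 1), ((1.2,), 1), ((1.3,), 1),
    ((0.15,), 1),
]

hardness = instance_hardness(data)

# Curriculum-style ordering: easiest instances first.
curriculum = [inst for _, inst in sorted(zip(hardness, data), key=lambda t: t[0])]

# Filtering: drop instances that most learners misclassify
# (threshold of 0.5 is an assumed value).
filtered = [inst for h, inst in zip(hardness, data) if h < 0.5]
```

In this toy example the mislabeled point receives the highest hardness score, so it is placed last in the curriculum ordering and is removed by the filter. A boosting-style use would instead turn the scores into instance weights; that variant is not shown here.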

Comments: 19 pages, 2 figures, 6 tables

Categories: cs.LG

Keywords: data set, curriculum learning, comparative evaluation, machine learning techniques treat instances, instances significantly increases classification accuracy

Related articles:

arXiv:2101.10427 [cs.LG] (Published 2021-01-25)

Finding hidden-feature depending laws inside a data set and classifying it using Neural Network

Thilo Moshagen, Nihal Acharya Adde, Ajay Navilarekal Rajgopal

arXiv:1706.05123 [cs.LG] (Published 2017-06-16)

Deriving Compact Laws Based on Algebraic Formulation of a Data Set

Wenqing Xu, Mark Stalzer

arXiv:2311.13326 [cs.LG] (Published 2023-11-22)

Curriculum Learning and Imitation Learning for Model-free Control on Financial Time-series