arXiv:cs/0703125 [cs.LG]

Intrinsic dimension of a dataset: what properties does one expect?

Vladimir Pestov

Published 2007-03-25, Version 1

We propose an axiomatic approach to the concept of an intrinsic dimension of a dataset, based on a viewpoint of geometry of high-dimensional structures. Our first axiom postulates that high values of dimension be indicative of the presence of the curse of dimensionality (in a certain precise mathematical sense). The second axiom requires the dimension to depend smoothly on a distance between datasets (so that the dimension of a dataset and that of an approximating principal manifold would be close to each other). The third axiom is a normalization condition: the dimension of the Euclidean $n$-sphere $\mathbb{S}^n$ is $\Theta(n)$. We give an example of a dimension function satisfying our axioms, even though it is in general computationally unfeasible, and discuss a computationally cheap function satisfying most but not all of our axioms (the "intrinsic dimensionality" of Chávez et al.).
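
The computationally cheap function mentioned at the end of the abstract can be illustrated with a short sketch. It assumes the usual definition from Chávez et al., which the abstract itself does not spell out: $\mu^2 / (2\sigma^2)$, where $\mu$ and $\sigma^2$ are the mean and variance of the pairwise distances in the dataset. The function names `sample_sphere` and `intrinsic_dimensionality`, the sample size, and the choice of spheres are illustrative assumptions, not taken from the paper.

```python
import math
import random
import statistics


def sample_sphere(n, rng):
    """Draw one point uniformly from the unit sphere S^n in R^(n+1).

    Hypothetical helper: normalizes a standard Gaussian vector.
    """
    v = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def intrinsic_dimensionality(points):
    """Estimate mu^2 / (2 * sigma^2) over all pairwise Euclidean distances.

    Assumed definition (Chavez et al.); not stated in the abstract.
    """
    dists = [
        math.dist(p, q)
        for i, p in enumerate(points)
        for q in points[i + 1:]
    ]
    mu = statistics.fmean(dists)
    var = statistics.pvariance(dists, mu)
    return mu * mu / (2.0 * var)


rng = random.Random(0)  # fixed seed so runs are repeatable
estimates = {
    n: intrinsic_dimensionality([sample_sphere(n, rng) for _ in range(200)])
    for n in (2, 8, 32)
}
for n, rho in estimates.items():
    print(f"S^{n}: estimated intrinsic dimensionality {rho:.1f}")
```

On samples from $\mathbb{S}^n$ the estimate should grow with $n$, which is the behaviour the normalization axiom asks for. The sketch says nothing about the other two axioms.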

Comments: 6 pages, 6 figures, 1 table, LaTeX with IEEE macros, final submission to Proceedings of the 22nd IJCNN (Orlando, FL, August 12-17, 2007)
Journal: Proceedings of the 20th International Joint Conference on Neural Networks (IJCNN'2007), Orlando, Florida (Aug. 12–17, 2007), pp. 1775–1780.
Categories: cs.LG
Related articles:
arXiv:2212.14351 [cs.LG] (Published 2022-12-29)
Properties of Group Fairness Metrics for Rankings
arXiv:2407.13594 [cs.LG] (Published 2024-07-18)
Mechanistically Interpreting a Transformer-based 2-SAT Solver: An Axiomatic Approach
arXiv:2211.05667 [cs.LG] (Published 2022-11-10)
Does the explanation satisfy your needs?: A unified view of properties of explanations