arXiv Analytics

arXiv:1708.05368 [physics.soc-ph]

Structures of Knowledge from Wikipedia Networks

Maxime Gabella

Published 2017-08-17, Version 1

Knowledge is useless without structure. While the classification of knowledge has been an enduring philosophical enterprise, it recently found applications in computer science, notably for artificial intelligence. The availability of large databases allowed for complex ontologies to be built automatically, for example by extracting structured content from Wikipedia. However, this approach is subject to manual categorization decisions made by online editors. Here we show that an implicit classification system emerges spontaneously on Wikipedia. We study the network of first links between articles, and find that it centers on a core cycle involving concepts of fundamental classifying importance. We argue that this structure is rooted in cultural history. For European languages, articles like Philosophy and Science are central, whereas Human and Earth dominate for East Asian languages. This reflects the differences between ancient Greek thought and Chinese tradition. Our results reveal the powerful influence of culture on the intrinsic architecture of complex data sets.
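
To make the idea of a first-link network concrete, here is a minimal sketch, not the author's code. It assumes each article has been reduced to a single outgoing "first link", so that following links from any starting article must either dead-end or fall into a cycle. The function name `find_cycle` and the toy mapping with placeholder article names "A" to "E" are hypothetical and do not come from Wikipedia data.

```python
def find_cycle(first_link, start):
    """Follow first links from `start` and return the cycle reached, if any.

    `first_link` maps an article name to the article its first link points to.
    Returns the list of articles on the cycle, or [] if the chain dead-ends.
    """
    position = {}  # article -> index at which it was first visited
    path = []      # articles in visiting order
    node = start
    while node in first_link and node not in position:
        position[node] = len(path)
        path.append(node)
        node = first_link[node]
    if node not in position:
        # The chain reached an article with no recorded first link.
        return []
    # Everything from the first visit of `node` onward lies on the cycle.
    return path[position[node]:]


# Hypothetical toy network (placeholder names, not real articles).
toy_first_link = {"A": "B", "B": "C", "C": "D", "D": "B", "E": "A"}

cycle = find_cycle(toy_first_link, "E")
print(cycle)
```

In this toy network every starting article ends up on the same cycle, which is the kind of structure the abstract calls a core cycle. On real data one would also need a rule for what counts as an article's first link, which this sketch leaves out.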

Related articles:
arXiv:1402.5839 [physics.soc-ph] (Published 2014-02-24)
Poisson statistics of PageRank probabilities of Twitter and Wikipedia networks