Please use this identifier to cite or link to this item:
https://open.uns.ac.rs/handle/123456789/10235
Title: The role of hubness in clustering high-dimensional data
Authors: Tomašev, N.; Radovanović, M.; Mladenić, D.; Ivanović, Mirjana
Issue Date: 8-Jun-2011
Journal: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
URI: https://open.uns.ac.rs/handle/123456789/10235
ISBN: 9783642208409
ISSN: 03029743
DOI: 10.1007/978-3-642-20841-6-16
Abstract: High-dimensional data arise naturally in many domains, and have regularly presented a great challenge for traditional data-mining techniques, both in terms of effectiveness and efficiency. Clustering becomes difficult due to the increasing sparsity of such data, as well as the increasing difficulty in distinguishing distances between data points. In this paper we take a novel perspective on the problem of clustering high-dimensional data. Instead of attempting to avoid the curse of dimensionality by observing a lower-dimensional feature subspace, we embrace dimensionality by taking advantage of some inherently high-dimensional phenomena. More specifically, we show that hubness, i.e., the tendency of high-dimensional data to contain points (hubs) that frequently occur in k-nearest neighbor lists of other points, can be successfully exploited in clustering. We validate our hypothesis by proposing several hubness-based clustering algorithms and testing them on high-dimensional data. Experimental results demonstrate good performance of our algorithms in multiple settings, particularly in the presence of large quantities of noise. © 2011 Springer-Verlag.
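The abstract defines hubness through how often a point occurs in the k-nearest neighbor lists of other points. As a rough illustration of that definition only (not the clustering algorithms proposed in the paper), the sketch below counts those occurrences by brute force. The function name `k_occurrences`, the use of Euclidean distance, and the NumPy-based implementation are assumptions made for this example and do not come from the record.

```python
import numpy as np

def k_occurrences(X, k):
    """Count, for each point, how many other points list it among their k nearest neighbors.

    Hypothetical helper for illustration; assumes Euclidean distance.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")
    # Pairwise squared Euclidean distances (brute force, fine for small n).
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.sum(diff * diff, axis=-1)
    # A point must not count as its own neighbor.
    np.fill_diagonal(d2, np.inf)
    # Indices of the k nearest neighbors of every point (order within the k is arbitrary).
    knn = np.argpartition(d2, k - 1, axis=1)[:, :k]
    # Number of k-NN lists in which each point appears.
    return np.bincount(knn.ravel(), minlength=n)

# Small usage example on synthetic data (assumed, not from the paper).
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 50))
counts = k_occurrences(data, k=10)
hub_candidates = np.argsort(counts)[::-1][:5]  # points with the largest counts
```

Points with unusually large counts are the hub candidates in the sense of the abstract. The counts always sum to the number of points times k, which gives a quick sanity check on the output.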
Appears in Collections: PMF Publikacije/Publications