Indexing Multidimensional Data for Nearest Neighbor Queries

Updated: 16 November, 2024

Table of contents

  1. Introduction
  2. Indexing Multidimensional Data for Nearest Neighbor Queries
  3. Related Work
  4. Conclusion
  5. References

Introduction

Most previous work in the database literature has focused on indexing lower-dimensional data and on query types other than similarity queries. The k-d tree was one of the first structures proposed for indexing multidimensional data for nearest neighbor queries. More recently, it has been used in geographic information systems for queries that resemble similarity queries, so it might also be useful for similarity indexing. Other methods, such as space-filling curves, linear quadtrees, and grid files, do not scale well to high dimensions but may be useful for medium-dimensional data.
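To make the nearest neighbor idea concrete, the following is a minimal Python sketch of a k-d tree that splits on the median and cycles through the axes. It is an illustration only, not the structure from any particular system; the names `Node`, `build`, `sq_dist`, and `nearest` are hypothetical, and squared Euclidean distance is an assumed similarity measure.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    point: Point                    # point stored at this node
    axis: int                       # dimension this node splits on
    left: Optional["Node"] = None   # points with a smaller-or-equal coordinate
    right: Optional["Node"] = None  # points with a larger-or-equal coordinate


def build(points: Sequence[Point], depth: int = 0) -> Optional[Node]:
    """Build a k-d tree by splitting on the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Node(
        ordered[mid],
        axis,
        build(ordered[:mid], depth + 1),
        build(ordered[mid + 1:], depth + 1),
    )


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node: Optional[Node], query: Point,
            best: Optional[Point] = None) -> Optional[Point]:
    """Return the stored point closest to the query."""
    if node is None:
        return best
    if best is None or sq_dist(node.point, query) < sq_dist(best, query):
        best = node.point
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best match so far.
    if diff ** 2 < sq_dist(best, query):
        best = nearest(far, query, best)
    return best


points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build(points)
print(nearest(tree, (9, 2)))  # prints (8, 1)
```

The pruning test is the part that degrades as dimensionality grows: with many dimensions the splitting plane is rarely farther away than the current best match, so most of the tree ends up being visited.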

Indexing Multidimensional Data for Nearest Neighbor Queries

The R-tree and its most successful variant, the R*-tree, are the structures most often used for indexing high-dimensional data in the database literature. However, because each index entry stores a range on every dimension, entries grow with the dimensionality, so the index takes more space and more time to search as the number of dimensions increases. For this reason, higher-dimensional data is typically mapped to a lower-dimensional space before it is indexed in an R-tree.
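The cost of storing a range on every dimension can be sketched in a few lines of Python. The helper names `min_sq_dist` and `entries_per_page` are hypothetical, and the 4096-byte page, 4-byte coordinates, and 4-byte child pointer are assumed values chosen only for illustration; they are not taken from the R-tree or R*-tree papers.

```python
from typing import Sequence


def min_sq_dist(query: Sequence[float], low: Sequence[float],
                high: Sequence[float]) -> float:
    """Squared distance from the query to the nearest point of the box [low, high].

    A nearest neighbor search can skip a subtree whose bounding box is
    farther away than the best match found so far.
    """
    total = 0.0
    for q, lo, hi in zip(query, low, high):
        if q < lo:
            total += (lo - q) ** 2
        elif q > hi:
            total += (q - hi) ** 2
    return total


def entries_per_page(dims: int, page_bytes: int = 4096,
                     coord_bytes: int = 4, pointer_bytes: int = 4) -> int:
    """Bounding-box entries that fit on one page (assumed sizes).

    Each entry holds a low and a high coordinate per dimension plus one child pointer.
    """
    return page_bytes // (2 * dims * coord_bytes + pointer_bytes)


print(min_sq_dist((0.0, 0.0), (1.0, 1.0), (2.0, 2.0)))  # prints 2.0
print(entries_per_page(2))   # prints 204
print(entries_per_page(64))  # prints 7
```

Under these assumed sizes, the fan-out drops from 204 entries per page in 2 dimensions to 7 in 64 dimensions, which is one way to see why the tree becomes taller and slower to search as dimensionality rises.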

The TV-tree is the only method in the database literature thus far that has been proposed specifically for indexing high-dimensional data. Performance comparisons show that the TV-tree can be much more efficient than the R*-tree. However, the improved performance depends on two assumptions. The first assumption is that the dimensions of the feature vectors are ordered by “importance.” The second assumption is that sets of feature vectors in the dataset tend to match exactly on some dimensions, especially on the first few “important” dimensions.

The first assumption is reasonable (if not desirable) since an appropriate transform may be used. The second assumption was not explicitly stated in the TV-tree paper, but a careful analysis of its algorithms reveals that the performance improvement depends upon it. In some applications, the original feature vectors contain a small set of discrete quantities, so the second assumption does hold. Unfortunately, it will normally not hold in visual information systems or in many other applications. Features in these applications are typically real-valued, so the chance of two vectors matching exactly on any dimension is negligible. In this case, the TV-tree reduces to an index on only the first few dimensions. Small changes to the proposed algorithms should allow the TV-tree to be a modest improvement over the R*-tree in these applications. However, in this paper, we will refer to the R-tree (and its variants) as the best previously known structure for similarity indexing because it has proven itself in more similarity indexing applications (Beckmann et al., 1990; Guttman, 1984).
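The difference between discrete and real-valued features can be checked with a small experiment. The Python sketch below is not part of the TV-tree algorithms; it only measures how often vectors share an exact prefix. The function name `shared_prefix_fraction`, the synthetic data, the seed, and the sizes (1000 vectors, 8 dimensions, 4 discrete levels) are all assumptions made for illustration.

```python
import random
from collections import Counter
from typing import Sequence


def shared_prefix_fraction(vectors: Sequence[Sequence[float]],
                           prefix_len: int) -> float:
    """Fraction of vectors whose first prefix_len coordinates exactly
    equal those of at least one other vector in the set."""
    counts = Counter(tuple(v[:prefix_len]) for v in vectors)
    shared = sum(c for c in counts.values() if c > 1)
    return shared / len(vectors)


rng = random.Random(0)  # fixed seed so the run is repeatable
n, dims = 1000, 8

# Hypothetical features drawn from a small set of discrete values.
discrete = [[rng.randrange(4) for _ in range(dims)] for _ in range(n)]
# Hypothetical real-valued features, as in visual information systems.
real_valued = [[rng.random() for _ in range(dims)] for _ in range(n)]

print(shared_prefix_fraction(discrete, 2))
print(shared_prefix_fraction(real_valued, 2))
```

With only 16 possible two-value prefixes, practically every discrete vector shares its prefix with another one, while exact prefix matches among uniformly random floating-point values are vanishingly rare. That is the situation in which an index relying on exact matches on the leading dimensions has nothing to exploit.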

Related Work

There is also related work outside of the database literature. In the information retrieval literature, work on cluster files has proposed structures similar to the SS-tree. In the image database community, a static indexing structure based on Kohonen nets has been suggested. Further related work appears in the computational geometry and vector quantization literature. These fields offer insights and potential improvements for database indexing, which suggests that interdisciplinary approaches may further advance similarity indexing (Kohonen, 1990; Jain & Dubes, 1988).

Conclusion

The exploration of various indexing methods for multidimensional data highlights the complexity and importance of finding efficient solutions for similarity queries. While methods like the TV-tree show promise, their applicability is limited by certain assumptions. As database demands evolve, continuous research and innovation will be crucial in developing robust and adaptable indexing structures.

References

  • Beckmann, N., Kriegel, H.-P., Schneider, R., & Seeger, B. (1990). The R*-tree: An efficient and robust access method for points and rectangles. Proceedings of the 1990 ACM SIGMOD International Conference on Management of Data, 322-331.
  • Guttman, A. (1984). R-trees: A dynamic index structure for spatial searching. Proceedings of the 1984 ACM SIGMOD International Conference on Management of Data, 47-57.
  • Jain, A. K., & Dubes, R. C. (1988). Algorithms for clustering data. Prentice-Hall.
  • Kohonen, T. (1990). The self-organizing map. Proceedings of the IEEE, 78(9), 1464-1480.
This essay was reviewed by
Alex Wood

Cite this Essay

Indexing Multidimensional Data for Nearest Neighbor Queries. (2019, April 10). GradesFixer. Retrieved November 19, 2024, from https://gradesfixer.com/free-essay-examples/indexing-multidimensional-data-for-nearest-neighbor-queries/