Most previous work in the database literature has focused on indexing lower-dimensional data and on types of queries other than similarity queries. The k-d tree was one of the first structures proposed for indexing multidimensional data for nearest neighbor queries. Recently, this structure has been used in geographic information systems for queries that resemble similarity queries, and it might be useful for similarity indexing. Other methods, such as space-filling curves, linear quadtrees, and grid files, do not scale well to high dimensions but may be useful for medium-dimensional data.
The R-tree and its most successful variant, the R*-tree, have been used most often for indexing high-dimensional data in the database literature. However, since a range is stored for each dimension, the index requires more space, and searches take more time, as dimensionality increases. For this reason, higher-dimensional data is typically mapped to a lower-dimensional space before indexing in R-trees.
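To make the per-dimension cost concrete, the following is a minimal sketch of a bounding rectangle and of the distance test a search would apply to it. It is not taken from any R-tree implementation; the names `bounding_rectangle`, `min_dist_sq`, and `stored_values` are hypothetical and chosen only for illustration. The point is that both the stored bounds and the work per distance test grow linearly with the number of dimensions.

```python
from typing import List, Sequence, Tuple

# A rectangle is stored as one lower and one upper bound per dimension.
Rect = Tuple[List[float], List[float]]


def bounding_rectangle(points: Sequence[Sequence[float]]) -> Rect:
    """Return the smallest axis-aligned rectangle containing all points."""
    dims = len(points[0])
    lows = [min(p[i] for p in points) for i in range(dims)]
    highs = [max(p[i] for p in points) for i in range(dims)]
    return lows, highs


def min_dist_sq(query: Sequence[float], rect: Rect) -> float:
    """Squared distance from a query point to the nearest point of rect.

    Every dimension must be examined, so the cost is linear in the
    dimensionality of the feature vectors.
    """
    lows, highs = rect
    total = 0.0
    for q, lo, hi in zip(query, lows, highs):
        if q < lo:
            total += (lo - q) ** 2
        elif q > hi:
            total += (q - hi) ** 2
        # If lo <= q <= hi, this dimension contributes nothing.
    return total


def stored_values(rect: Rect) -> int:
    """Number of coordinates stored for one rectangle (two per dimension)."""
    return len(rect[0]) + len(rect[1])
```

Under these assumptions, a rectangle over 64-dimensional feature vectors stores 128 bounds per index entry, which is one way to see why mapping to a lower-dimensional space before indexing is attractive.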
The TV-tree is the only method in the database literature thus far that has been proposed specifically for indexing high-dimensional data. Performance comparisons clearly show that the TV-tree can be much more efficient than the R*-tree. However, the improved performance depends on two assumptions. The first assumption is that the dimensions of the feature vectors are ordered by “importance.” The second assumption is that sets of feature vectors in the dataset will tend to match exactly on some dimensions, especially on the first few “important” dimensions.
The first assumption is reasonable (if not desirable) since an appropriate transform may be used. The second assumption was not explicitly stated in the paper, but a careful analysis of the TV-tree algorithms reveals that their performance improvement depends upon it. In some applications, the original feature vectors contain a small set of discrete quantities, so the second assumption does hold.
Unfortunately, this second assumption will normally not be true in visual information systems or in many other applications. Features in these applications are typically real-valued, so the chances of an exact match on any dimension are negligible. In this case, the TV-tree reduces to an index on only the first few dimensions. Small changes in the proposed algorithms should allow the TV-tree to be a modest improvement over the R*-tree in these applications. However, in this paper, we will refer to the R-tree (and its variants) as the best previously known structure for similarity indexing because it has proven itself in more similarity indexing applications.
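The difference between discrete and real-valued features can be illustrated with a small experiment. This is only a sketch on synthetic data, not a result from the paper: the helper `exact_match_fraction`, the sample size, and the choice of four discrete levels are all assumptions made for illustration.

```python
import random
from itertools import combinations
from typing import Sequence


def exact_match_fraction(vectors: Sequence[Sequence[float]], dim: int) -> float:
    """Fraction of vector pairs whose values on dimension dim are equal."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a[dim] == b[dim])
    return matches / len(pairs)


rng = random.Random(0)  # fixed seed so the synthetic data is reproducible

# Hypothetical discrete features: each value is one of four levels.
discrete = [[rng.randrange(4) for _ in range(3)] for _ in range(200)]

# Hypothetical real-valued features drawn uniformly from [0, 1).
real_valued = [[rng.random() for _ in range(3)] for _ in range(200)]

print("discrete:", exact_match_fraction(discrete, 0))
print("real-valued:", exact_match_fraction(real_valued, 0))
```

With only a few discrete levels, a sizable fraction of pairs agree on the first dimension, whereas independently drawn floating-point values are very unlikely to coincide exactly, which is the situation described above for visual features.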
There is also related work outside of the database literature. In the information retrieval literature, work on cluster files has proposed structures similar to the SS-tree. In the image database community, a static indexing structure based on Kohonen nets has been suggested. There is also related work in the computational geometry and vector quantization literature.