Efficient Similarity Search over Vector Sets

Report ID: 
Hailing Yu, Wei Niu, Divyakant Agrawal, Amr El Abbadi, and Ambuj K Singh
2004-05-01 05:00:00


Similarity search in applications such as multimedia databases has been gaining a lot of attention in recent years. Due to the curse of dimensionality, it is hardto improve the query cost of similarity search in a high dimensional space. The problem is getting even worse when objects are represented using sets of local feature vectors, since the distance measurement among vector sets is the minimum matching distance. In the minimum matching distance, the vectors from two sets are paired to minimize the sum of the distance of all pairs while the distance measurements over single feature vectors do not need the process of pairing. In this paper, we extend the minimum matching distance to the Euclidean minimum matching distance and the Manhattan minimum matching distance since Euclidean and Manhattan distances are commonly used in similarity search over single feature vectors. Then we propose a novel filtering technique to reduce the query cost of similarity search on both Euclidean and Manhattan minimum matching distances. The experimental results show that our technique reduces the query cost significantly.