Options
Top-K query retrieval of combinations with sum-of-subsets ranking
Date Issued
01-01-2014
Author(s)
Majumder, Subhashis
Sanyal, Biswajit
Gupta, Prosenjit
Sinha, Soumik
Pande, Shiladitya
Hon, Wing Kai
Abstract
Top-k query processing is an important building block for ranked retrieval, with applications ranging from text and data integration to distributed aggregation of network logs and sensor data. Top-k queries return a ranked set of the k best data objects selected on the basis of the ranks (scores) of the objects, assigned by any ranking (scoring) function. In this paper, we consider a problem on generation of combinations. Given a set S of n real numbers and an integer r ≤ n, we consider the (Formula Presented) different r-combinations of the elements of S. Let all these (Formula Presented) combinations be indicated by the set (Formula Presented). From this set C, given any positive integer (Formula Presented) our goal is to generate the k best combinations (top-k combinations) ranked on the basis of some aggregation function F. We consider Summation as the aggregation function. This calculates the sum of all the r real numbers in any combination Ci, and the ranking criterion is that a combination Ci is ranked higher than a combination Cj, if the sum of the constituent numbers of Ci is larger than that of Cj. For any given n and r (≤ n), we build a metadata structure G (basically a DAG), even before S and k are known. We can later use G to report the top-k combinations efficiently when S is available. We further present an alternative incremental method, where we generate only the required portions of G on demand, instead of constructing the whole G explicitly. It helps us to save the time and space overhead of the preprocessing phase.
Volume
8881