Appeared in: The Third Workshop on Reaching Efficiency in Neural Information Retrieval (ReNeuIR@SIGIR 2024)
Abstract:
Large Language Models (LLMs) have significantly impacted many facets of natural language processing and information retrieval. Unlike previous encoder-based approaches, these generative models have an enlarged context window that allows multiple documents to be ranked at once, an approach commonly called list-wise ranking. However, there are still limits to the number of documents that can be ranked in a single inference call, which has led to the broad adoption of a sliding window approach to identify the k most relevant items in a ranked list. We argue that the sliding window approach is not well-suited for list-wise re-ranking because it (1) cannot be parallelized in its current form, (2) leads to redundant computation, repeatedly re-scoring the best set of documents as it works its way up the initial ranking, and (3) by taking a bottom-up approach, prioritizes the lowest-ranked documents for scoring rather than the highest-ranked ones. Motivated by these shortcomings and by an initial study showing that list-wise rankers are biased towards relevant documents at the start of their context window, we propose a novel algorithm that partitions a ranking to depth k and processes documents top-down. Unlike sliding window approaches, our algorithm is inherently parallelizable due to its use of a pivot element, which can be compared against documents down to an arbitrary depth concurrently. In doing so, we reduce the expected number of inference calls by around 33% when ranking at depth 100, while matching the performance of prior approaches across multiple strong re-rankers.
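The abstract describes the algorithm only at a high level, so the following is a minimal Python sketch of the idea as stated there, not the paper's exact procedure. The names listwise_rank, WINDOW, and K are hypothetical stand-ins: listwise_rank mocks a single list-wise LLM inference over at most WINDOW documents, and the partition sizes and the final merge step are assumptions.

# Sketch of top-down, pivot-based partitioning for list-wise re-ranking,
# inferred from the abstract above. Not the paper's exact algorithm.
from concurrent.futures import ThreadPoolExecutor

WINDOW = 20  # documents per inference call (assumed; must be >= K)
K = 10       # target re-ranking depth (assumed)

def listwise_rank(docs):
    """Hypothetical stand-in for one list-wise LLM inference call:
    returns the given documents reordered by relevance (mocked here)."""
    return sorted(docs, key=lambda d: d["score"], reverse=True)

def top_down_rerank(ranking, k=K, window=WINDOW):
    # 1) Rank the head of the initial ordering once; the document at
    #    rank k becomes the pivot.
    head = listwise_rank(ranking[:window])
    top, pivot = head[:k], head[k - 1]

    # 2) Partition the tail; each partition is ranked together with the
    #    pivot in an independent inference call, so all partitions can
    #    be processed concurrently.
    tail = ranking[window:]
    parts = [tail[i:i + window - 1] for i in range(0, len(tail), window - 1)]

    def beats_pivot(part):
        ranked = listwise_rank(part + [pivot])
        # Keep only documents the model places above the pivot.
        return ranked[:ranked.index(pivot)]

    with ThreadPoolExecutor() as pool:
        promoted = [d for result in pool.map(beats_pivot, parts)
                    for d in result]

    # 3) Only promoted documents contend for the top-k; in practice this
    #    set may itself need partitioning if it exceeds the window size.
    return listwise_rank(top + promoted)[:k]

Because every partition contains the pivot, each inference call is independent of the others, which is what makes the comparisons parallelizable; the sliding window, by contrast, must finish one window before the next can begin.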
BibTeX:
@inproceedings{parry:reneuir2024-pivot,
  author    = {Parry, Andrew and MacAvaney, Sean and Ganguly, Debasis},
  title     = {Top-Down Partitioning for Efficient List-Wise Ranking},
  booktitle = {The Third Workshop on Reaching Efficiency in Neural Information Retrieval},
  year      = {2024},
  url       = {https://arxiv.org/abs/2405.14589}
}