Effective Adhoc Retrieval through Traversal of a Query-Document Graph

Type: long conference paper

Authors: Erlend Frayling, Sean MacAvaney, Craig Macdonald, Iadh Ounis

Appeared in: Proceedings of the 46th European Conference on Information Retrieval Research (ECIR 2024)

Links/IDs:
DOI: 10.1007/978-3-031-56063-7_6
DBLP: conf/ecir/FraylingMMO24
Google Scholar: 7wWfoDgAAAAJ:g5m5HwL7SMYC
Enlighten: 312855
smac.pub: ecir2024-reverted

Abstract:

Adhoc retrieval is the task of effectively retrieving information for an end-user's information need, usually expressed as a textual query. One of the most well-established retrieval frameworks is the two-stage retrieval pipeline, whereby an inexpensive retrieval algorithm retrieves a subset of candidate documents from a corpus, and a more sophisticated (but costly) model re-ranks these candidates. A notable limitation of this two-stage framework is that the second-stage re-ranking model can only re-order documents, and any relevant documents not retrieved from the corpus in the first stage are entirely lost to the second stage. A recently proposed Adaptive Re-Ranking technique has shown that extending the candidate pool by traversing a document similarity graph can overcome this recall problem. However, this traversal technique is agnostic to the user's query, which has the potential to waste compute resources by scoring documents that are not related to the query. In this work, we propose an alternative formulation of the document similarity graph. Rather than using document similarities, we propose a weighted bipartite graph that consists of both document nodes and query nodes. This overcomes the limitations of prior Adaptive Re-Ranking approaches because the bipartite graph can be navigated in a manner that explicitly acknowledges the original user query issued to the search pipeline. We evaluate the effectiveness of our proposed framework by experimenting with the TREC Deep Learning track in a standard adhoc retrieval setting. We find that our approach outperforms state-of-the-art two-stage re-ranking pipelines, improving the nDCG@10 metric by 5.8% on the DL19 test collection.
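
To make the idea of query-aware traversal concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a toy weighted bipartite graph stored as a dictionary from query-node text to weighted document neighbours, and it assumes a simple token-overlap gate to decide which query nodes are close enough to the user's query to follow. The names `expand_candidates`, `overlap`, `edges`, and `extra`, along with all the data, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def overlap(a, b):
    """Jaccard token overlap; an assumed stand-in for a real similarity model."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def expand_candidates(user_query, first_stage, edges, extra):
    """Return the first-stage docs plus up to `extra` docs reached via query nodes.

    edges maps each query node (its text) to {doc_id: edge_weight}.
    """
    # Invert the graph once so we can walk doc -> query node -> doc.
    doc_to_queries = defaultdict(dict)
    for q_node, docs in edges.items():
        for doc, weight in docs.items():
            doc_to_queries[doc][q_node] = weight

    seen = set(first_stage)
    frontier = defaultdict(float)
    for doc in first_stage:
        for q_node, w_in in doc_to_queries[doc].items():
            # Query-aware gate: skip query nodes unrelated to the user's query.
            gate = overlap(user_query, q_node)
            if gate == 0.0:
                continue
            for neighbour, w_out in edges[q_node].items():
                if neighbour not in seen:
                    frontier[neighbour] += gate * w_in * w_out

    # Keep the highest-priority neighbours within the extra scoring budget.
    best = sorted(frontier, key=frontier.get, reverse=True)[:extra]
    return list(first_stage) + best


# Hypothetical toy graph: d1 is linked to an on-topic and an off-topic query node.
edges = {
    "graph traversal retrieval": {"d1": 0.9, "d4": 0.8},
    "pasta recipes": {"d1": 0.1, "d3": 0.9},
}
expanded = expand_candidates("graph traversal", ["d1"], edges, extra=2)
print(expanded)
```

In this toy setting, the off-topic query node shares no tokens with the user's query, so its neighbour `d3` is never added to the pool, while `d4` is reached through the on-topic node. In a full pipeline the expanded pool would then be passed to the second-stage re-ranker.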

BibTeX:

@inproceedings{frayling:ecir2024-reverted,
  author = {Frayling, Erlend and MacAvaney, Sean and Macdonald, Craig and Ounis, Iadh},
  title = {Effective Adhoc Retrieval through Traversal of a Query-Document Graph},
  booktitle = {Proceedings of the 46th European Conference on Information Retrieval Research},
  year = {2024},
  doi = {10.1007/978-3-031-56063-7_6}
}