Appeared in: arXiv
Abstract:
Sparse and dense pseudo-relevance feedback (PRF) approaches perform poorly on challenging queries due to low precision in first-pass retrieval. However, recent advances in neural language models (NLMs) can re-rank relevant documents to top ranks, even when few are in the re-ranking pool. This paper first addresses the problem of poor pseudo-relevance feedback by simply applying re-ranking prior to query expansion and then re-executing the expanded query. We find that this change alone can improve the retrieval effectiveness of sparse and dense PRF approaches by 5-8%. Going further, we propose a new expansion model, Latent Entity Expansion (LEE), a fine-grained word- and entity-based relevance model that incorporates localized features. Finally, we add an "adaptive" component to the retrieval process, which iteratively refines the re-ranking pool during scoring using the expansion model, i.e., "re-rank -- expand -- repeat". Using LEE, we achieve (to our knowledge) the best NDCG, MAP and R@1000 results on the TREC Robust 2004 and CODEC adhoc document datasets, demonstrating a significant advancement in expansion effectiveness.
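The "re-rank -- expand -- repeat" loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' LEE model: it uses words only (no entities), a weighted term-overlap function stands in for first-pass retrieval, and `toy_scorer` is a hypothetical stand-in for a neural re-ranker. All function names, parameters, and the toy corpus are assumptions made for illustration.

```python
from collections import Counter

def tokenize(text):
    return text.lower().split()

def sparse_retrieve(weights, docs, k):
    """First-pass retrieval: weighted term-frequency overlap (a stand-in for BM25)."""
    scores = {}
    for doc_id, text in docs.items():
        tf = Counter(tokenize(text))
        score = sum(w * tf[t] for t, w in weights.items())
        if score > 0:
            scores[doc_id] = score
    return sorted(scores, key=lambda d: (-scores[d], d))[:k]

def rerank(query, pool, docs, scorer):
    """Order the pool by the re-ranker's score (ties broken by document id)."""
    return sorted(pool, key=lambda d: (-scorer(query, docs[d]), d))

def expand(query, feedback_ids, docs, n_terms, orig_weight=0.5):
    """Interpolate the original query terms with frequent terms from feedback documents."""
    q_terms = tokenize(query)
    weights = {t: orig_weight / len(q_terms) for t in q_terms}
    counts = Counter()
    for doc_id in feedback_ids:
        counts.update(tokenize(docs[doc_id]))
    top = counts.most_common(n_terms)
    total = sum(c for _, c in top) or 1
    for term, c in top:
        weights[term] = weights.get(term, 0.0) + (1 - orig_weight) * c / total
    return weights

def rerank_expand_repeat(query, docs, scorer, rounds=2, pool_size=10,
                         fb_docs=2, n_terms=10):
    # Initial pool from the unexpanded query.
    pool = sparse_retrieve({t: 1.0 for t in tokenize(query)}, docs, pool_size)
    for _ in range(rounds):
        ranked = rerank(query, pool, docs, scorer)                # re-rank
        weights = expand(query, ranked[:fb_docs], docs, n_terms)  # expand
        retrieved = sparse_retrieve(weights, docs, pool_size)     # re-execute
        pool = list(dict.fromkeys(ranked + retrieved))            # grow pool, dedupe
    return rerank(query, pool, docs, scorer)

# Toy corpus (hypothetical data).
DOCS = {
    "d1": "solar panels convert sunlight into electricity",
    "d2": "photovoltaic cells convert sunlight using silicon",
    "d3": "silicon photovoltaic modules power homes",
    "d4": "cooking recipes for pasta",
}
TOPIC = {"solar", "panels", "sunlight", "photovoltaic", "silicon"}

def toy_scorer(query, text):
    """Hypothetical re-ranker: counts topical terms; a real system would use an NLM."""
    return len(TOPIC & set(tokenize(text)))

print(rerank_expand_repeat("solar panels", DOCS, toy_scorer, rounds=2))
```

In this toy corpus, `d3` shares no terms with the query "solar panels". It enters the pool only in the second round, after expansion terms drawn from `d2` have been added, which is the kind of pool refinement the adaptive loop is meant to provide.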
BibTeX:
@article{mackie:arxiv2023-lee,
  author = {Mackie, Iain and Chatterjee, Shubham and MacAvaney, Sean and Dalton, Jeff},
  title = {Re-Rank - Expand - Repeat: Adaptive Query Expansion for Document Retrieval Using Words and Entities},
  year = {2023},
  url = {https://arxiv.org/abs/2306.17082},
  journal = {arXiv},
  volume = {abs/2306.17082}
}