Effective Contrastive Weighting for Dense Query Expansion

Long conference paper (8 citations)

Authors: Xiao Wang, Sean MacAvaney, Craig Macdonald, Iadh Ounis

Appeared in: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023)

Links/IDs:

DOI: 10.18653/v1/2023.acl-long.710
DBLP: conf/acl/WangMMO23
ACL: 2023.acl-long.710
Google Scholar: 7wWfoDgAAAAJ:RGFaLdJalmkC
Semantic Scholar: 24fcd29e28a27efb06647ae42a64e50c11d49840
Enlighten: 297826
smac.pub: acl2023-wdqe

Abstract:

Verbatim queries submitted to search engines often do not sufficiently describe the user's search intent. Pseudo-relevance feedback (PRF) techniques, which modify a query's representation using top-ranked documents, have been shown to overcome such inadequacies and improve retrieval effectiveness for both lexical methods (e.g., BM25) and dense methods (e.g., ANCE, ColBERT). For instance, the recent ColBERT-PRF approach heuristically chooses new embeddings to add to the query using the inverse document frequency (IDF) of the underlying tokens. However, this technique potentially ignores the valuable context encoded by the embeddings. In this work, we present a contrastive solution that learns to select the most useful embeddings for expansion. More specifically, a deep language model-based contrastive weighting model, called CWPRF, is trained to learn to discriminate between relevant and non-relevant documents for semantic search. Our experimental results show that our contrastive weighting model helps select useful expansion embeddings and outperforms various baselines. In particular, CWPRF can improve nDCG@10 by up to 4.1% compared to an existing PRF approach for ColBERT while maintaining its efficiency.
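
To make the IDF heuristic mentioned in the abstract concrete, here is a toy sketch of ranking candidate expansion tokens by inverse document frequency and keeping the top k. It is an illustration only, not the ColBERT-PRF or CWPRF implementation: the function name, the token list, the document frequencies, the collection size and the smoothed IDF formula are all assumptions made for this example.

```python
import math


def select_expansion_tokens(candidate_df, num_docs, k):
    """Rank candidate tokens by IDF and keep the top k.

    Hypothetical helper for illustration; candidate_df maps each token
    to the number of documents that contain it.
    """
    # Smoothed IDF (an assumed, commonly used variant): rarer tokens score higher.
    scored = {
        token: math.log((num_docs + 1) / (df + 1))
        for token, df in candidate_df.items()
    }
    # Sort by IDF, highest first, and truncate to k expansion candidates.
    ranked = sorted(scored.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Made-up document frequencies in a made-up collection of 10,000 documents.
candidate_df = {"the": 9500, "retrieval": 420, "colbert": 12, "query": 800}
expansion = select_expansion_tokens(candidate_df, num_docs=10000, k=2)
for token, idf in expansion:
    print(f"{token}: {idf:.3f}")
```

A score like this depends only on how rare a token is across the collection, not on the context in which its embedding occurs. That is the limitation the abstract points to when motivating a learned weighting model.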

BibTeX:

@inproceedings{wang:acl2023-wdqe,
  author = {Wang, Xiao and MacAvaney, Sean and Macdonald, Craig and Ounis, Iadh},
  title = {Effective Contrastive Weighting for Dense Query Expansion},
  booktitle = {Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics},
  year = {2023},
  doi = {10.18653/v1/2023.acl-long.710}
}