Appeared in: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023)
Abstract:
Verbatim queries submitted to search engines often do not sufficiently describe the user's search intent. Pseudo-relevance feedback (PRF) techniques, which modify a query's representation using top-ranked documents, have been shown to overcome such inadequacies and improve retrieval effectiveness for both lexical methods (e.g., BM25) and dense methods (e.g., ANCE, ColBERT). For instance, the recent ColBERT-PRF approach heuristically chooses new embeddings to add to the query using the inverse document frequency (IDF) of the underlying tokens. However, this technique potentially ignores the valuable context encoded by the embeddings. In this work, we present a contrastive solution that learns to select the most useful embeddings for expansion. More specifically, a deep language model-based contrastive weighting model, called CWPRF, is trained to discriminate between relevant and non-relevant documents for semantic search. Our experimental results show that our contrastive weighting model helps select useful expansion embeddings and outperforms various baselines. In particular, CWPRF can improve nDCG@10 by up to 4.1% compared to an existing PRF approach for ColBERT while maintaining its efficiency.
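To make the contrast in the abstract concrete, below is a minimal Python sketch of the expansion-embedding selection step: candidate (token, embedding) pairs drawn from the top-ranked feedback documents are ranked by a weighting function, where the weight is collection IDF in the ColBERT-PRF heuristic and a learned, contrastively trained score in CWPRF. All names and values here are hypothetical illustrations, not the paper's actual implementation.

# Illustrative sketch only: contrasts IDF-based selection (ColBERT-PRF
# style) with a learned per-token weight (the CWPRF idea). All names
# and numbers are hypothetical, not the paper's code.
import math

def idf_weight(token, doc_freq, num_docs):
    # Heuristic weight: tokens that are rare in the collection score higher.
    return math.log(num_docs / (1 + doc_freq.get(token, 0)))

def select_expansion_embeddings(candidates, weight_fn, k):
    # candidates: (token, embedding) pairs taken from the top-ranked
    # (pseudo-relevant) feedback documents; keep the k highest-weighted.
    ranked = sorted(candidates, key=lambda pair: weight_fn(pair[0]), reverse=True)
    return ranked[:k]

# ColBERT-PRF-style selection: weight each candidate by collection IDF.
doc_freq = {"neural": 120, "retrieval": 85, "the": 9800}  # toy statistics
candidates = [("neural", [0.1, 0.3]), ("retrieval", [0.2, 0.1]), ("the", [0.0, 0.0])]
expansion = select_expansion_embeddings(
    candidates, lambda t: idf_weight(t, doc_freq, num_docs=10000), k=2)

# CWPRF replaces the IDF heuristic above with a learned weighting model
# trained contrastively to discriminate relevant from non-relevant
# documents, so the context encoded in each embedding (not just token
# rarity) informs which embeddings are appended to the query.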
BibTeX:
@inproceedings{wang:acl2023-wdqe,
  author    = {Wang, Xiao and MacAvaney, Sean and Macdonald, Craig and Ounis, Iadh},
  title     = {Effective Contrastive Weighting for Dense Query Expansion},
  booktitle = {Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics},
  year      = {2023},
  doi       = {10.18653/v1/2023.acl-long.710}
}