CEDR: Contextualized Embeddings for Document Ranking

DOI: 10.1145/3331184.3331317 (short conference paper, to appear)

Authors: Sean MacAvaney, Andrew Yates, Arman Cohan, Nazli Goharian

Appearing in: SIGIR 2019


Although considerable attention has been given to neural ranking architectures recently, far less attention has been paid to the term representations that are used as input to these models. In this work, we investigate how two pretrained contextualized language models (ELMo and BERT) can be utilized for ad-hoc document ranking. Through experiments on TREC benchmarks, we find that several existing neural ranking architectures can benefit from the additional context provided by contextualized language models. Furthermore, we propose a joint approach that incorporates BERT's classification vector into existing neural models and show that it outperforms state-of-the-art ad-hoc ranking baselines. We also address practical challenges in using these models for ranking, including the maximum input length imposed by BERT and runtime performance impacts of contextualized language models.
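The abstract names BERT's maximum input length as a practical challenge but does not say here how it is handled. As a rough illustration only, one common way to work within such a limit is to split a long document into roughly equal chunks that each fit alongside the query, encode each chunk separately, and average the per-chunk classification vectors. The sketch below assumes this strategy; the function names, the 512-token limit, and the count of three special tokens are illustrative assumptions, not details taken from this page.

```python
import math


def split_into_chunks(doc_tokens, query_len, max_len=512, num_special=3):
    """Split document tokens into roughly equal chunks that fit the model.

    Assumptions (hypothetical, for illustration): the model accepts at most
    `max_len` tokens per input, and each input also holds the query plus
    `num_special` marker tokens (e.g. one classification and two separators).
    """
    budget = max_len - query_len - num_special  # room left for document tokens
    if budget <= 0:
        raise ValueError("query leaves no room for document tokens")
    if not doc_tokens:
        return [[]]
    # Smallest number of chunks that keeps every chunk within the budget.
    n_chunks = math.ceil(len(doc_tokens) / budget)
    # Spread tokens evenly instead of leaving one short trailing chunk.
    size = math.ceil(len(doc_tokens) / n_chunks)
    return [doc_tokens[i:i + size] for i in range(0, len(doc_tokens), size)]


def average_vectors(vectors):
    """Element-wise mean of equal-length vectors (one per chunk)."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]
```

In use, each chunk would be paired with the query and passed through the language model, and `average_vectors` would combine the resulting per-chunk classification vectors into a single document-level vector. Even splitting is a design choice here: it avoids a final chunk with very little context.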

BibTeX

@InProceedings{macavaney:sigir2019-contextuallms,
  author = {MacAvaney, Sean and Yates, Andrew and Cohan, Arman and Goharian, Nazli},
  title = {CEDR: Contextualized Embeddings for Document Ranking},
  booktitle = {SIGIR 2019},
  year = {2019},
  url = {https://arxiv.org/abs/1904.07094},
  doi = {10.1145/3331184.3331317}
}