
Overcoming Low-Utility Facets for Complex Answer Retrieval


Authors: Sean MacAvaney, Andrew Yates, Arman Cohan, Luca Soldaini, Kai Hui, Nazli Goharian, Ophir Frieder

Appeared in: Information Retrieval Journal


Many questions cannot be answered simply; their answers must include numerous nuanced details and context. Complex Answer Retrieval (CAR) is the retrieval of answers to such questions. These questions can be constructed from a topic entity (e.g., ‘cheese’) and a facet (e.g., ‘health effects’). While topic matching has been thoroughly explored, we observe that some facets use general language that is unlikely to appear verbatim in answers, exhibiting low utility. In this work, we present an approach to CAR that identifies and addresses low-utility facets. First, we propose two estimators of facet utility: the hierarchical structure of CAR queries, and facet frequency information from training data. Then, to improve the retrieval performance on low-utility headings, we include entity similarity scores using embeddings trained from a CAR knowledge graph, which captures the context of facets. We show that our methods are effective by applying them to two leading neural ranking techniques and evaluating them on the TREC CAR dataset. We find that our approach performs significantly better than the unmodified neural rankers and other leading CAR techniques, yielding state-of-the-art results. We also provide a detailed analysis of our results, verifying that low-utility facets are indeed difficult to match and that our approach improves performance on these difficult queries.
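As a rough illustration of the second estimator (facet frequency from training data), the sketch below counts how often each facet text occurs across training queries. It is not the paper's implementation: the query shape, the function names `facet_frequencies` and `is_low_utility`, the `min_count` threshold, and the assumption that frequently repeated facets are the general, low-utility ones are all hypothetical choices made for this example.

```python
from collections import Counter


def facet_frequencies(training_queries):
    """Count occurrences of each facet text across training queries.

    Each query is assumed to be a (topic, [facet, ...]) pair; this shape
    is an assumption made for the sketch, not the TREC CAR format.
    """
    counts = Counter()
    for _topic, facets in training_queries:
        for facet in facets:
            # Normalize case and whitespace so "Health effects" and
            # "health effects" are counted together.
            counts[facet.strip().lower()] += 1
    return counts


def is_low_utility(facet, counts, min_count=2):
    """Hypothetical rule: a facet seen at least `min_count` times in
    training is treated as general language, hence low utility."""
    return counts[facet.strip().lower()] >= min_count


# Toy training data, invented for illustration only.
training = [
    ("cheese", ["Health effects"]),
    ("coffee", ["Health effects", "Caffeine content"]),
    ("green tea", ["health effects"]),
]
counts = facet_frequencies(training)
```

With this toy data, a facet such as "health effects" recurs across topics and would be flagged, while a topic-specific facet such as "caffeine content" would not. A ranker could then use the flag to decide when to lean on additional signals such as the entity similarity scores described above.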

BibTeX

@Article{macavaney:irj2018-car,
  author = {MacAvaney, Sean and Yates, Andrew and Cohan, Arman and Soldaini, Luca and Hui, Kai and Goharian, Nazli and Frieder, Ophir},
  title = {Overcoming Low-Utility Facets for Complex Answer Retrieval},
  year = {2018},
  journal = {Information Retrieval Journal}
}