SLEDGE: A Simple Yet Effective Baseline for COVID-19 Scientific Knowledge Search

pdf arxiv bibtex code dblp: journals/corr/abs-2005-02365 non-refereed

See the revised version, published in EMNLP 2020.

Authors: Sean MacAvaney, Arman Cohan, Nazli Goharian

Appeared in: arXiv


With worldwide concerns surrounding the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), there is a rapidly growing body of literature on the virus. Clinicians, researchers, and policy-makers need a way to effectively search these articles. In this work, we present a search system called SLEDGE, which uses SciBERT to re-rank articles. We train the model on a general-domain answer ranking dataset and transfer the relevance signals to SARS-CoV-2 for evaluation. We observe SLEDGE's effectiveness as a strong baseline on the TREC-COVID challenge (topping the leaderboard with an nDCG@10 of 0.6844). A detailed analysis suggests potential future directions to explore, including the importance of filtering by date and the potential of neural methods that rely more heavily on count signals. We release the code at https://github.com/Georgetown-IR-Lab/covid-neural-ir to facilitate future work on this critical task.
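For readers unfamiliar with the reported metric, the following is a minimal sketch of how nDCG@10 can be computed for a single query. It assumes one common formulation (linear gain with a log2 rank discount); the exact variant used by the TREC-COVID evaluation tooling may differ, and the function names and relevance grades here are hypothetical, not taken from the released code.

```python
import math
from typing import Sequence


def dcg_at_k(gains: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k positions.

    Assumption: linear gain, discount of log2(rank + 1) with 1-based ranks.
    """
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_gains: Sequence[float],
              judged_gains: Sequence[float],
              k: int = 10) -> float:
    """nDCG@k for one query.

    ranked_gains: relevance grades of the returned documents, in ranked order.
    judged_gains: relevance grades of all judged documents for the query,
                  used to build the ideal ranking.
    """
    ideal = dcg_at_k(sorted(judged_gains, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant documents judged for this query
    return dcg_at_k(ranked_gains, k) / ideal


# Hypothetical graded judgments (0 = not relevant, 2 = highly relevant).
judged = [2, 1, 0]
perfect = ndcg_at_k([2, 1, 0], judged)   # ideal ordering
reversed_ = ndcg_at_k([0, 1, 2], judged) # worst ordering of the same docs
```

A system-level score such as the one reported above is typically the mean of this per-query value over all topics.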

BibTeX

@article{macavaney:arxiv2020-sledge,
  author = {MacAvaney, Sean and Cohan, Arman and Goharian, Nazli},
  title = {SLEDGE: A Simple Yet Effective Baseline for COVID-19 Scientific Knowledge Search},
  year = {2020},
  url = {https://arxiv.org/abs/2005.02365},
  journal = {arXiv},
  volume = {abs/2005.02365}
}