
PyTerrier: Declarative Experimentation in Python from BM25 to Dense Retrieval

Nominated for Best Paper. Resource, conference paper. 80 citations.

Authors: Craig Macdonald, Nicola Tonellotto, Sean MacAvaney, Iadh Ounis

Appeared in: 30th ACM International Conference on Information and Knowledge Management (CIKM 2021)

DOI: 10.1145/3459637.3482013
DBLP: conf/cikm/MacdonaldTMO21
ACM: 3459637.3482013
Google Scholar: 7wWfoDgAAAAJ:L8Ckcad2t8MC
Semantic Scholar: 7fa92ed08eee68a945884b8744e7db9887aed9d3
Enlighten: 249268
smac.pub: cikm2021-pyterrier


PyTerrier is a Python-based retrieval framework for expressing simple and complex information retrieval (IR) pipelines in a declarative manner. While making use of the long-established Terrier IR platform for basic text indexing and retrieval, its salient utility comes from its expressive Python operators, which allow IR operations to be combined flexibly. Each operation applies a transformation to a dataframe, while operators are defined with clear semantics in relational algebra. Going further, we have recently included additional support for BERT-based text re-rankers (such as EPIC) and dense retrieval implementations (such as ANCE and ColBERT). Transformer pipelines can be tuned and evaluated in a declarative manner. To increase the reusability of this framework as a resource for the IR community, PyTerrier provides easy access to a variety of standard benchmark datasets, including pre-built indices. Finally, we highlight the advantages of such a framework for information retrieval researchers and educators.
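To make the idea of operator-composed pipelines concrete, here is a minimal, self-contained sketch in plain Python. It is not PyTerrier's implementation and does not import PyTerrier: it uses lists of dicts in place of dataframes, and the `Transformer` class, the `DOCS` corpus, and the `overlap` and `brevity` stages are all hypothetical names invented for illustration. It mimics four of the operators PyTerrier overloads: `>>` (then), `+` (linear combination), `*` (scalar factor) and `%` (rank cutoff). Treating a document missing from one side of `+` as scoring 0 is an assumption of this sketch.

```python
from collections import defaultdict


class Transformer:
    """Toy pipeline stage: maps a list of row dicts to a list of row dicts."""

    def __init__(self, fn):
        self.fn = fn

    def transform(self, rows):
        return self.fn(rows)

    def __rshift__(self, other):
        # a >> b ("then"): the output of a becomes the input of b.
        return Transformer(lambda rows: other.transform(self.transform(rows)))

    def __rmul__(self, weight):
        # w * a (scalar factor): multiply every score by w.
        return Transformer(
            lambda rows: [{**r, "score": weight * r["score"]} for r in self.transform(rows)]
        )

    def __add__(self, other):
        # a + b (linear combination): sum scores per (qid, docno).
        # Assumption of this sketch: a document missing from one side adds 0.
        def combine(rows):
            merged = {}
            for row in self.transform(rows) + other.transform(rows):
                key = (row["qid"], row["docno"])
                if key in merged:
                    merged[key]["score"] += row["score"]
                else:
                    merged[key] = dict(row)
            return list(merged.values())
        return Transformer(combine)

    def __mod__(self, k):
        # a % k (rank cutoff): keep the k highest-scoring rows for each query.
        def cutoff(rows):
            by_qid = defaultdict(list)
            for row in self.transform(rows):
                by_qid[row["qid"]].append(row)
            kept = []
            for group in by_qid.values():
                group.sort(key=lambda r: r["score"], reverse=True)
                kept.extend(group[:k])
            return kept
        return Transformer(cutoff)


# Hypothetical three-document corpus.
DOCS = {
    "d1": "terrier indexing and retrieval",
    "d2": "dense retrieval with neural encoders",
    "d3": "python operators for retrieval pipelines",
}


def _overlap(rows):
    # Retrieval stage: query rows in, result rows out.
    # The score is the number of terms shared by query and document.
    out = []
    for q in rows:
        q_terms = set(q["query"].split())
        for docno, text in DOCS.items():
            score = len(q_terms & set(text.split()))
            if score > 0:
                out.append({"qid": q["qid"], "query": q["query"],
                            "docno": docno, "score": float(score)})
    return out


def _brevity(rows):
    # Re-scoring stage: result rows in, same rows out with a new score
    # that favours shorter documents.
    return [{**r, "score": 1.0 / len(DOCS[r["docno"]].split())} for r in rows]


overlap = Transformer(_overlap)
brevity = Transformer(_brevity)

queries = [{"qid": "q1", "query": "python retrieval pipelines"}]

# Retrieve, keep the top 2, then re-score. `%` binds tighter than `>>`.
reranked = overlap % 2 >> brevity

# Weighted fusion of the two scores, cut to the top 2.
fused = (overlap + 2 * (overlap >> brevity)) % 2

print([r["docno"] for r in fused.transform(queries)])  # ['d3', 'd1']
```

Because each stage has the same shape (rows in, rows out), any composition of stages is itself a stage, which is what lets a whole pipeline be written as a single expression. Python's operator precedence applies as usual, so parentheses are needed when a rank cutoff should apply to a whole sum.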

BibTeX

@inproceedings{macdonald:cikm2021-pyterrier,
  author = {Macdonald, Craig and Tonellotto, Nicola and MacAvaney, Sean and Ounis, Iadh},
  title = {PyTerrier: Declarative Experimentation in Python from BM25 to Dense Retrieval},
  booktitle = {30th ACM International Conference on Information and Knowledge Management},
  year = {2021},
  url = {https://dl.acm.org/doi/10.1145/3459637.3482013},
  doi = {10.1145/3459637.3482013}
}