
An Alternative to FLOPS Regularization to Effectively Productionize SPLADE-doc

Short conference paper (to appear)

Authors: Aldo Porco*, Dhruv Mehra*, Igor Malioutov*, Karthik Radhakrishnan*, Moniba Keymanesh*, Daniel Preotiuc-Pietro, Sean MacAvaney, Pengxiang Cheng

* equal contribution

Appearing in: Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2025)

Links/IDs:
- Google Scholar: 7wWfoDgAAAAJ:nb7KW1ujOQ8C
- Enlighten: 352748
- smac.pub: sigir2025-dfflops

Abstract:

The adoption of dense models for document retrieval has been hindered by latency constraints. Learned Sparse Retrieval (LSR) models like SPLADE-doc were designed to mitigate these shortcomings. They encode text representations as vectors of weighted tokens, allowing them to leverage decades of expertise embodied in inverted-index software such as Solr. SPLADE, the most popular LSR model, uses the FLOPS regularization to favor increasingly sparse representations. Sparser vectors trigger fewer floating-point operations, thus decreasing scoring time. However, in our experiments this loss term does not handle high-frequency tokens (like stopwords) well. Instead, trained models often use them somewhat like dense representations to improve performance, which greatly increases the number of matches while degrading latency. Existing solutions to this issue (like stopword removal) rely on fully removing tokens that might be valuable for certain queries. In this paper, we present a new variant of FLOPS regularization based on token document frequency, DF-FLOPS. This regularization penalizes the use of high-frequency tokens, favoring representations that produce few matches (and thus little scoring work) in the collection, while still allowing nuanced use of high-frequency tokens in the few cases where they make an important difference. We evaluate the new method's efficacy at reducing high-frequency tokens, lowering latency, and maintaining retrieval performance. We compare DF-FLOPS with FLOPS and with heuristic stopword-removal methods on SPLADE-doc, and show that it outperforms all baseline methods under low-latency constraints. Our method improves latency by 10x over SPLADE-v2-doc (with only a 2.4-point drop in MRR@10), achieving efficiency comparable to BM25.
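To make the core idea concrete, here is a minimal sketch contrasting the standard FLOPS regularizer (penalize each vocabulary token by its squared mean activation over a batch) with a document-frequency-weighted variant in the spirit of DF-FLOPS. The `df_flops_loss` weighting shown here is a hypothetical illustration, not the paper's exact formulation; the function names and the normalized `df` vector are assumptions for this sketch.

```python
import numpy as np

def flops_loss(reps):
    """Standard FLOPS regularizer: sum over the vocabulary of the
    squared mean (absolute) activation across the batch."""
    mean_act = np.abs(reps).mean(axis=0)   # (vocab,) average weight per token
    return float((mean_act ** 2).sum())

def df_flops_loss(reps, df):
    """Illustrative DF-weighted variant: scale each token's penalty by
    its document frequency (normalized to [0, 1]), so tokens that match
    many documents (e.g., stopwords) become costlier to activate."""
    mean_act = np.abs(reps).mean(axis=0)
    return float((df * mean_act ** 2).sum())

# Toy batch: 2 documents over a 2-token vocabulary.
reps = np.array([[1.0, 0.0],
                 [1.0, 2.0]])
df = np.array([1.0, 0.5])  # token 0 appears in every document
print(flops_loss(reps))        # token 0 and token 1 penalized equally
print(df_flops_loss(reps, df)) # token 1's penalty halved by its lower DF
```

Under this weighting, gradient descent pushes high-DF tokens toward zero faster than rare ones, which is the mechanism by which representations end up producing fewer postings-list matches at query time.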

BibTeX:

@inproceedings{porco:sigir2025-dfflops,
  author    = {Porco, Aldo and Mehra, Dhruv and Malioutov, Igor and Radhakrishnan, Karthik and Keymanesh, Moniba and Preotiuc-Pietro, Daniel and MacAvaney, Sean and Cheng, Pengxiang},
  title     = {An Alternative to FLOPS Regularization to Effectively Productionize SPLADE-doc},
  booktitle = {Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval},
  year      = {2025}
}