Unified Learned Sparse Retrieval Framework
☆68May 13, 2024Updated last year
Alternatives and similar repositories for learned-sparse-retrieval
Users that are interested in learned-sparse-retrieval are comparing it to the libraries listed below
- SPRINT Toolkit helps you evaluate diverse neural sparse models easily using a single click on any IR dataset.☆47Jul 25, 2023Updated 2 years ago
- ☆39Nov 21, 2022Updated 3 years ago
- Re-Implementation of SPARTA model☆13Oct 1, 2021Updated 4 years ago
- A toolkit for asynchronously validating dense retriever checkpoints during training.☆27Aug 10, 2023Updated 2 years ago
- A toolkit for end-to-end neural ad hoc retrieval☆97Aug 20, 2024Updated last year
- ☆30Sep 25, 2024Updated last year
- Source code of paper 'LED: Lexicon-Enlightened Dense Retriever for Large-Scale Retrieval' (WWW 2023)☆22Aug 28, 2023Updated 2 years ago
- Provides a common interface to many IR ranking datasets.☆381Feb 20, 2026Updated last week
- NAACL2021 - COIL Contextualized Lexical Retriever☆157Jul 27, 2021Updated 4 years ago
- [SIGIR 2025] The official repo for "Scaling Sparse and Dense Retrieval in Decoder-Only LLMs"☆20Mar 31, 2025Updated 11 months ago
- ☆24Jun 28, 2023Updated 2 years ago
- ☆18Aug 21, 2025Updated 6 months ago
- Tevatron - Unified Document Retrieval Toolkit across Scale, Language, and Modality. Demo in SIGIR 2023, SIGIR 2025.☆730Jan 26, 2026Updated last month
- Training & evaluation library for text-based neural re-ranking and dense retrieval models built with PyTorch☆265Jan 27, 2023Updated 3 years ago
- Code and data of the EMNLP 2022 Main Conference paper "Reduce Catastrophic Forgetting of Dense Retrieval Training with Teleportation Nega…☆18Mar 25, 2024Updated last year
- This is the official code for the EMNLP findings 2025 paper "Enhancing Time Awareness in Generative Recommendation".☆17Aug 30, 2025Updated 6 months ago
- CIKM 2022: CorpusBrain: Pre-train a Generative Retrieval Model for Knowledge-Intensive Language Tasks☆34Aug 31, 2022Updated 3 years ago
- Inquisitive Parrots for Search☆199Jun 5, 2025Updated 8 months ago
- Submission archive for the MS MARCO passage ranking leaderboard☆13Apr 21, 2023Updated 2 years ago
- ☆47Mar 27, 2022Updated 3 years ago
- Code for COLING22 paper, DPTDR: Deep Prompt Tuning for Dense Passage Retrieval☆26Aug 7, 2023Updated 2 years ago
- Cocktail: A Comprehensive Information Retrieval Benchmark with LLM-Generated Documents Integration☆15Jun 4, 2024Updated last year
- EMNLP 2021 - Pre-training architectures for dense retrieval☆256Mar 18, 2022Updated 3 years ago
- Code and data to facilitate BERT/ELECTRA for document ranking. For details, refer to the paper - PARADE: Passage Representation Aggregation for…☆96Mar 25, 2023Updated 2 years ago
- A simple toolkit to process TREC files in Python.☆174Aug 24, 2024Updated last year
- SPLADE: sparse neural search (SIGIR21, SIGIR22)☆980May 3, 2024Updated last year
- ☆15Dec 15, 2025Updated 2 months ago
- A library for open domain query facet extraction and generation☆16Apr 24, 2024Updated last year
- SIGIR 2021: Efficiently Teaching an Effective Dense Retriever with Balanced Topic Aware Sampling☆60Jul 11, 2021Updated 4 years ago
- Cross language information retrieval pipeline☆19Jan 12, 2026Updated last month
- This is the official code for the EMNLP 2023 paper "GLEN: Generative Retrieval via Lexical Index Learning".☆29Aug 25, 2025Updated 6 months ago
- ☆17Mar 30, 2024Updated last year
- 4th AI × Bookathon Excellence Award☆14Jan 20, 2023Updated 3 years ago
- Scalable training for dense retrieval models.☆298Jun 10, 2025Updated 8 months ago
- CIKM'21: JPQ substantially improves the efficiency of Dense Retrieval with 30x compression ratio, 10x CPU speedup and 2x GPU speedup.☆52Feb 19, 2022Updated 4 years ago
- One-stop shop for running and fine-tuning transformer-based language models for retrieval☆63Updated this week
- This is the official implementation of SpaDE. (CIKM'22)☆22Aug 8, 2023Updated 2 years ago
- Code for CEDR: Contextualized Embeddings for Document Ranking, accepted at SIGIR 2019.☆156Nov 6, 2020Updated 5 years ago
- Mr. TyDi is a multi-lingual benchmark dataset built on TyDi, covering eleven typologically diverse languages.☆80Feb 16, 2022Updated 4 years ago