dorianbrown / rank_bm25

A Collection of BM25 Algorithms in Python

☆1,214

Alternatives and similar repositories for rank_bm25

Users who are interested in rank_bm25 are comparing it to the libraries listed below.

castorini / pyserini
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
☆1,896 Updated this week
beir-cellar / beir
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
☆1,901 Updated last month
naver / splade
SPLADE: sparse neural search (SIGIR21, SIGIR22)
☆880 Updated last year
facebookresearch / DPR
Dense Passage Retriever is a set of tools and models for the open-domain Q&A task.
☆1,825 Updated 2 years ago
Muennighoff / sgpt
SGPT: GPT Sentence Embeddings for Semantic Search
☆869 Updated last year
facebookresearch / GENRE
Autoregressive Entity Retrieval
☆793 Updated 2 years ago
texttron / tevatron
Tevatron - Unified Document Retrieval Toolkit across Scale, Language, and Modality. Demo in SIGIR 2023, SIGIR 2025.
☆675 Updated this week
xhluca / bm25s
Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy
☆1,258 Updated 2 months ago
terrier-org / pyterrier
A Python framework for performing information retrieval experiments, building on http://terrier.org/
☆468 Updated this week
castorini / docTTTTTquery
docTTTTTquery document expansion model
☆368 Updated 2 years ago
castorini / anserini
Anserini is a Lucene toolkit for reproducible information retrieval research
☆1,065 Updated this week
facebookresearch / contriever
Contriever: Unsupervised Dense Information Retrieval with Contrastive Learning
☆749 Updated 2 years ago
facebookresearch / cc_net
Tools to download and cleanup Common Crawl data
☆1,020 Updated 2 years ago
microsoft / MSMARCO-Passage-Ranking
MS MARCO (Microsoft Machine Reading Comprehension) is a large-scale dataset focused on machine reading comprehension, question answering, …
☆328 Updated 2 years ago
luyug / Reranker
Build Text Rerankers with Deep Language Models
☆262 Updated last year
allenai / ir_datasets
Provides a common interface to many IR ranking datasets.
☆364 Updated last month
ict-bigdatalab / awesome-pretrained-models-for-information-retrieval
A curated list of awesome papers related to pre-trained models for information retrieval (a.k.a., pretraining for IR).
☆669 Updated last year
Tiiiger / bert_score
BERT score for text generation
☆1,781 Updated last year
embeddings-benchmark / mteb
MTEB: Massive Text Embedding Benchmark
☆2,741 Updated this week
caiyinqiong / Semantic-Retrieval-Models
A curated list of awesome papers for Semantic Retrieval (TOIS Accepted: Semantic Models for the First-stage Retrieval: A Comprehensive Re…
☆326 Updated 2 years ago
AmenRa / ranx
⚡️A Blazing-Fast Python Library for Ranking Evaluation, Comparison, and Fusion 🐍
☆579 Updated last week
google-research / deduplicate-text-datasets
☆1,230 Updated last year
chakki-works / seqeval
A Python framework for sequence labeling evaluation (named-entity recognition, POS tagging, etc.)
☆1,147 Updated 11 months ago
ChenghaoMou / text-dedup
All-in-one text de-duplication
☆704 Updated last week
sunnweiwei / RankGPT
Is ChatGPT Good at Search? LLMs as Re-Ranking Agent [EMNLP 2023 Outstanding Paper Award]
☆625 Updated last year
google-research-datasets / natural-questions
Natural Questions (NQ) contains real user questions issued to Google search, and answers found from Wikipedia by annotators. NQ is design…
☆1,023 Updated 4 years ago
castorini / pygaggle
A gaggle of deep neural architectures for text ranking and question answering, designed for Pyserini
☆351 Updated last year
SeanLee97 / AnglE
Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard
☆549 Updated 4 months ago
cvangysel / pytrec_eval
pytrec_eval is an Information Retrieval evaluation tool for Python, based on the popular trec_eval.
☆325 Updated last year
princeton-nlp / DensePhrases
[ACL 2021] Learning Dense Representations of Phrases at Scale; EMNLP'2021: Phrase Retrieval Learns Passage Retrieval, Too https://arxiv.o…
☆604 Updated 3 years ago