facebookresearch/Sphere

Web-scale retrieval for knowledge-intensive NLP

☆553

Alternatives and similar repositories for Sphere

Users who are interested in Sphere are comparing it to the libraries listed below.

facebookresearch / side
The AI Knowledge Editor
☆184 · Jul 12, 2022 · Updated 4 years ago
facebookresearch / KILT
Library for Knowledge Intensive Language Tasks
☆979 · Mar 31, 2022 · Updated 4 years ago
facebookresearch / distributed-faiss
A library for building and serving multi-node distributed faiss indices.
☆280 · Nov 1, 2023 · Updated 2 years ago
castorini / pyserini
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
☆2,102 · Jul 16, 2026 · Updated last week
facebookresearch / SEAL
Search Engines with Autoregressive Language models
☆296 · Apr 4, 2023 · Updated 3 years ago
yueyu1030 / COSINE
[NAACL 2021] This is the code for our paper `Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self…
☆205 · Aug 17, 2022 · Updated 3 years ago
beir-cellar / beir
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
☆2,252 · Oct 16, 2025 · Updated 9 months ago
microsoft / fastformers
FastFormers - highly efficient transformer models for NLU
☆706 · Mar 21, 2025 · Updated last year
thakur-nandan / sprint
SPRINT Toolkit helps you evaluate diverse neural sparse models easily using a single click on any IR dataset.
☆48 · Jul 25, 2023 · Updated 3 years ago
texttron / tevatron
Tevatron - Unified Document Retrieval Toolkit across Scale, Language, and Modality. Demo in SIGIR 2023, SIGIR 2025.
☆743 · Jul 18, 2026 · Updated last week
studio-ousia / bpr
Binary Passage Retriever (BPR) - an efficient passage retriever for open-domain question answering
☆175 · Jun 6, 2021 · Updated 5 years ago
naver / splade
SPLADE: sparse neural search (SIGIR21, SIGIR22)
☆999 · May 3, 2024 · Updated 2 years ago
microsoft / DeBERTa
The implementation of DeBERTa
☆2,241 · Sep 29, 2023 · Updated 2 years ago
facebookresearch / GENRE
Autoregressive Entity Retrieval
☆800 · Jul 6, 2023 · Updated 3 years ago
UKPLab / gpl
Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: …
☆343 · Jul 6, 2023 · Updated 3 years ago
Georgetown-IR-Lab / covid-neural-ir
☆24 · Oct 23, 2020 · Updated 5 years ago
princeton-nlp / DensePhrases
[ACL 2021] Learning Dense Representations of Phrases at Scale; EMNLP'2021: Phrase Retrieval Learns Passage Retrieval, Too https://arxiv.o…
☆607 · Jun 15, 2022 · Updated 4 years ago
facebookresearch / PAQ
Code and data to support the paper "PAQ: 65 Million Probably-Asked Questions and What You Can Do With Them"
☆211 · Aug 31, 2021 · Updated 4 years ago
huggingface / setfit
Efficient few-shot learning with Sentence Transformers
☆2,777 · May 26, 2026 · Updated 2 months ago
Muennighoff / sgpt
SGPT: GPT Sentence Embeddings for Semantic Search
☆872 · Feb 17, 2024 · Updated 2 years ago
facebookresearch / dpr-scale
Scalable training for dense retrieval models.
☆298 · Jul 2, 2026 · Updated 3 weeks ago
deepset-ai / FARM
Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
☆1,752 · Dec 20, 2023 · Updated 2 years ago
facebookresearch / NPM
The original implementation of Min et al. "Nonparametric Masked Language Modeling" (paper https://arxiv.org/abs/2212.01349)
☆159 · Jan 6, 2023 · Updated 3 years ago
lucidrains / RETRO-pytorch
Implementation of RETRO, Deepmind's Retrieval based Attention net, in Pytorch
☆879 · Oct 30, 2023 · Updated 2 years ago
castorini / mr.tydi
Mr. TyDi is a multi-lingual benchmark dataset built on TyDi, covering eleven typologically diverse languages.
☆83 · Feb 16, 2022 · Updated 4 years ago
thakur-nandan / income
INCOME: An Easy Repository for Training and Evaluation of Index Compression Methods in Dense Retrieval. Includes BPR and JPQ.
☆24 · Sep 24, 2023 · Updated 2 years ago
infinitylogesh / mutate
A library to synthesize text datasets using Large Language Models (LLM)
☆152 · Jan 17, 2023 · Updated 3 years ago
facebookresearch / tart
Code and model release for the paper "Task-aware Retrieval with Instructions" by Asai et al.
☆168 · Oct 4, 2023 · Updated 2 years ago
allenai / gooaq
Question-answers, collected from Google
☆133 · Jul 23, 2021 · Updated 5 years ago
MaartenGr / PolyFuzz
Fuzzy string matching, grouping, and evaluation.
☆801 · Jul 10, 2025 · Updated last year
princeton-nlp / TRIME
[EMNLP 2022] Training Language Models with Memory Augmentation https://arxiv.org/abs/2205.12674
☆193 · Jun 14, 2023 · Updated 3 years ago
facebookresearch / DPR
Dense Passage Retriever - is a set of tools and models for open domain Q&A task.
☆1,869 · Apr 6, 2023 · Updated 3 years ago
thunlp / OpenMatch
An Open-Source Package for Information Retrieval.
☆442 · Oct 7, 2022 · Updated 3 years ago
facebookresearch / SentAugment
SentAugment is a data augmentation technique for NLP that retrieves similar sentences from a large bank of sentences. It can be used in c…
☆359 · Feb 22, 2022 · Updated 4 years ago
bigscience-workshop / architecture-objective
☆100 · Jul 25, 2023 · Updated 3 years ago
facebookresearch / NeuralDB
Database Reasoning Over Text project for ACL paper
☆349 · May 26, 2022 · Updated 4 years ago
CarperAI / cheese
Used for adaptive human in the loop evaluation of language and embedding models.
☆306 · Mar 1, 2023 · Updated 3 years ago
facebookresearch / cc_net
Tools to download and cleanup Common Crawl data
☆1,047 · Apr 25, 2023 · Updated 3 years ago
ielab / asyncval
A toolkit for asynchronously validating dense retriever checkpoints during training.
☆27 · Aug 10, 2023 · Updated 2 years ago