Muennighoff / sgpt

SGPT: GPT Sentence Embeddings for Semantic Search

☆868

Alternatives and similar repositories for sgpt

Users who are interested in sgpt are comparing it to the repositories listed below.

facebookresearch / contriever
Contriever: Unsupervised Dense Information Retrieval with Contrastive Learning
☆748 Updated 2 years ago
naver / splade
SPLADE: sparse neural search (SIGIR21, SIGIR22)
☆874 Updated last year
SeanLee97 / AnglE
Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard
☆549 Updated 4 months ago
primeqa / primeqa
The prime repository for state-of-the-art Multilingual Question Answering research and development.
☆736 Updated 6 months ago
facebookresearch / atlas
Code repository supporting the paper "Atlas Few-shot Learning with Retrieval Augmented Language Models", (https://arxiv.org/abs/2208.03…
☆545 Updated last year
sunnweiwei / RankGPT
Is ChatGPT Good at Search? LLMs as Re-Ranking Agent [EMNLP 2023 Outstanding Paper Award]
☆622 Updated last year
xlang-ai / instructor-embedding
[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings
☆1,988 Updated 6 months ago
beir-cellar / beir
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
☆1,877 Updated last month
texttron / hyde
HyDE: Precise Zero-Shot Dense Retrieval without Relevance Labels
☆539 Updated 7 months ago
google-research / FLAN
☆1,529 Updated 2 weeks ago
UKPLab / gpl
Powerful unsupervised domain adaptation method for dense retrieval. Requires only unlabeled corpus and yields massive improvement: "GPL: …
☆337 Updated 2 years ago
ChenghaoMou / text-dedup
All-in-one text de-duplication
☆702 Updated last month
castorini / pyserini
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
☆1,894 Updated this week
google-research / deduplicate-text-datasets
☆1,229 Updated 11 months ago
castorini / rank_llm
RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking.
☆500 Updated this week
dorianbrown / rank_bm25
A Collection of BM25 Algorithms in Python
☆1,206 Updated 9 months ago
universal-ner / universal-ner
☆366 Updated last year
allenai / natural-instructions
Expanding natural instructions
☆1,008 Updated last year
abertsch72 / unlimiformer
Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input"
☆1,062 Updated last year
bigscience-workshop / xmtf
Crosslingual Generalization through Multitask Finetuning
☆537 Updated 10 months ago
zeno-ml / zeno-build
Build, evaluate, understand, and fix LLM-based apps
☆489 Updated last year
ict-bigdatalab / awesome-pretrained-models-for-information-retrieval
A curated list of awesome papers related to pre-trained models for information retrieval (a.k.a., pretraining for IR).
☆670 Updated last year
jzbjyb / FLARE
Forward-Looking Active REtrieval-augmented generation (FLARE)
☆643 Updated last year
HazyResearch / evaporate
This repo contains data and code for the paper "Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Da…
☆488 Updated last year
lucidrains / RETRO-pytorch
Implementation of RETRO, Deepmind's Retrieval based Attention net, in Pytorch
☆869 Updated last year
yuchenlin / LLM-Blender
[ACL2023] We introduce LLM-Blender, an innovative ensembling framework to attain consistently superior performance by leveraging the dive…
☆952 Updated 9 months ago
huggingface / text-clustering
Easily embed, cluster and semantically label text datasets
☆557 Updated last year
srush / MiniChain
A tiny library for coding with large language models.
☆1,234 Updated last year
texttron / tevatron
Tevatron - Unified Document Retrieval Toolkit across Scale, Language, and Modality. Demo in SIGIR 2023, SIGIR 2025.
☆665 Updated last week
xhluca / bm25s
Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy
☆1,250 Updated last month