UKPLab/gpl

Powerful unsupervised domain adaptation method for dense retrieval. Requires only an unlabeled corpus and yields massive improvement: "GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval" https://arxiv.org/abs/2112.07577

☆343

Alternatives and similar repositories for gpl

Users who are interested in gpl are comparing it to the libraries listed below.

JetRunner / LaPraDoR
🦮 Code and pretrained models for Findings of ACL 2022 paper "LaPraDoR: Unsupervised Pretrained Dense Retriever for Zero-Shot Text Retrie…
☆49 · Apr 25, 2022 · Updated 4 years ago
beir-cellar / beir
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
☆2,246 · Oct 16, 2025 · Updated 9 months ago
castorini / docTTTTTquery
docTTTTTquery document expansion model
☆377 · Mar 25, 2023 · Updated 3 years ago
thakur-nandan / sprint
SPRINT Toolkit helps you evaluate diverse neural sparse models easily using a single click on any IR dataset.
☆48 · Jul 25, 2023 · Updated 2 years ago
studio-ousia / bpr
Binary Passage Retriever (BPR) - an efficient passage retriever for open-domain question answering
☆175 · Jun 6, 2021 · Updated 5 years ago
zetaalphavector / InPars
Inquisitive Parrots for Search
☆200 · Jun 5, 2025 · Updated last year
castorini / mr.tydi
Mr. TyDi is a multi-lingual benchmark dataset built on TyDi, covering eleven typologically diverse languages.
☆83 · Feb 16, 2022 · Updated 4 years ago
huggingface / setfit
Efficient few-shot learning with Sentence Transformers
☆2,771 · May 26, 2026 · Updated last month
UKP-SQuARE / square-core
SQuARE: Software for question answering research.
☆75 · Jun 25, 2024 · Updated 2 years ago
thakur-nandan / income
INCOME: An Easy Repository for Training and Evaluation of Index Compression Methods in Dense Retrieval. Includes BPR and JPQ.
☆24 · Sep 24, 2023 · Updated 2 years ago
sebastian-hofstaetter / tas-balanced-dense-retrieval
SIGIR 2021: Efficiently Teaching an Effective Dense Retriever with Balanced Topic Aware Sampling
☆60 · Jul 11, 2021 · Updated 5 years ago
sebastian-hofstaetter / neural-ranking-kd
Improving Efficient Neural Ranking Models with Cross-Architecture Knowledge Distillation
☆117 · Jul 11, 2021 · Updated 5 years ago
Muennighoff / sgpt
SGPT: GPT Sentence Embeddings for Semantic Search
☆872 · Feb 17, 2024 · Updated 2 years ago
sophiaalthammer / parm
This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu…
☆41 · Jan 5, 2022 · Updated 4 years ago
facebookresearch / SEAL
Search Engines with Autoregressive Language models
☆296 · Apr 4, 2023 · Updated 3 years ago
castorini / pyserini
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
☆2,100 · Updated this week
luyug / Condenser
EMNLP 2021 - Pre-training architectures for dense retrieval
☆256 · Mar 18, 2022 · Updated 4 years ago
nreimers / se-pytorch-xla
☆21 · Sep 6, 2021 · Updated 4 years ago
jwieting / paraphrastic-representations-at-scale
☆74 · Jul 2, 2021 · Updated 5 years ago
sebastian-hofstaetter / matchmaker
Training & evaluation library for text-based neural re-ranking and dense retrieval models built with PyTorch
☆265 · Jan 27, 2023 · Updated 3 years ago
facebookresearch / dpr-scale
Scalable training for dense retrieval models.
☆298 · Jul 2, 2026 · Updated 2 weeks ago
fresh-stack / freshstack
This repository helps you evaluate your models on the FreshStack benchmark!
☆34 · Dec 9, 2025 · Updated 7 months ago
castorini / pygaggle
a gaggle of deep neural architectures for text ranking and question answering, designed for Pyserini
☆354 · Dec 21, 2023 · Updated 2 years ago
staoxiao / RetroMAE
Codebase for RetroMAE and beyond.
☆275 · Jun 7, 2024 · Updated 2 years ago
stanford-futuredata / ColBERT
ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)
☆3,902 · Oct 14, 2025 · Updated 9 months ago
iesl / CSFCube
A Test Collection of Computer Science Papers for Faceted Query by Example
☆23 · Nov 28, 2021 · Updated 4 years ago
texttron / tevatron
Tevatron - Unified Document Retrieval Toolkit across Scale, Language, and Modality. Demo in SIGIR 2023, SIGIR 2025.
☆742 · Updated this week
facebookresearch / contriever
Contriever: Unsupervised Dense Information Retrieval with Contrastive Learning
☆779 · Apr 7, 2023 · Updated 3 years ago
voidism / DiffCSE
Code for the NAACL 2022 long paper "DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings"
☆297 · Jul 12, 2026 · Updated last week
unicamp-dl / mMARCO
A multilingual version of MS MARCO passage ranking dataset
☆148 · Oct 19, 2023 · Updated 2 years ago
naver / splade
SPLADE: sparse neural search (SIGIR21, SIGIR22)
☆999 · May 3, 2024 · Updated 2 years ago
capreolus-ir / capreolus
A toolkit for end-to-end neural ad hoc retrieval
☆98 · Aug 20, 2024 · Updated last year
google-research-datasets / swim-ir
SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la…
☆50 · Nov 13, 2023 · Updated 2 years ago
sebastian-hofstaetter / teaching
Open-Source Information Retrieval Courses @ TU Wien
☆705 · Jun 12, 2023 · Updated 3 years ago
caskcsg / ir
Collections of IR Research
☆37 · May 18, 2025 · Updated last year
huggingface / sentence-transformers
State-of-the-Art Embeddings, Retrieval, and Reranking
☆18,928 · Updated this week
allenai / ir_datasets
Provides a common interface to many IR ranking datasets.
☆390 · May 28, 2026 · Updated last month
terrier-org / pyterrier
A Python framework for performing information retrieval experiments, building on http://terrier.org/
☆510 · Updated this week
stanfordnlp / ColBERT-QA
Code for Relevance-guided Supervision for OpenQA with ColBERT (TACL'21)
☆39 · Aug 2, 2021 · Updated 4 years ago