☆102Dec 17, 2022Updated 3 years ago
Alternatives and similar repositories for t5x_retrieval
Users that are interested in t5x_retrieval are comparing it to the libraries listed below.
- Mr. TyDi is a multi-lingual benchmark dataset built on TyDi, covering eleven typologically diverse languages.☆80Feb 16, 2022Updated 4 years ago
- Inquisitive Parrots for Search☆200Jun 5, 2025Updated 9 months ago
- Code and data of the EMNLP 2022 Main Conference paper "Reduce Catastrophic Forgetting of Dense Retrieval Training with Teleportation Nega…☆18Mar 25, 2024Updated 2 years ago
- ☆367Apr 12, 2024Updated last year
- ☆16Jun 14, 2024Updated last year
- No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval☆29Sep 26, 2022Updated 3 years ago
- An easy-to-use python toolkit for flexibly adapting various neural ranking models to target domain.☆60May 17, 2023Updated 2 years ago
- Code and Models for the paper "End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering" (NeurIPS 20…☆110Apr 18, 2022Updated 3 years ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset.☆97Feb 9, 2023Updated 3 years ago
- [NAACL(2019)] Generating Knowledge Graph Paths from Textual Definitions using Sequence-to-Sequence Models☆11Apr 27, 2022Updated 3 years ago
- [EMNLP 2022] This is the code repo for our EMNLP'22 paper "COCO-DR: Combating Distribution Shifts in Zero-Shot Dense Retrieval with Contr…☆50Oct 12, 2023Updated 2 years ago
- Code for our paper Resources and Evaluations for Multi-Distribution Dense Information Retrieval☆16Jan 16, 2024Updated 2 years ago
- decontamination☆29Mar 4, 2026Updated 3 weeks ago
- Dense hybrid representations for text retrieval☆64Apr 3, 2023Updated 2 years ago
- ☆15Oct 10, 2021Updated 4 years ago
- A large-scale multilingual dataset for Information Retrieval. Thorough human-annotations across 18 diverse languages.☆202Jul 31, 2024Updated last year
- A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.☆2,120Oct 16, 2025Updated 5 months ago
- ☆39Jul 25, 2024Updated last year
- Run Effective Large Batch Contrastive Learning Beyond GPU/TPU Memory Constraint☆432Mar 26, 2024Updated 2 years ago
- ☆14Jul 21, 2022Updated 3 years ago
- [ICML 2023] Exploring the Benefits of Training Expert Language Models over Instruction Tuning☆98Apr 26, 2023Updated 2 years ago
- I have created a dataset of Image-Text-Pairs by using the cosine similarity of the CLIP embeddings of the image & its caption derived f…☆16Apr 22, 2021Updated 4 years ago
- Textprep is an analyzing tool for both parallel and non-parallel corpus and its down-stream Natural Language Processing and Machine Trans…☆32Feb 25, 2019Updated 7 years ago
- Tevatron - Unified Document Retrieval Toolkit across Scale, Language, and Modality. Demo in SIGIR 2023, SIGIR 2025.☆733Jan 26, 2026Updated 2 months ago
- ☆2,956Mar 9, 2026Updated 3 weeks ago
- Code to support the paper "Question and Answer Test-Train Overlap in Open-Domain Question Answering Datasets"☆65Aug 31, 2021Updated 4 years ago
- [COLM '24] Source-Aware Training Enables Knowledge Attribution in Language Models☆19Apr 1, 2025Updated 11 months ago
- A multilingual version of MS MARCO passage ranking dataset☆147Oct 19, 2023Updated 2 years ago
- CIKM 2022: Evaluating Interpolation and Extrapolation Performance of Neural Retrieval Models☆10Aug 4, 2022Updated 3 years ago
- Un-*** 50 billions multimodality dataset☆23Sep 14, 2022Updated 3 years ago
- A fast implementation of T5/UL2 in PyTorch using Flash Attention☆115Oct 30, 2025Updated 5 months ago
- The official implementation of "Evidentiality-guided Generation for Knowledge-Intensive NLP Tasks" (NAACL 2022).☆44Dec 25, 2022Updated 3 years ago
- Binary Passage Retriever (BPR) - an efficient passage retriever for open-domain question answering☆175Jun 6, 2021Updated 4 years ago
- Official repository for "DEnsity: Open-domain Dialogue Evaluation Metric using Density Estimation (ACL2023 Findings)"☆11May 23, 2023Updated 2 years ago
- Code repo for SIGIR 2021 paper "Few-Shot Conversational Dense Retrieval"☆42Dec 9, 2021Updated 4 years ago
- Scalable training for dense retrieval models.☆298Jun 10, 2025Updated 9 months ago
- 🦮 Code and pretrained models for Findings of ACL 2022 paper "LaPraDoR: Unsupervised Pretrained Dense Retriever for Zero-Shot Text Retrie…☆49Apr 25, 2022Updated 3 years ago
- 🌏 Modular retrievers for zero-shot multilingual IR.☆30Mar 6, 2024Updated 2 years ago
- Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.☆2,043Updated this week