☆102Dec 17, 2022Updated 3 years ago
Alternatives and similar repositories for t5x_retrieval
Users that are interested in t5x_retrieval are comparing it to the libraries listed below.
- Mr. TyDi is a multi-lingual benchmark dataset built on TyDi, covering eleven typologically diverse languages.☆82Feb 16, 2022Updated 4 years ago
- Inquisitive Parrots for Search☆200Jun 5, 2025Updated 11 months ago
- Code and data of the EMNLP 2022 Main Conference paper "Reduce Catastrophic Forgetting of Dense Retrieval Training with Teleportation Nega…☆18Mar 25, 2024Updated 2 years ago
- ☆367Apr 12, 2024Updated 2 years ago
- ☆16Jun 14, 2024Updated last year
- An easy-to-use python toolkit for flexibly adapting various neural ranking models to target domain.☆60May 17, 2023Updated 3 years ago
- Code and Models for the paper "End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering" (NeurIPS 20…☆110Apr 18, 2022Updated 4 years ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset.☆97Feb 9, 2023Updated 3 years ago
- [NAACL(2019)] Generating Knowledge Graph Paths from Textual Definitions using Sequence-to-Sequence Models☆11Apr 27, 2022Updated 4 years ago
- [EMNLP 2022] This is the code repo for our EMNLP'22 paper "COCO-DR: Combating Distribution Shifts in Zero-Shot Dense Retrieval with Contr…☆51Oct 12, 2023Updated 2 years ago
- Code for our paper Resources and Evaluations for Multi-Distribution Dense Information Retrieval☆16Jan 16, 2024Updated 2 years ago
- Dense hybrid representations for text retrieval☆64Apr 3, 2023Updated 3 years ago
- ☆15Oct 10, 2021Updated 4 years ago
- A large-scale multilingual dataset for Information Retrieval. Thorough human-annotations across 18 diverse languages.☆208Jul 31, 2024Updated last year
- A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.☆2,198Oct 16, 2025Updated 7 months ago
- ☆39Jul 25, 2024Updated last year
- Run Effective Large Batch Contrastive Learning Beyond GPU/TPU Memory Constraint☆441Mar 26, 2024Updated 2 years ago
- 🤗 Transformers: State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch.☆17Jun 5, 2025Updated 11 months ago
- ☆14Jul 21, 2022Updated 3 years ago
- [ICML 2023] Exploring the Benefits of Training Expert Language Models over Instruction Tuning☆99Apr 26, 2023Updated 3 years ago
- decontamination☆33Mar 4, 2026Updated 2 months ago
- I have created a dataset of Image-Text-Pairs by using the cosine similarity of the CLIP embeddings of the image & its caption derived f…☆16Apr 22, 2021Updated 5 years ago
- Textprep is an analyzing tool for both parallel and non-parallel corpus and its down-stream Natural Language Processing and Machine Trans…☆32Feb 25, 2019Updated 7 years ago
- Tevatron - Unified Document Retrieval Toolkit across Scale, Language, and Modality. Demo in SIGIR 2023, SIGIR 2025.☆738May 18, 2026Updated last week
- Code to support the paper "Question and Answer Test-Train Overlap in Open-Domain Question Answering Datasets"☆65Aug 31, 2021Updated 4 years ago
- ☆2,966May 20, 2026Updated last week
- [COLM '24] Source-Aware Training Enables Knowledge Attribution in Language Models☆19Apr 1, 2025Updated last year
- A multilingual version of MS MARCO passage ranking dataset☆148Oct 19, 2023Updated 2 years ago
- Crawler for the complete legislation found on the planalto.gov.br website☆10Sep 10, 2024Updated last year
- A fast implementation of T5/UL2 in PyTorch using Flash Attention☆115Oct 30, 2025Updated 7 months ago
- The official implementation of "Evidentiality-guided Generation for Knowledge-Intensive NLP Tasks" (NAACL 2022).☆44Dec 25, 2022Updated 3 years ago
- Un-*** 50 billions multimodality dataset☆24Sep 14, 2022Updated 3 years ago
- Binary Passage Retriever (BPR) - an efficient passage retriever for open-domain question answering☆174Jun 6, 2021Updated 4 years ago
- ☆15Jul 9, 2025Updated 10 months ago
- Official repository for "DEnsity: Open-domain Dialogue Evaluation Metric using Density Estimation (ACL2023 Findings)"☆11May 23, 2023Updated 3 years ago
- Code repo for SIGIR 2021 paper "Few-Shot Conversational Dense Retrieval"☆43Dec 9, 2021Updated 4 years ago
- Scalable training for dense retrieval models.☆299May 18, 2026Updated last week
- ☆70Jun 16, 2022Updated 3 years ago
- 🦮 Code and pretrained models for Findings of ACL 2022 paper "LaPraDoR: Unsupervised Pretrained Dense Retriever for Zero-Shot Text Retrie…☆49Apr 25, 2022Updated 4 years ago