Dense X Retrieval: What Retrieval Granularity Should We Use?
☆170Jan 8, 2024Updated 2 years ago
Alternatives and similar repositories for factoid-wiki
Users that are interested in factoid-wiki are comparing it to the libraries listed below.
- The official code repo for "Sub-Sentence Encoder: Contrastive Learning of Propositional Semantic Representations".☆85Jan 19, 2024Updated 2 years ago
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore".☆225Dec 16, 2025Updated 4 months ago
- Source code for SIGIR 2022 paper.☆16Apr 25, 2022Updated 4 years ago
- This repository presents the original implementation of LumberChunker: Long-Form Narrative Document Segmentation by André V. Duarte, João…☆105Feb 9, 2026Updated 3 months ago
- An Open-Source Package for Information Retrieval☆167Apr 27, 2026Updated last week
- CFT-RAG: An Entity Tree Based Retrieval Augmented Generation Algorithm With Cuckoo Filter☆24May 28, 2025Updated 11 months ago
- Code and models for the paper "Questions Are All You Need to Train a Dense Passage Retriever (TACL 2023)"☆62Dec 27, 2022Updated 3 years ago
- HyDE: Precise Zero-Shot Dense Retrieval without Relevance Labels☆578Dec 6, 2024Updated last year
- The official implementation of RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval☆1,658Sep 3, 2024Updated last year
- [ACL'24 Oral] Analysing The Impact of Sequence Composition on Language Model Pre-Training☆23Aug 18, 2024Updated last year
- Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.☆2,051May 1, 2026Updated last week
- RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking.☆593Updated this week
- Forward-Looking Active REtrieval-augmented generation (FLARE)☆668Nov 20, 2023Updated 2 years ago
- Scalable training for dense retrieval models.☆298Apr 8, 2026Updated last month
- This is the official implementation of the paper: "Contrastive Learning of Sentence Embeddings from Scratch"☆40Jun 9, 2023Updated 2 years ago
- ☆153Aug 21, 2023Updated 2 years ago
- Merging Generated and Retrieved Knowledge for Open-Domain QA (EMNLP 2023)☆22Oct 8, 2023Updated 2 years ago
- [NeurIPS 2023] Source code for Lift Yourself Up: Retrieval-augmented Text Generation with Self Memory☆62May 24, 2023Updated 2 years ago
- [ICLR'25] "Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers"☆43Mar 31, 2025Updated last year
- Few-shot Learning with Auxiliary Data☆31Dec 8, 2023Updated 2 years ago
- FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions☆54Jul 3, 2024Updated last year
- This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai,…☆2,375May 25, 2024Updated last year
- Meta-Chunking: Learning Efficient Text Segmentation via Logical Perception☆272Sep 25, 2025Updated 7 months ago
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search☆106Dec 2, 2024Updated last year
- Repository for Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions, ACL23☆262Jun 12, 2024Updated last year
- Expand -> Retrieve -> Rerank - simple method with strong results on BRIGHT benchmark☆22Aug 22, 2025Updated 8 months ago
- Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-…☆3,918May 17, 2025Updated 11 months ago
- Code and Checkpoints for "Generate rather than Retrieve: Large Language Models are Strong Context Generators" in ICLR 2023.☆293Jan 29, 2023Updated 3 years ago
- This repository contains the code for the EMNLP'23 paper "AdaSent: Efficient Domain-Adapted Sentence Embeddings for Few-Shot Classificati…☆16Jun 3, 2024Updated last year
- ☆14Aug 30, 2023Updated 2 years ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la…☆49Nov 13, 2023Updated 2 years ago
- Code for the ACL 2024 paper "PLUG: Leveraging Pivot Language in Cross-Lingual Instruction Tuning"☆14Aug 13, 2025Updated 8 months ago
- Finetune mistral-7b-instruct for sentence embeddings☆88May 2, 2024Updated 2 years ago
- Repo for ICML23 "Why do Nearest Neighbor Language Models Work?"☆59Jan 12, 2023Updated 3 years ago
- ☆36Feb 21, 2025Updated last year
- BlockRank makes LLMs efficient and scalable for RAG and in-context ranking☆44Dec 12, 2025Updated 4 months ago
- ☆29Apr 8, 2025Updated last year
- Testing speed and accuracy of RAG with and without a Cross Encoder Reranker.☆49Jan 12, 2024Updated 2 years ago
- GraphER: A Structure-aware Text-to-Graph Model for Entity and Relation Extraction☆93Jul 31, 2024Updated last year