chentong0/factoid-wiki

Dense X Retrieval: What Retrieval Granularity Should We Use?

☆171

Alternatives and similar repositories for factoid-wiki

Users who are interested in factoid-wiki are comparing it to the libraries listed below.

schen149 / sub-sentence-encoder
The official code repo for "Sub-Sentence Encoder: Contrastive Learning of Propositional Semantic Representations".
☆85 · Jan 19, 2024 · Updated 2 years ago
RulinShao / retrieval-scaling
Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore".
☆226 · Dec 16, 2025 · Updated 7 months ago
alibaba / SimCSE-with-CARDS
Source code for SIGIR 2022 paper.
☆16 · Apr 25, 2022 · Updated 4 years ago
DevSinghSachan / art
Code and models for the paper "Questions Are All You Need to Train a Dense Passage Retriever (TACL 2023)"
☆62 · Dec 27, 2022 · Updated 3 years ago
TUPYP7180 / CFT-RAG-2025
CFT-RAG: An Entity Tree Based Retrieval Augmented Generation Algorithm With Cuckoo Filter
☆25 · May 28, 2025 · Updated last year
texttron / hyde
HyDE: Precise Zero-Shot Dense Retrieval without Relevance Labels
☆583 · Dec 6, 2024 · Updated last year
parthsarthi03 / raptor
The official implementation of RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval
☆1,726 · Sep 3, 2024 · Updated last year
yuzhaouoe / pretraining-data-packing
[ACL'24 Oral] Analysing The Impact of Sequence Composition on Language Model Pre-Training
☆24 · Aug 18, 2024 · Updated last year
castorini / pyserini
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
☆2,100 · Updated this week
yunx-z / COMBO
Merging Generated and Retrieved Knowledge for Open-Domain QA (EMNLP 2023)
☆21 · Oct 8, 2023 · Updated 2 years ago
castorini / rank_llm
RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking.
☆609 · Updated this week
jzbjyb / FLARE
Forward-Looking Active REtrieval-augmented generation (FLARE)
☆669 · Nov 20, 2023 · Updated 2 years ago
facebookresearch / dpr-scale
Scalable training for dense retrieval models.
☆298 · Jul 2, 2026 · Updated 2 weeks ago
hkust-nlp / SynCSE
This is the official implementation of the paper: "Contrastive Learning of Sentence Embeddings from Scratch"
☆40 · Jun 9, 2023 · Updated 3 years ago
Ancientshi / ERM4
Enhancing Retrieval and Managing Retrieval: 4-Module Synergy
☆23 · Dec 7, 2024 · Updated last year
Hannibal046 / SelfMemory
[Neurips2023] Source code for Lift Yourself Up: Retrieval-augmented Text Generation with Self Memory
☆62 · May 24, 2023 · Updated 3 years ago
orionw / FollowIR
FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions
☆56 · Jul 3, 2024 · Updated 2 years ago
princeton-nlp / LitSearch
[EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search
☆109 · Dec 2, 2024 · Updated last year
StonyBrookNLP / ircot
Repository for Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions, ACL23
☆271 · Jun 12, 2024 · Updated 2 years ago
AkariAsai / self-rag
This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai,…
☆2,410 · May 25, 2024 · Updated 2 years ago
wyu97 / GenRead
Code and Checkpoints for "Generate rather than Retrieve: Large Language Models are Strong Context Generators" in ICLR 2023.
☆293 · Jan 29, 2023 · Updated 3 years ago
UKPLab / AdaSent
This repository contains the code for the EMNLP'23 paper "AdaSent: Efficient Domain-Adapted Sentence Embeddings for Few-Shot Classificati…
☆16 · Jun 3, 2024 · Updated 2 years ago
ytyz1307zzh / PLUG
Code for the ACL 2024 paper "PLUG: Leveraging Pivot Language in Cross-Lingual Instruction Tuning"
☆13 · Aug 13, 2025 · Updated 11 months ago
Tencent / CogKernel
☆36 · Feb 21, 2025 · Updated last year
vidhishanair / FactEdit
☆14 · Aug 30, 2023 · Updated 2 years ago
AnswerDotAI / RAGatouille
Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-…
☆3,939 · May 17, 2025 · Updated last year
google-research-datasets / swim-ir
SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la…
☆50 · Nov 13, 2023 · Updated 2 years ago
kamalkraj / e5-mistral-7b-instruct
Finetune mistral-7b-instruct for sentence embeddings
☆89 · May 2, 2024 · Updated 2 years ago
frankxu2004 / knnlm-why
Repo for ICML23 "Why do Nearest Neighbor Language Models Work?"
☆59 · Jan 12, 2023 · Updated 3 years ago
mickymultani / RAG-with-Cross-Encoder-Reranker
Testing speed and accuracy of RAG with and without a Cross Encoder Reranker.
☆50 · Jan 12, 2024 · Updated 2 years ago
thongnt99 / learned-sparse-retrieval
Unified Learned Sparse Retrieval Framework
☆68 · May 13, 2024 · Updated 2 years ago
HITsz-TMG / KaLM-Embedding
Code for KaLM-Embedding models
☆116 · Jun 30, 2025 · Updated last year
spcl / MRAG
Official Implementation of "Multi-Head RAG: Solving Multi-Aspect Problems with LLMs"
☆242 · Feb 26, 2026 · Updated 4 months ago
xiaowu0162 / awesome-long-memory
A collection of long-term memory papers
☆34 · Jan 18, 2026 · Updated 6 months ago
PicoCreator / RWKV-LM-LoRA
RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best …
☆10 · Nov 3, 2023 · Updated 2 years ago
kumar-shridhar / Screws
SCREWS: A Modular Framework for Reasoning with Revisions
☆27 · Sep 26, 2023 · Updated 2 years ago
neulab / knn-transformers
PyTorch + HuggingFace code for RetoMaton: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022), including an…
☆287 · Oct 20, 2022 · Updated 3 years ago
urchade / GraphER
GraphER: A Structure-aware Text-to-Graph Model for Entity and Relation Extraction
☆98 · Jul 31, 2024 · Updated last year
AlexTMallen / adaptive-retrieval
☆192 · Jul 2, 2025 · Updated last year