project-miracl / miracl
A large-scale multilingual dataset for information retrieval, with thorough human annotations across 18 diverse languages.
☆188 Updated 10 months ago
Alternatives and similar repositories for miracl
Users who are interested in miracl are comparing it to the libraries listed below.
- A multilingual version of MS MARCO passage ranking dataset ☆145 Updated last year
- Scalable training for dense retrieval models. ☆298 Updated 2 weeks ago
- Inquisitive Parrots for Search ☆193 Updated 3 weeks ago
- RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking. ☆472 Updated last week
- A large-scale information-rich web dataset, featuring millions of real clicked query-document labels ☆331 Updated 6 months ago
- Build Text Rerankers with Deep Language Models ☆263 Updated last year
- MS MARCO (Microsoft Machine Reading Comprehension) is a large-scale dataset focused on machine reading comprehension, question answering, … ☆325 Updated 2 years ago
- Code and model release for the paper "Task-aware Retrieval with Instructions" by Asai et al. ☆162 Updated last year
- ☆86 Updated 2 months ago
- provides a common interface to many IR measure tools ☆85 Updated last month
- GitHub repository for "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models" ☆186 Updated 6 months ago
- This is the repository for our paper "INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning" ☆205 Updated 6 months ago
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers ☆130 Updated last year
- CLIR version of ColBERT ☆68 Updated this week
- Benchmarking library for RAG ☆210 Updated 2 weeks ago
- ToolQA, a new dataset to evaluate the capabilities of LLMs in answering challenging questions with external tools. It offers two levels … ☆269 Updated last year
- Code for Multilingual Eval of Generative AI paper published at EMNLP 2023 ☆69 Updated last year
- Dense X Retrieval: What Retrieval Granularity Should We Use? ☆157 Updated last year
- Code, datasets, and checkpoints for the paper "Improving Passage Retrieval with Zero-Shot Question Generation (EMNLP 2022)" ☆101 Updated 2 years ago
- Unified Learned Sparse Retrieval Framework ☆64 Updated last year
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes ☆207 Updated last month
- GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings ☆43 Updated last year
- ☆74 Updated 6 months ago
- A Python Search Engine for Humans 🥸 ☆222 Updated last year
- AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark ☆144 Updated 6 months ago
- This project studies the performance and robustness of language models and task-adaptation methods. ☆149 Updated last year
- ☆100 Updated 2 years ago
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. ☆109 Updated last year
- [EMNLP 2023] Enabling Large Language Models to Generate Text with Citations. Paper: https://arxiv.org/abs/2305.14627 ☆490 Updated 8 months ago
- What's In My Big Data (WIMBD) - a toolkit for analyzing large text datasets ☆221 Updated 7 months ago