microsoft / MS-MARCO-Web-Search
A large-scale information-rich web dataset, featuring millions of real clicked query-document labels
☆319 · Updated 3 months ago
Alternatives and similar repositories for MS-MARCO-Web-Search:
Users interested in MS-MARCO-Web-Search also compare it to the repositories listed below.
- RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking. ☆418 · Updated last week
- Generative Representational Instruction Tuning ☆612 · Updated last week
- A large-scale multilingual dataset for Information Retrieval. Thorough human-annotations across 18 diverse languages. ☆180 · Updated 7 months ago
- Is ChatGPT Good at Search? LLMs as Re-Ranking Agent [EMNLP 2023 Outstanding Paper Award] ☆581 · Updated last year
- ☆502 · Updated 4 months ago
- This is the repository for our paper "INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning" ☆203 · Updated 3 months ago
- Scalable training for dense retrieval models. ☆284 · Updated last month
- Inquisitive Parrots for Search ☆189 · Updated last year
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers ☆126 · Updated last year
- Late Interaction Models Training & Retrieval ☆263 · Updated this week
- Contriever: Unsupervised Dense Information Retrieval with Contrastive Learning ☆719 · Updated last year
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes ☆190 · Updated 5 months ago
- Baguetter is a flexible, efficient, and hackable search engine library implemented in Python. It's designed for quickly benchmarking, imp… ☆173 · Updated 6 months ago
- Tevatron - A flexible toolkit for neural retrieval research and development. ☆577 · Updated this week
- Benchmarking library for RAG ☆181 · Updated last week
- LongEmbed: Extending Embedding Models for Long Context Retrieval (EMNLP 2024) ☆131 · Updated 4 months ago
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore". ☆196 · Updated this week
- Code and model release for the paper "Task-aware Retrieval with Instructions" by Asai et al. ☆162 · Updated last year
- Code for training & evaluating Contextual Document Embedding models ☆176 · Updated 2 months ago
- [EMNLP 2023] Enabling Large Language Models to Generate Text with Citations. Paper: https://arxiv.org/abs/2305.14627 ☆478 · Updated 5 months ago
- LOFT: A 1 Million+ Token Long-Context Benchmark ☆182 · Updated this week
- Build Text Rerankers with Deep Language Models ☆261 · Updated last year
- Manage scalable open LLM inference endpoints in Slurm clusters ☆253 · Updated 8 months ago
- Dense X Retrieval: What Retrieval Granularity Should We Use? ☆152 · Updated last year
- Attribute (or cite) statements generated by LLMs back to in-context information. ☆219 · Updated 5 months ago
- Awesome synthetic (text) datasets ☆265 · Updated 4 months ago
- [EMNLP 2023] Adapting Language Models to Compress Long Contexts ☆296 · Updated 6 months ago
- AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark ☆133 · Updated 3 months ago
- ☆142 · Updated 11 months ago
- ☆514 · Updated 7 months ago