castorini / umbrela
☆36 Updated 3 weeks ago
Alternatives and similar repositories for umbrela:
Users who are interested in umbrela are comparing it to the libraries listed below.
- ☆45 Updated 2 years ago
- One-stop shop for running and fine-tuning transformer-based language models for retrieval ☆49 Updated this week
- Retrieval-Augmented Generation battle! ☆48 Updated 2 months ago
- provides a common interface to many IR measure tools ☆82 Updated 2 weeks ago
- Dense hybrid representations for text retrieval ☆62 Updated last year
- ☆29 Updated last year
- SPRINT Toolkit helps you evaluate diverse neural sparse models easily using a single click on any IR dataset. ☆44 Updated last year
- Starbucks: Improved Training for 2D Matryoshka Embeddings ☆18 Updated last month
- INCOME: An Easy Repository for Training and Evaluation of Index Compression Methods in Dense Retrieval. Includes BPR and JPQ. ☆22 Updated last year
- ☆38 Updated 2 months ago
- ☆84 Updated 6 months ago
- FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions ☆43 Updated 8 months ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ☆46 Updated last year
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers ☆125 Updated 11 months ago
- ☆38 Updated 10 months ago
- minimal pytorch implementation of bm25 (with sparse tensors) ☆97 Updated last year
- ☆54 Updated 2 years ago
- Unified Learned Sparse Retrieval Framework ☆64 Updated 10 months ago
- ☆18 Updated 7 months ago
- 🌏 Modular retrievers for zero-shot multilingual IR. ☆27 Updated last year
- XTR: Rethinking the Role of Token Retrieval in Multi-Vector Retrieval ☆45 Updated 8 months ago
- Mr. TyDi is a multi-lingual benchmark dataset built on TyDi, covering eleven typologically diverse languages. ☆74 Updated 3 years ago
- A Human-LLM Collaborative Dataset for Generative Information-seeking with Attribution ☆30 Updated last year
- Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval ☆43 Updated 4 months ago
- ☆37 Updated 2 years ago
- GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings ☆37 Updated last year
- A Workbench for Autograding Retrieve/Generate Systems ☆14 Updated 4 months ago
- Inquisitive Parrots for Search ☆188 Updated last year
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset. ☆93 Updated 2 years ago