Modular retrievers for zero-shot multilingual IR.
☆30 · Mar 6, 2024 · Updated 2 years ago
Alternatives and similar repositories for xm-retrievers
Users interested in xm-retrievers are comparing it to the libraries listed below.
- [SIGIR'24] Generative Retrieval as Multi-Vector Dense Retrieval · ☆36 · Oct 18, 2024 · Updated last year
- ☆15 · Jun 10, 2024 · Updated last year
- Starbucks: Improved Training for 2D Matryoshka Embeddings · ☆22 · Jun 30, 2025 · Updated 8 months ago
- Applying "Load What You Need: Smaller Versions of Multilingual BERT" to LaBSE · ☆19 · Sep 22, 2021 · Updated 4 years ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… · ☆49 · Nov 13, 2023 · Updated 2 years ago
- ACL 2023 Dual-Alignment Pre-training for Cross-lingual Sentence Embedding · ☆24 · Aug 21, 2024 · Updated last year
- ☆24 · Jun 28, 2023 · Updated 2 years ago
- Implementation of "Efficient Multi-vector Dense Retrieval with Bit Vectors", ECIR 2024 · ☆69 · Oct 21, 2025 · Updated 4 months ago
- Ranking of fine-tuned HF models as base models. · ☆36 · Sep 17, 2025 · Updated 5 months ago
- Code for RATIONALYST: Pre-training Process-Supervision for Improving Reasoning https://arxiv.org/pdf/2410.01044 · ☆35 · Oct 3, 2024 · Updated last year
- Adding new tasks to T0 without catastrophic forgetting · ☆33 · Oct 20, 2022 · Updated 3 years ago
- ☆37 · May 28, 2023 · Updated 2 years ago
- A Learnable LSH Framework for Efficient NN Training · ☆34 · Jul 22, 2021 · Updated 4 years ago
- SIMD-enabled column imprints · ☆11 · Feb 12, 2018 · Updated 8 years ago
- Fast and memory-efficient Python PDF Parser based on xpdf sources · ☆44 · Dec 15, 2023 · Updated 2 years ago
- GraalVM GitHub action · ☆13 · Jun 25, 2022 · Updated 3 years ago
- The dataset contains over 250k product brand-value annotations with more than 50k unique values across eight main categories of Amazon pr… · ☆13 · Jan 18, 2024 · Updated 2 years ago
- AppleScripts for controlling Spotify · ☆23 · Oct 20, 2016 · Updated 9 years ago
- ☆16 · Dec 6, 2014 · Updated 11 years ago
- New York Times Scraper · ☆11 · Feb 19, 2024 · Updated 2 years ago
- A simple API that can generate various types of hexagon grids - returns GeoJSON data or load into PostGIS with performant JDBC. · ☆10 · Aug 2, 2025 · Updated 7 months ago
- ☆10 · May 30, 2018 · Updated 7 years ago
- A Library for Scaling Mixed-Integer Optimization-Based Machine Learning. · ☆12 · Jun 24, 2024 · Updated last year
- GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings · ☆44 · Mar 6, 2024 · Updated 2 years ago
- Tools to cluster visually similar images into groups in an image dataset · ☆11 · Jul 29, 2022 · Updated 3 years ago
- Linear Relational Embeddings (LREs) and Linear Relational Concepts (LRCs) for LLMs in PyTorch · ☆10 · Aug 7, 2024 · Updated last year
- ☆10 · Oct 2, 2024 · Updated last year
- Few-shot NLP benchmark for unified, rigorous eval · ☆93 · Jul 12, 2022 · Updated 3 years ago
- Source code for "Revisiting Unsupervised Relation Extraction" in ACL 2020 · ☆36 · Jun 20, 2023 · Updated 2 years ago
- [ICML 2023] Tuning Language Models as Training Data Generators for Augmentation-Enhanced Few-Shot Learning · ☆44 · May 10, 2023 · Updated 2 years ago
- An approximate implementation of the OpenAI paper - An Empirical Model of Large-Batch Training for MNIST · ☆11 · Nov 19, 2022 · Updated 3 years ago
- SQL Optimizations using MLIR · ☆12 · Apr 5, 2020 · Updated 5 years ago
- AI model designed to test the effectiveness in handling external ethical attacks. · ☆11 · Feb 9, 2026 · Updated last month
- ☆13 · Jan 21, 2022 · Updated 4 years ago
- ☆10 · Sep 24, 2017 · Updated 8 years ago
- Use pretrained BERT model to automatically generate grammar multiple choice questions (MCQ) from any news article or story. · ☆13 · Oct 2, 2019 · Updated 6 years ago
- Learn to code for NLP · ☆10 · Jul 20, 2020 · Updated 5 years ago
- ☆13 · Oct 16, 2025 · Updated 4 months ago
- Python API for Science Parse · ☆13 · Mar 27, 2021 · Updated 4 years ago