Evaluate state-of-the-art sparse embedding models on the LIMIT dataset (`limit-small` and `limit`) from Google's paper `On the Theoretical Limitations of Embedding-Based Retrieval`
☆15 · Sep 4, 2025 · Updated 5 months ago
Alternatives and similar repositories for LIMIT-Sparse-Embedding
Users who are interested in LIMIT-Sparse-Embedding are comparing it to the repositories listed below.
- ☆14 · Jul 7, 2024 · Updated last year
- A framework for benchmarking embedding models in hybrid search scenarios (BM25 + vector search) using Weaviate. ☆38 · Feb 12, 2026 · Updated 2 weeks ago
- [SIGIR 2025] The official repo for "Scaling Sparse and Dense Retrieval in Decoder-Only LLMs" ☆20 · Mar 31, 2025 · Updated 11 months ago
- Code of fine-tuning neural sparse models and training from scratch. #SIGIR2025 ☆24 · Feb 6, 2026 · Updated 3 weeks ago
- Model implementation for the contextual embeddings project ☆40 · Jun 2, 2025 · Updated 9 months ago
- Hugging Face RoBERTa with Flash Attention 2 ☆24 · Sep 14, 2025 · Updated 5 months ago
- An Open-Source RAG Workload Trace to Optimize RAG Serving Systems ☆35 · Nov 18, 2025 · Updated 3 months ago
- ☆21 · Apr 17, 2023 · Updated 2 years ago
- Make running benchmark simple yet maintainable, again. Now only supports Korean-based cross-encoder. ☆29 · Dec 2, 2025 · Updated 3 months ago
- ☆24 · Jan 30, 2025 · Updated last year
- Compute the exact 100 nearest neighbors for deep1M, deep10M, and deep100M datasets. ☆38 · Jan 25, 2021 · Updated 5 years ago
- ☆43 · Apr 22, 2025 · Updated 10 months ago
- Official software repository of S. Bruch, F. M. Nardini, C. Rulli, and R. Venturini. "Efficient Inverted Indexes for Approximate Retrieva… ☆105 · Jan 27, 2026 · Updated last month
- fine-tuning tutorial ☆18 · Feb 20, 2026 · Updated last week
- User-friendly viewer for Parquet files ☆10 · Jan 10, 2026 · Updated last month
- DOS Program Development ☆13 · Nov 9, 2022 · Updated 3 years ago
- ☆25 · Sep 1, 2025 · Updated 6 months ago
- A passion project on my favorite e-commerce site that scrapes product data and builds a recommendation engine ☆10 · May 2, 2023 · Updated 2 years ago
- TSDG: An efficient index graph for graph-based nearest neighbor search ☆10 · Jul 14, 2022 · Updated 3 years ago
- Binaries for mathematicians ☆10 · Mar 19, 2025 · Updated 11 months ago
- My minimal clean fast Neovim config 💚 ~20 plugins of pure joy. ☆28 · Updated this week
- ☆14 · Jul 2, 2023 · Updated 2 years ago
- Finetune mistral-7b-instruct for sentence embeddings ☆88 · May 2, 2024 · Updated last year
- LightGBM for handling label-imbalanced data with focal and weighted loss functions in binary and multiclass classification ☆21 · Jan 29, 2026 · Updated last month
- ☆14 · Dec 12, 2022 · Updated 3 years ago
- rabitq rust implementation ☆10 · Feb 4, 2026 · Updated 3 weeks ago
- ☆10 · Jan 9, 2024 · Updated 2 years ago
- Redis distributed lock implementation for Python based on Pub/Sub messaging ☆11 · Feb 14, 2026 · Updated 2 weeks ago
- ☆11 · Dec 6, 2023 · Updated 2 years ago
- Modern Methods of Applied Statistics (Spring 2023) STAT 34800 ☆10 · May 20, 2023 · Updated 2 years ago
- Learning materials for the Life In The UK test. ☆13 · Mar 25, 2023 · Updated 2 years ago
- [LREC-COLING'24] HumanEval-XL: A Multilingual Code Generation Benchmark for Cross-lingual Natural Language Generalization ☆41 · Mar 7, 2025 · Updated 11 months ago
- SPRINT Toolkit helps you evaluate diverse neural sparse models easily using a single click on any IR dataset. ☆47 · Jul 25, 2023 · Updated 2 years ago
- Transplants vocabulary between language models, enabling the creation of draft models for speculative decoding WITHOUT retraining. ☆49 · Oct 29, 2025 · Updated 4 months ago
- ☆51 · Jun 21, 2025 · Updated 8 months ago
- Korean-MTEB ☆74 · Jan 25, 2026 · Updated last month
- Materials for the Ultimate Hybrid Search Workshop ☆45 · Dec 13, 2024 · Updated last year
- Implementation of GraphReader paper: https://arxiv.org/abs/2406.14550 ☆13 · Oct 21, 2024 · Updated last year
- A website that makes phone number verification easy on the BizM development server. ☆10 · Feb 27, 2023 · Updated 3 years ago