frinkleko/LIMIT-Sparse-Embedding

frinkleko / LIMIT-Sparse-Embedding

Evaluate state-of-the-art sparse embedding models on the LIMIT dataset (`limit-small` and `limit`) from Google's paper "On the Theoretical Limitations of Embedding-Based Retrieval"

☆16

Alternatives and similar repositories for LIMIT-Sparse-Embedding

Users who are interested in LIMIT-Sparse-Embedding are comparing it to the libraries listed below.

fdugzc / opensearch-sparse-model-tuning-sample
Code for fine-tuning neural sparse models and training them from scratch. #SIGIR2025
☆26 · Mar 11, 2026 · Updated 4 months ago
opensearch-project / index-management-dashboards-plugin
🗃 Manage policies and jobs and automate periodic data operations in OpenSearch Dashboards
☆23 · Updated this week
HansiZeng / scaling-retriever
[SIGIR 2025] The official repo for "Scaling Sparse and Dense Retrieval in Decoder-Only LLMs"
☆22 · Mar 31, 2025 · Updated last year
DSBA-Lab / Contrastive-Accumulation
☆14 · Jul 7, 2024 · Updated 2 years ago
jukofyork / transplant-vocab
Transplants vocabulary between language models, enabling the creation of draft models for speculative decoding WITHOUT retraining.
☆54 · Oct 29, 2025 · Updated 8 months ago
DunZhang / Jasper-Token-Compression-Training
The training code for Jasper-Token-Compression-600M
☆20 · Nov 19, 2025 · Updated 8 months ago
xinzhel / LLM-Search
Survey on LLM Inference via Search (TMLR 2025)
☆15 · May 6, 2025 · Updated last year
santhoshtr / spellchecker-webservice
Spellchecker service based on hunspell for 90 languages
☆10 · Oct 26, 2020 · Updated 5 years ago
recombee / CompresSAE
Sparse Embedding Compression for Scalable Retrieval in Recommender Systems
☆39 · Nov 21, 2025 · Updated 7 months ago
ARiSE-Lab / CYCLE_OOPSLA_24
Open-source repository for the OOPSLA'24 paper "CYCLE: Learning to Self-Refine Code Generation"
☆10 · Mar 8, 2024 · Updated 2 years ago
instructkr / reranker-simple-benchmark
Make running benchmarks simple yet maintainable, again. Currently supports only Korean-based cross-encoders.
☆35 · Dec 2, 2025 · Updated 7 months ago
TusKANNy / seismic
Official repository of the Seismic library.
☆135 · Jul 6, 2026 · Updated 2 weeks ago
floatai / HumanEval-XL
[LREC-COLING'24] HumanEval-XL: A Multilingual Code Generation Benchmark for Cross-lingual Natural Language Generalization
☆42 · Mar 7, 2025 · Updated last year
taoshen58 / LexMAE
☆21 · Apr 17, 2023 · Updated 3 years ago
santhoshtr / Manjari
Manjari Malayalam Font.
☆10 · Sep 29, 2023 · Updated 2 years ago
aws-samples / amazon-opensearch-service-monitor
This repository contains a step-by-step demonstration of setting up a monitoring stack for Amazon OpenSearch Service domains across all specified …
☆38 · Oct 27, 2023 · Updated 2 years ago
ejaasaari / lemur
[ICML'26] LEMUR reduces multi-vector retrieval for late interaction models such as ColBERT into regular single-vector retrieval.
☆31 · Jun 21, 2026 · Updated last month
rnkn / side-hustle
Hustle through a buffer's Imenu in a side window in GNU Emacs
☆26 · Jun 28, 2024 · Updated 2 years ago
mindset-ai / memex-ai
Memex AI full source code
☆29 · Updated this week
hltcoe / rank-k
Repository for the listwise reranker Rank-K
☆16 · May 23, 2025 · Updated last year
gauthamsuresh09 / wav2vec2-large-xlsr-53-malayalam
Wav2vec2 Large XLSR 53 fine-tuned for Malayalam
☆11 · Sep 7, 2021 · Updated 4 years ago
iPieter / llmq
A Scheduler for Batched LLM Inference
☆19 · Oct 5, 2025 · Updated 9 months ago
huangd1999 / EffiLearner
[NeurIPS 2024] Self-Optimization Improves the Efficiency of Code Generation
☆15 · May 10, 2025 · Updated last year
omarkamali / borgllm
A zero-config OpenAI client with support for 20+ providers, API key rotation, rate limits, optional LangChain integration and more.
☆19 · Dec 11, 2025 · Updated 7 months ago
reddy-lab-code-research / MuST-CoST
Code and data for AAAI 2022 paper "Multilingual Code Snippets Training for Program Translation"
☆11 · Mar 7, 2022 · Updated 4 years ago
lahoramaker / awesome-lmstudio
☆15 · Nov 13, 2023 · Updated 2 years ago
Ahmedfir / mBERTa
CodeBERT-based mutation testing tool.
☆13 · Nov 10, 2025 · Updated 8 months ago
gaiusyu / Denum
A log compression tool (ASE2024)
☆17 · Apr 15, 2025 · Updated last year
iSEngLab / LLM4UT_Empirical
[ISSTA 2025] A Large-scale Empirical Study on Fine-tuning Large Language Models for Unit Testing
☆13 · Feb 9, 2025 · Updated last year
Lorp / renderstack
Web app that renders fonts using multiple renderers, including Samsa, Fontkit, Harfbuzz
☆11 · Jan 18, 2024 · Updated 2 years ago
dorpxam / einops-cpp
C++17 implementation of einops for libtorch - clear and reliable tensor manipulations with einstein-like notation
☆12 · Oct 16, 2023 · Updated 2 years ago
mlj / ruby-sfst
A wrapper for the Stuttgart Finite State Transducer Tools (SFST).
☆15 · May 27, 2020 · Updated 6 years ago
JohnnyPeng18 / Coffe
A Code Efficiency Benchmark for Code Generation
☆14 · May 26, 2025 · Updated last year
neighthan / gpu-utils
Utility functions/scripts for working with GPUs.
☆10 · Jul 5, 2021 · Updated 5 years ago
macTracyHuang / NTU-ML2022-Spring
Homework implementations for https://github.com/virginiakm1988/ML2022-Spring
☆13 · Jan 28, 2023 · Updated 3 years ago
hanxiao / embedding-compatibility-adapters
Bridge incompatible embedding spaces with a single SVD. When your embedding provider deprecates a model, adapt instead of re-embedding.
☆36 · Apr 28, 2026 · Updated 2 months ago
santhoshtr / telegram-rss-reader
A Telegram bot to read RSS feeds
☆14 · Jul 26, 2024 · Updated last year
magic-YuanTian / STEPS
Interactive SQL generation via editable step-by-step explanation
☆17 · Sep 12, 2024 · Updated last year
Shivanshu-Gupta / icl-coverage
☆13 · Mar 5, 2024 · Updated 2 years ago