microsoft / MoPQ
☆12 Updated 3 years ago
Alternatives and similar repositories for MoPQ
Users who are interested in MoPQ are comparing it to the libraries listed below.
- ☆74 Updated 2 years ago
- Repo for WWW 2022 paper: Progressively Optimized Bi-Granular Document Representation for Scalable Embedding Based Retrieval ☆15 Updated 3 years ago
- Retrieval with Learned Similarities (http://arxiv.org/abs/2407.15462, WWW'25 Oral) ☆43 Updated last month
- ☆24 Updated last year
- Official code for "Binary embedding based retrieval at Tencent" ☆43 Updated last year
- This package implements THOR: Transformer with Stochastic Experts. ☆63 Updated 3 years ago
- [NeurIPS 2023] Model-enhanced Vector Index ☆26 Updated last year
- ☆44 Updated 3 years ago
- ☆66 Updated 2 years ago
- [WWW 2024] The official repo for paper "Scalable and Effective Generative Information Retrieval". ☆55 Updated last year
- The official repository for "Bridging the Gap Between Indexing and Retrieval for Differentiable Search Index with Query Generation", Shen… ☆120 Updated last year
- Inference framework for MoE layers based on TensorRT with Python binding ☆41 Updated 4 years ago
- WSDM'22 Best Paper: Learning Discrete Representations via Constrained Clustering for Effective and Efficient Dense Retrieval ☆120 Updated 10 months ago
- Best practices for testing advanced Mixtral, DeepSeek, and Qwen series MoE models using Megatron Core MoE. ☆17 Updated this week
- CIKM'21: JPQ substantially improves the efficiency of Dense Retrieval with 30x compression ratio, 10x CPU speedup and 2x GPU speedup. ☆52 Updated 3 years ago
- Implement BERT in pure C++ ☆36 Updated 5 years ago
- Odysseus: Playground of LLM Sequence Parallelism ☆70 Updated 11 months ago
- Official PyTorch implementation of "IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact" ☆44 Updated last year
- An all-in-one framework for Ad-hoc Information Retrieval. ☆18 Updated last year
- ☆34 Updated 11 months ago
- [ACL 2024] Long-Context Language Modeling with Parallel Encodings ☆153 Updated 11 months ago
- Repository of LV-Eval Benchmark ☆65 Updated 9 months ago
- Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method ; GKD: A General Knowledge Distillation… ☆32 Updated last year
- Dynamic Context Selection for Efficient Long-Context LLMs ☆28 Updated 2 weeks ago
- A huggingface transformers implementation of "Transformer Memory as a Differentiable Search Index" ☆172 Updated 2 years ago
- CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models ☆40 Updated last year
- PyTorch implementation of the paper "Efficient Nearest Neighbor Language Models" (EMNLP 2021) ☆72 Updated 3 years ago
- Must-read papers on improving efficiency for pre-trained language models. ☆103 Updated 2 years ago
- Code for our paper "Speculative Decoding: Exploiting Speculative Execution for Accelerating Seq2seq Generation" (EMNLP 2023 Findings) ☆41 Updated last year
- ☆22 Updated 4 years ago