microsoft / MoPQ
☆12 Updated 3 years ago
Alternatives and similar repositories for MoPQ:
Users who are interested in MoPQ are comparing it to the libraries listed below:
- Repo for WWW 2022 paper: Progressively Optimized Bi-Granular Document Representation for Scalable Embedding Based Retrieval ☆15 Updated 2 years ago
- ☆24 Updated last year
- ☆72 Updated last year
- Retrieval with Learned Similarities (RAILS, http://arxiv.org/abs/2407.15462) ☆20 Updated last month
- ☆66 Updated 2 years ago
- This package implements THOR: Transformer with Stochastic Experts. ☆61 Updated 3 years ago
- Long Context Extension and Generalization in LLMs ☆40 Updated 3 months ago
- ☆43 Updated 3 years ago
- [WWW 2024] The official repo for paper "Scalable and Effective Generative Information Retrieval". ☆54 Updated 8 months ago
- Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main) ☆85 Updated 3 months ago
- Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval ☆39 Updated 2 months ago
- An Experiment on Dynamic NTK Scaling RoPE ☆62 Updated last year
- Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method; GKD: A General Knowledge Distillation… ☆31 Updated last year
- A toolkit for building dense retrievers with deep language models. ☆55 Updated 3 years ago
- Differentiable Product Quantization for End-to-End Embedding Compression. ☆59 Updated 2 years ago
- ☆93 Updated 3 months ago
- Source code for COLING 2022 paper "Automatic Label Sequence Generation for Prompting Sequence-to-sequence Models" ☆24 Updated 2 years ago
- This PyTorch package implements MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation (NAACL 2022). ☆97 Updated 2 years ago
- BANG is a new pretraining model to Bridge the gap between Autoregressive (AR) and Non-autoregressive (NAR) Generation. AR and NAR generat… ☆28 Updated 2 years ago
- [ACL 2024] RelayAttention for Efficient Large Language Model Serving with Long System Prompts ☆38 Updated 10 months ago
- Official PyTorch implementation of IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact ☆38 Updated 7 months ago
- The official repository for "Bridging the Gap Between Indexing and Retrieval for Differentiable Search Index with Query Generation", Shen… ☆117 Updated last year
- ☆20 Updated last week
- The Efficiency Spectrum of LLM ☆52 Updated last year
- CIKM'21: JPQ substantially improves the efficiency of Dense Retrieval with 30x compression ratio, 10x CPU speedup and 2x GPU speedup. ☆51 Updated 2 years ago
- ConTextual Mask Auto-Encoder for Dense Passage Retrieval ☆35 Updated 2 months ago
- Implementation of "RankCSE: Unsupervised Sentence Representation Learning via Learning to Rank" (ACL 2023) ☆47 Updated 10 months ago
- Pytorch implementation of paper "Efficient Nearest Neighbor Language Models" (EMNLP 2021) ☆71 Updated 2 years ago
- WSDM'22 Best Paper: Learning Discrete Representations via Constrained Clustering for Effective and Efficient Dense Retrieval ☆119 Updated 5 months ago
- [NeurIPS 2024] The official implementation of "Kangaroo: Lossless Self-Speculative Decoding for Accelerating LLMs via Double Early Exitin… ☆47 Updated 6 months ago