microsoft / MoPQ
☆12 · Updated 2 years ago
Related projects
Alternatives and complementary repositories for MoPQ
- Repo for WWW 2022 paper: Progressively Optimized Bi-Granular Document Representation for Scalable Embedding Based Retrieval ☆15 · Updated 2 years ago
- Efficient Retrieval with Learned Similarities ☆13 · Updated 3 months ago
- Official code for "Binary embedding based retrieval at Tencent" ☆42 · Updated 8 months ago
- The Efficiency Spectrum of LLM ☆52 · Updated 11 months ago
- [ACL 2024] RelayAttention for Efficient Large Language Model Serving with Long System Prompts ☆34 · Updated 8 months ago
- Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main) ☆74 · Updated 3 weeks ago
- Implementation of "RankCSE: Unsupervised Sentence Representation Learning via Learning to Rank" (ACL 2023) ☆46 · Updated 7 months ago
- TSDG: An efficient index graph for graph-based nearest neighbor search ☆9 · Updated 2 years ago
- Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method; GKD: A General Knowledge Distillation… ☆31 · Updated last year
- Odysseus: Playground of LLM Sequence Parallelism ☆53 · Updated 4 months ago
- Dual Cross Encoder for Dense Retrieval ☆16 · Updated last year
- This PyTorch package implements MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation (NAACL 2022). ☆97 · Updated 2 years ago
- Code for Scaling Laws of RoPE-based Extrapolation ☆70 · Updated last year
- [ACL 2024] Long-Context Language Modeling with Parallel Encodings ☆142 · Updated 4 months ago
- The official code for Dropping Backward Propagation (DropBP) ☆24 · Updated last week
- Inference framework for MoE layers based on TensorRT with Python binding ☆41 · Updated 3 years ago
- [WWW 2024] The official repo for paper "Scalable and Effective Generative Information Retrieval". ☆51 · Updated 6 months ago
- Official repository for MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models ☆47 · Updated last month
- A plug-in of Microsoft DeepSpeed to fix the bug of DeepSpeed pipeline ☆26 · Updated 3 years ago
- Long Context Extension and Generalization in LLMs ☆39 · Updated last month
- An Experiment on Dynamic NTK Scaling RoPE ☆61 · Updated 11 months ago
- Implementation of Speculative Sampling as described in "Accelerating Large Language Model Decoding with Speculative Sampling" by Deepmind ☆79 · Updated 8 months ago