microsoft/MoPQ
☆12 · Updated 2 years ago
Related projects
Alternatives and complementary repositories for MoPQ
- Repo for the WWW 2022 paper "Progressively Optimized Bi-Granular Document Representation for Scalable Embedding Based Retrieval" ☆15 · Updated 2 years ago
- ☆24 · Updated last year
- This package implements THOR: Transformer with Stochastic Experts. ☆61 · Updated 3 years ago
- ☆72 · Updated last year
- ☆66 · Updated 2 years ago
- Official code for "Binary embedding based retrieval at Tencent" ☆42 · Updated 8 months ago
- Retrieval with Learned Similarities ☆15 · Updated this week
- ☆42 · Updated 3 years ago
- Odysseus: Playground of LLM Sequence Parallelism ☆57 · Updated 5 months ago
- Inference framework for MoE layers based on TensorRT with Python binding ☆41 · Updated 3 years ago
- [ACL 2024] RelayAttention for Efficient Large Language Model Serving with Long System Prompts ☆34 · Updated 8 months ago
- Official repository for "MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models" [NeurIPS 2024] ☆49 · Updated last week
- ☆13 · Updated last year
- [ACL 2024] Long-Context Language Modeling with Parallel Encodings ☆147 · Updated 5 months ago
- Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main) ☆77 · Updated last month
- ☆109 · Updated 4 months ago
- ☆64 · Updated 7 months ago
- ☆88 · Updated last month
- ☆42 · Updated 6 months ago
- Code for "Scaling Laws of RoPE-based Extrapolation" ☆70 · Updated last year
- [AAAI 2024] Fluctuation-based Adaptive Structured Pruning for Large Language Models ☆37 · Updated 10 months ago
- The official implementation of the paper "SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction" ☆38 · Updated last month
- ☆18 · Updated 6 months ago
- Implementation of speculative sampling as described in "Accelerating Large Language Model Decoding with Speculative Sampling" by DeepMind ☆82 · Updated 8 months ago
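The speculative sampling entry above refers to DeepMind's draft-then-verify scheme: a cheap draft model proposes k tokens, and the target model accepts each with probability min(1, p/q), resampling from the renormalized residual distribution max(0, p − q) on rejection. A minimal toy sketch of that accept/reject loop follows; the `draft_probs`/`target_probs` distributions and the 3-token vocabulary are invented stand-ins, not any listed repo's API:

```python
import random

# Hypothetical toy "models": fixed distributions over a 3-token vocabulary,
# standing in for a small draft LM (cheap, approximate) and a large target LM.
VOCAB = [0, 1, 2]

def draft_probs(ctx):
    return {0: 0.5, 1: 0.3, 2: 0.2}  # cheap, slightly off

def target_probs(ctx):
    return {0: 0.4, 1: 0.4, 2: 0.2}  # expensive, "true" distribution

def sample(dist):
    # Sample a token from a {token: prob} dict.
    r, acc = random.random(), 0.0
    for tok, p in dist.items():
        acc += p
        if r < acc:
            return tok
    return tok

def speculative_step(ctx, k=4):
    """Draft k tokens, then accept each with prob min(1, p/q);
    on rejection, resample from the residual max(0, p - q) and stop."""
    drafted, c = [], list(ctx)
    for _ in range(k):
        t = sample(draft_probs(tuple(c)))
        drafted.append(t)
        c.append(t)
    accepted, c = [], list(ctx)
    for t in drafted:
        p = target_probs(tuple(c))[t]
        q = draft_probs(tuple(c))[t]
        if random.random() < min(1.0, p / q):
            accepted.append(t)
            c.append(t)
        else:
            # Rejected: resample from the renormalized residual distribution.
            resid = {v: max(0.0, target_probs(tuple(c))[v] - draft_probs(tuple(c))[v])
                     for v in VOCAB}
            z = sum(resid.values())
            if z > 0:
                accepted.append(sample({v: pv / z for v, pv in resid.items()}))
            break
    return accepted
```

This accept/reject rule makes the accepted tokens exactly distributed as if sampled from the target model alone; the speedup comes from verifying all k drafted positions in one target-model pass.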
- Implementation of "IceFormer: Accelerated Inference with Long-Sequence Transformers on CPUs" (ICLR 2024) ☆20 · Updated 5 months ago
- "Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method"; GKD: A General Knowledge Distillation… ☆31 · Updated last year
- Code for the paper "RankingGPT: Empowering Large Language Models in Text Ranking with Progressive Enhancement" ☆29 · Updated 10 months ago
- Source code for the COLING 2022 paper "Automatic Label Sequence Generation for Prompting Sequence-to-sequence Models" ☆24 · Updated 2 years ago
- Source code for the IJCAI 2022 long paper "Parameter-Efficient Sparsity for Large Language Models Fine-Tuning" ☆13 · Updated 2 years ago
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆69 · Updated last year