NoakLiu / LLMEasyQuant
A Serving System for Distributed and Parallel LLM Quantization [Efficient ML System]
☆26 Updated 4 months ago
Alternatives and similar repositories for LLMEasyQuant
Users who are interested in LLMEasyQuant compare it to the libraries listed below.
- GraphSnapShot: Caching Local Structure for Fast Graph Learning [Efficient ML System] ☆40 Updated last month
- Zeroth-Order Fine-Tuning of LLMs in Random Subspaces (ICCV 2025) ☆15 Updated 11 months ago
- [TKDE'25] The official GitHub page for the survey paper "A Survey on Mixture of Experts in Large Language Models". ☆441 Updated 3 months ago
- Official code implementation for 2025 ICLR accepted paper "Dobi-SVD : Differentiable SVD for LLM Compression and Some New Perspectives" ☆47 Updated 3 weeks ago
- Pytorch Implementation of "Multi-Level Optimal Transport for Universal Cross-Tokenizer Knowledge Distillation on Language Models", AAAI 2… ☆37 Updated 6 months ago
- Adaptive Topology Reconstruction for Robust Graph Representation Learning [Efficient ML Model] ☆10 Updated 8 months ago
- Official implementation of the ICLR paper "Streamlining Redundant Layers to Compress Large Language Models" ☆31 Updated 6 months ago
- 📰 Must-read papers on KV Cache Compression (constantly updating 🤗). ☆593 Updated last month
- VisionSelector: End-to-End Learnable Visual Token Compression for Efficient Multimodal LLMs ☆34 Updated 2 weeks ago
- [ICML 2024] PyTorch implementation for "Diversified Batch Selection for Training Acceleration" ☆10 Updated last year
- A curated list for Efficient Large Language Models ☆1,891 Updated 4 months ago
- a curated list of high-quality papers on resource-efficient LLMs 🌱 ☆146 Updated 7 months ago
- [ICML'24] Official code for the paper "Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark". ☆113 Updated 4 months ago
- Official implementation of MASS: Multi-Agent Simulation Scaling for Portfolio Construction ☆151 Updated last month
- [TMLR 2024] Efficient Large Language Models: A Survey ☆1,227 Updated 4 months ago
- [ICCV 2025 Highlight] Rep-MTL: Unleashing the Power of Representation-level Task Saliency for Multi-Task Learning ☆22 Updated 3 months ago
- Survey Paper List - Efficient LLM and Foundation Models ☆258 Updated last year
- [TMLR 2025] Efficient Reasoning Models: A Survey ☆275 Updated last week
- [ECCV 2024] SparseRefine: Sparse Refinement for Efficient High-Resolution Semantic Segmentation ☆15 Updated 10 months ago
- Efficient Foundation Model Design: A Perspective From Model and System Co-Design [Efficient ML System & Model] ☆25 Updated 8 months ago
- ☆24 Updated 11 months ago
- An automated feature engineering framework 'FETCH' accepted in ICLR 2023. ☆11 Updated 2 years ago
- ☆37 Updated 3 years ago
- official implementation of β-DARTS: Beta-Decay Regularization for Differentiable Architecture Search (CVPR22 oral). ☆86 Updated 3 years ago
- [ICCV 2025] EA-ViT: Efficient Adaptation for Elastic Vision Transformer ☆23 Updated 3 months ago
- Implementation of Switch Transformers from the paper: "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficien… ☆127 Updated 2 weeks ago
- Official implementation of ICLR 2025 'LORO: Parameter and Memory Efficient Pretraining via Low-rank Riemannian Optimization' ☆13 Updated 6 months ago
- [SIGIR'24] The official implementation code of MOELoRA. ☆34 Updated last year
- [ICLR 2025] Mixture Compressor for Mixture-of-Experts LLMs Gains More ☆61 Updated 8 months ago
- [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod… ☆38 Updated last year