pittisl / ElasticTrainerLinks

Code for paper "ElasticTrainer: Speeding Up On-Device Training with Runtime Elastic Tensor Selection" (MobiSys'23)

☆13

Alternatives and similar repositories for ElasticTrainer

Users that are interested in ElasticTrainer are comparing it to the libraries listed below

Sorting:

cmd2001 / KVTuner
KVTuner: Sensitivity-Aware Layer-wise Mixed Precision KV Cache Quantization for Efficient and Nearly Lossless LLM Inference
☆15Updated 2 months ago
Qualcomm-AI-research / gptvq
☆32Updated last year
SLDGroup / GradientFilter-CVPR23
☆12Updated last year
ModelTC / QLLM
[ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod…
☆39Updated last year
Kyrie-Zhao / awesome-real-time-AI
This is a list of awesome edgeAI inference related papers.
☆96Updated last year
IntelLabs / Hardware-Aware-Automated-Machine-Learning
☆61Updated last month
MingSun-Tse / Why-the-State-of-Pruning-so-Confusing
[Preprint] Why is the State of Neural Network Pruning so Confusing? On the Fairness, Comparison Setup, and Trainability in Network Prunin…
☆40Updated 2 years ago
eis-lab / sage
Experimental deep learning framework written in Rust
☆15Updated 2 years ago
iamkanghyunchoi / ait
It's All In the Teacher: Zero-Shot Quantization Brought Closer to the Teacher [CVPR 2022 Oral]
☆29Updated 2 years ago
1hunters / LIMPQ
Official implementation for ECCV 2022 paper LIMPQ - "Mixed-Precision Neural Network Quantization via Learned Layer-wise Importance"
☆56Updated 2 years ago
Gaffey / ExCP
Official implementation of ICML 2024 paper "ExCP: Extreme LLM Checkpoint Compression via Weight-Momentum Joint Shrinking".
☆48Updated last year
HuangOwen / RoLoRA
[EMNLP 2024] RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization
☆36Updated 9 months ago
MAC-AutoML / OMPQ
☆25Updated 3 years ago
fmfi-compbio / admm-pruning
☆28Updated 11 months ago
UoS-EEC / DynamicOFA
[CVPRW 2021] Dynamic-OFA: Runtime DNN Architecture Switching for Performance Scaling on Heterogeneous Embedded Platforms
☆29Updated 2 years ago
yxli2123 / LoSparse
☆58Updated last year
NVlabs / SMCP
☆21Updated 2 years ago
SNU-ARC / any-precision-llm
[ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs
☆111Updated last week
JarvisPei / CMoE
Implementation for the paper: CMoE: Fast Carving of Mixture-of-Experts for Efficient LLM Inference
☆22Updated 4 months ago
xudoong / EdgeVisionTransformer
To deploy Transformer models in CV to mobile devices.
☆18Updated 3 years ago
HayeonLee / HELP
Official PyTorch Implementation of HELP: Hardware-adaptive Efficient Latency Prediction for NAS via Meta-Learning (NeurIPS 2021 Spotlight…
☆63Updated 11 months ago
ModelTC / awesome-lm-system
Summary of system papers/frameworks/codes/tools on training or serving large model
☆57Updated last year
TianjinYellow / StableSPAM
☆22Updated 3 months ago
shawnricecake / search-llm
[NeurIPS 2024] Search for Efficient LLMs
☆14Updated 6 months ago
L1aoXingyu / llm-infer-bench
☆11Updated last year
BaiTheBest / SparseLLM
Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024)
☆64Updated 3 months ago
imagination-research / LCSC
[ICLR 2025] Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better
☆15Updated 5 months ago
microsoft / chunk-attention
☆77Updated 2 months ago
SqueezeAILab / SqueezedAttention
SQUEEZED ATTENTION: Accelerating Long Prompt LLM Inference
☆50Updated 7 months ago
zhuzilin / pytorch-malloc
An external memory allocator example for PyTorch.
☆14Updated 3 years ago