A curated list of early-exiting papers (LLM, CV, NLP, etc.)
☆74Aug 21, 2024Updated last year
Alternatives and similar repositories for early-exit-papers
Users that are interested in early-exit-papers are comparing it to the libraries listed below.
- A curated list of Early Exiting papers, benchmarks, and misc.☆119Oct 26, 2023Updated 2 years ago
- EE-LLM is a framework for large-scale training and inference of early-exit (EE) large language models (LLMs).☆81Jun 14, 2024Updated 2 years ago
- PyTorch implementation of the paper: Multi-Agent Collaborative Inference via DNN Decoupling: Intermediate Feature Compression and Edge Le…☆47Oct 26, 2023Updated 2 years ago
- Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding (EMNLP 2023 Long)☆66Sep 28, 2024Updated last year
- [ICML 2022] Training Your Sparse Neural Network Better with Any Mask. Ajay Jaiswal, Haoyu Ma, Tianlong Chen, Ying Ding, and Zhangyang Wang☆30Jul 24, 2022Updated 3 years ago
- Code for the CVPR 2023 Paper: "Real-Time Evaluation in Online Continual Learning: A New Hope"☆20Aug 19, 2024Updated last year
- PyTorch implementation of the paper: Decomposing Vision Transformers for Collaborative Inference in Edge Devices☆18Jul 27, 2024Updated last year
- ☆22Mar 16, 2024Updated 2 years ago
- Work in progress LLM framework.☆16Oct 31, 2024Updated last year
- Official implementation of "Diffusion Language Models Know the Answer Before Decoding"☆56Apr 28, 2026Updated last month
- ☆15Apr 11, 2024Updated 2 years ago
- bitfusion verilog implementation☆13Feb 21, 2022Updated 4 years ago
- [NeurIPS 2021] Sparse Training via Boosting Pruning Plasticity with Neuroregeneration☆31Feb 11, 2023Updated 3 years ago
- ☆13Jun 29, 2024Updated last year
- ☆17Jun 13, 2022Updated 4 years ago
- [ACM MobiCom '23] Cost-effective On-device Continual Learning over Memory Hierarchy with Miro☆16Feb 1, 2024Updated 2 years ago
- [AAAI 2023] Scalable Attributed-Graph Subspace Clustering☆12Jul 16, 2023Updated 2 years ago
- Python package for Dec-POMDP files in the .dpomdp format☆11Oct 28, 2022Updated 3 years ago
- Fire Together Wire Together: A Dynamic Pruning Approach with Self-Supervised Mask Prediction☆10May 25, 2022Updated 4 years ago
- ☆16Oct 3, 2023Updated 2 years ago
- *Practical master course for ROS beginners*☆10Jul 20, 2020Updated 5 years ago
- [UbiComp 2020] Intermittent Learning: On-Device Machine Learning on Intermittently Powered System☆10May 28, 2020Updated 6 years ago
- ☆40Jan 10, 2026Updated 5 months ago
- A curated reading list of research in Adaptive Computation, Inference-Time Computation & Mixture of Experts (MoE).☆163Jan 1, 2025Updated last year
- APAR: LLMs Can Do Auto-Parallel Auto-Regressive Decoding☆14Jul 22, 2024Updated last year
- [Findings@ACL'26] LLMRouterBench: A Massive Benchmark and Unified Framework for LLM Routing☆68Apr 6, 2026Updated 2 months ago
- A benchmark for evaluating situated inductive reasoning☆16Jan 7, 2025Updated last year
- Survey Paper List - Efficient LLM and Foundation Models☆265Sep 22, 2024Updated last year
- Vocabulary Parallelism☆26Mar 10, 2025Updated last year
- [ECMLPKDD 2020] "Topological Insights into Sparse Neural Networks"☆13May 2, 2022Updated 4 years ago
- [ICLR 2025] γ-MOD: Mixture-of-Depth Adaptation for Multimodal Large Language Models☆43Oct 28, 2025Updated 7 months ago
- Revisiting Test Time Adaptation Under Online Evaluation☆36May 2, 2024Updated 2 years ago
- Explore, Establish, Exploit: Red Teaming Language Models from Scratch☆15Jun 21, 2023Updated 2 years ago
- ☆22Oct 22, 2024Updated last year
- [MobiSys 2020] Fast and Scalable In-memory Deep Multitask Learning via Neural Weight Virtualization☆15Jun 9, 2020Updated 6 years ago
- ☆12Apr 23, 2026Updated last month
- This is the official code for UGTs.☆13Feb 8, 2023Updated 3 years ago
- ☆10Apr 20, 2023Updated 3 years ago
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024☆372Apr 13, 2026Updated 2 months ago