brave-experiments / MELT-public
Codebase for "MELTing Point: Mobile Evaluation of Language Transformers"
☆18 · Updated last year
Alternatives and similar repositories for MELT-public
Users interested in MELT-public are comparing it to the libraries listed below
- One-size-fits-all model for mobile AI, a novel paradigm for mobile AI in which the OS and hardware co-manage a foundation model that is c… ☆29 · Updated last year
- Survey Paper List - Efficient LLM and Foundation Models ☆257 · Updated last year
- The official implementation of TinyTrain [ICML '24] ☆22 · Updated last year
- ☆83 · Updated last month
- Source code and datasets for Ekya, a system for continuous learning on the edge. ☆107 · Updated 3 years ago
- ☆208 · Updated last year
- A list of awesome edge-AI inference-related papers. ☆98 · Updated last year
- (HotMobile'24) Salted Inference: Enhancing Privacy while Maintaining Efficiency of Split Inference in Mobile Computing ☆17 · Updated last year
- zTT: Learning-based DVFS with Zero Thermal Throttling for Mobile Devices [MobiSys'21] - Artifact Evaluation ☆26 · Updated 4 years ago
- ☆25 · Updated last year
- ☆20 · Updated 2 years ago
- [ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs ☆115 · Updated 2 months ago
- ☆100 · Updated last year
- How much energy do GenAI models consume? ☆47 · Updated 4 months ago
- A curated list of early-exiting papers (LLM, CV, NLP, etc.) ☆62 · Updated last year
- EE-LLM is a framework for large-scale training and inference of early-exit (EE) large language models (LLMs). ☆68 · Updated last year
- Code for the paper "ElasticTrainer: Speeding Up On-Device Training with Runtime Elastic Tensor Selection" (MobiSys'23) ☆13 · Updated last year
- [MobiCom '23] AccuMO: Accuracy-Centric Multitask Offloading in Edge-Assisted Mobile Augmented Reality ☆17 · Updated last year
- Awesome Mobile LLMs ☆246 · Updated last week
- ☆16 · Updated last year
- GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM ☆168 · Updated last year
- ☆58 · Updated 9 months ago
- Proof-of-concept CPU implementation of ASPEN used for the NeurIPS'23 paper ASPEN: Breaking Operator Barriers for Efficient Pa… ☆12 · Updated last year
- Compression for Foundation Models ☆35 · Updated 2 months ago
- ☆30 · Updated 2 years ago
- Official implementation of SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks ☆37 · Updated 7 months ago
- Measure and optimize the energy consumption of your AI applications! ☆291 · Updated last month
- A curated list of high-quality papers on resource-efficient LLMs 🌱 ☆139 · Updated 6 months ago
- Compressing Large Language Models using Low Precision and Low Rank Decomposition ☆99 · Updated 9 months ago
- ☆130 · Updated 11 months ago