brave-experiments / MELT-public
Codebase for "MELTing Point: Mobile Evaluation of Language Transformers"
☆18 · Updated last year
Alternatives and similar repositories for MELT-public
Users interested in MELT-public are comparing it to the repositories listed below.
- One-size-fits-all model for mobile AI, a novel paradigm for mobile AI in which the OS and hardware co-manage a foundation model that is c… ☆29 · Updated last year
- The official implementation of TinyTrain [ICML '24] ☆22 · Updated last year
- EE-LLM is a framework for large-scale training and inference of early-exit (EE) large language models (LLMs). ☆69 · Updated last year
- Measure and optimize the energy consumption of your AI applications! ☆307 · Updated 2 weeks ago
- Survey Paper List - Efficient LLM and Foundation Models ☆258 · Updated last year
- Compression for Foundation Models ☆35 · Updated 3 months ago
- How much energy do GenAI models consume? ☆50 · Updated 3 weeks ago
- (HotMobile'24) Salted Inference: Enhancing Privacy while Maintaining Efficiency of Split Inference in Mobile Computing ☆17 · Updated last year
- ☆25 · Updated last year
- Compressing Large Language Models using Low Precision and Low Rank Decomposition ☆104 · Updated 11 months ago
- Prototype of MegaScale-Infer: Serving Mixture-of-Experts at Scale with Disaggregated Expert Parallelism ☆21 · Updated 7 months ago
- Efficient LLM Inference Acceleration using Prompting ☆50 · Updated last year
- Official implementation for Training LLMs with MXFP4 ☆101 · Updated 6 months ago
- A curated list of early exiting (LLM, CV, NLP, etc.) ☆67 · Updated last year
- GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM ☆169 · Updated last year
- LLM checkpointing for DeepSpeed/Megatron ☆21 · Updated 3 weeks ago
- ☆21 · Updated last year
- ☆90 · Updated 3 weeks ago
- Awesome Mobile LLMs ☆267 · Updated 3 weeks ago
- Official code for "SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient" ☆145 · Updated last year
- Simulation framework for accelerating research in Private Federated Learning ☆344 · Updated 2 weeks ago
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ☆344 · Updated 6 months ago
- ☆60 · Updated 11 months ago
- This is a list of awesome edge-AI inference related papers. ☆99 · Updated last year
- Libraries for efficient and scalable group-structured dataset pipelines. ☆25 · Updated 4 months ago
- [ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs ☆121 · Updated 4 months ago
- Source code and datasets for Ekya, a system for continuous learning on the edge. ☆112 · Updated 3 years ago
- The official implementation of the paper "Towards Efficient Mixture of Experts: A Holistic Study of Compression Techniques (TMLR)". ☆79 · Updated 7 months ago
- ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization ☆110 · Updated last year
- [ICML 2024] KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache ☆331 · Updated last month