brave-experiments / MELT-public
Codebase for "MELTing Point: Mobile Evaluation of Language Transformers"
☆18 · Updated last year
Alternatives and similar repositories for MELT-public
Users interested in MELT-public are comparing it to the libraries listed below.
- One-size-fits-all model for mobile AI, a novel paradigm for mobile AI in which the OS and hardware co-manage a foundation model that is c… ☆29 · Updated last year
- Survey Paper List - Efficient LLM and Foundation Models ☆259 · Updated last year
- ☆25 · Updated last year
- EE-LLM is a framework for large-scale training and inference of early-exit (EE) large language models (LLMs). ☆73 · Updated last year
- (HotMobile'24) Salted Inference: Enhancing Privacy while Maintaining Efficiency of Split Inference in Mobile Computing ☆17 · Updated last year
- Simulation framework for accelerating research in Private Federated Learning ☆347 · Updated 2 months ago
- Official code for the paper "Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark" ☆27 · Updated 6 months ago
- A canonical source of GenAI energy benchmarks and measurements ☆50 · Updated last month
- Efficient LLM Inference Acceleration using Prompting ☆51 · Updated last year
- Enhancing Efficiency in Multidevice Federated Learning through Data Selection ☆13 · Updated last year
- Awesome Mobile LLMs ☆288 · Updated last month
- Libraries for efficient and scalable group-structured dataset pipelines. ☆25 · Updated 6 months ago
- Measure and optimize the energy consumption of your AI applications! ☆326 · Updated last week
- Code for CVPR24 Paper - Resource-Efficient Transformer Pruning for Finetuning of Large Models ☆12 · Updated 2 months ago
- ☆213 · Updated last year
- ☆107 · Updated last week
- Code for studying the super weight in LLM ☆121 · Updated last year
- FL_PyTorch: Optimization Research Simulator for Federated Learning ☆35 · Updated 2 years ago
- ☆102 · Updated last year
- ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization ☆111 · Updated last year
- Compressing Large Language Models using Low Precision and Low Rank Decomposition ☆105 · Updated last month
- Repo for the paper: PerAda: Parameter-Efficient Federated Learning Personalization with Generalization Guarantees (CVPR 2024) ☆22 · Updated last year
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ☆351 · Updated 8 months ago
- ☆21 · Updated last year
- [NeurIPS 24 Spotlight] MaskLLM: Learnable Semi-structured Sparsity for Large Language Models ☆183 · Updated last year
- The official implementation of TinyTrain [ICML '24] ☆23 · Updated last year
- This is a list of awesome edge-AI inference related papers. ☆98 · Updated 2 years ago
- [ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs ☆123 · Updated 6 months ago
- [ACL 2024] Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models ☆113 · Updated last year
- Code for paper "ElasticTrainer: Speeding Up On-Device Training with Runtime Elastic Tensor Selection" (MobiSys'23) ☆14 · Updated 2 years ago