brave-experiments / MELT-public
Codebase for "MELTing Point: Mobile Evaluation of Language Transformers"
☆18 · Updated last year
Alternatives and similar repositories for MELT-public
Users interested in MELT-public are comparing it to the libraries listed below.
- One-size-fits-all model for mobile AI, a novel paradigm for mobile AI in which the OS and hardware co-manage a foundation model that is c… ☆29 · Updated last year
- Survey Paper List - Efficient LLM and Foundation Models ☆257 · Updated last year
- EE-LLM is a framework for large-scale training and inference of early-exit (EE) large language models (LLMs). ☆71 · Updated last year
- Awesome Mobile LLMs ☆272 · Updated last week
- Simulation framework for accelerating research in Private Federated Learning ☆345 · Updated last month
- Libraries for efficient and scalable group-structured dataset pipelines. ☆25 · Updated 5 months ago
- Measure and optimize the energy consumption of your AI applications! ☆311 · Updated last week
- A canonical source of GenAI energy benchmarks and measurements ☆50 · Updated last month
- ☆94 · Updated this week
- Compressing Large Language Models using Low Precision and Low Rank Decomposition ☆105 · Updated 11 months ago
- (HotMobile'24) Salted Inference: Enhancing Privacy while Maintaining Efficiency of Split Inference in Mobile Computing ☆17 · Updated last year
- ☆25 · Updated last year
- Efficient LLM Inference Acceleration using Prompting ☆51 · Updated last year
- Compression for Foundation Models ☆34 · Updated 4 months ago
- Federated Learning Systems Paper List ☆75 · Updated last year
- ☆211 · Updated last year
- GPU operators for sparse tensor operations ☆35 · Updated last year
- Enhancing Efficiency in Multidevice Federated Learning through Data Selection ☆13 · Updated last year
- Prototype of MegaScale-Infer: Serving Mixture-of-Experts at Scale with Disaggregated Expert Parallelism ☆23 · Updated 7 months ago
- Official implementation for Training LLMs with MXFP4 ☆109 · Updated 7 months ago
- [NeurIPS 24 Spotlight] MaskLLM: Learnable Semi-structured Sparsity for Large Language Models ☆180 · Updated 10 months ago
- ☆101 · Updated last year
- Official code for "SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient" ☆147 · Updated last year
- ☆21 · Updated last year
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ☆347 · Updated 6 months ago
- The official implementation of TinyTrain [ICML '24] ☆23 · Updated last year
- [ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs ☆121 · Updated 4 months ago
- A list of awesome edge-AI inference papers. ☆98 · Updated last year
- Explorations into some recent techniques surrounding speculative decoding ☆293 · Updated 11 months ago
- Code for studying the super weight in LLMs ☆121 · Updated 11 months ago