brave-experiments / MELT-public
codebase for "MELTing Point: Mobile Evaluation of Language Transformers"
☆18 · Updated last year
Alternatives and similar repositories for MELT-public
Users interested in MELT-public are comparing it to the libraries listed below
- One-size-fits-all model for mobile AI, a novel paradigm for mobile AI in which the OS and hardware co-manage a foundation model that is c… ☆28 · Updated last year
- (HotMobile'24) Salted Inference: Enhancing Privacy while Maintaining Efficiency of Split Inference in Mobile Computing ☆17 · Updated last year
- An LLM inference engine, written in C++ ☆15 · Updated last month
- Awesome Mobile LLMs ☆210 · Updated last week
- Simulation framework for accelerating research in Private Federated Learning ☆330 · Updated last month
- Memory Mosaics are networks of associative memories working in concert to achieve a prediction task. ☆46 · Updated 5 months ago
- EE-LLM is a framework for large-scale training and inference of early-exit (EE) large language models (LLMs). ☆65 · Updated last year
- Compression for Foundation Models ☆33 · Updated 3 months ago
- Compressing Large Language Models using Low Precision and Low Rank Decomposition ☆95 · Updated 7 months ago
- ☆57 · Updated 6 months ago
- Training hybrid models for dummies. ☆25 · Updated 6 months ago
- Official code for "SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient" ☆141 · Updated last year
- ☆45 · Updated last year
- Aana SDK is a powerful framework for building AI-enabled multimodal applications. ☆49 · Updated this week
- How much energy do GenAI models consume? ☆45 · Updated 2 months ago
- Official code for the paper "Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark" ☆21 · Updated 2 weeks ago
- ☆23 · Updated last year
- RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best… ☆50 · Updated 4 months ago
- ☆15 · Updated last year
- ☆66 · Updated last month
- Algorithms for approximate attention in LLMs ☆18 · Updated 3 months ago
- ☆79 · Updated 8 months ago
- [ICML'2024] "LoCoCo: Dropping In Convolutions for Long Context Compression", Ruisi Cai, Yuandong Tian, Zhangyang Wang, Beidi Chen ☆17 · Updated 10 months ago
- [EMNLP 2024 Main] Virtual Personas for Language Models via an Anthology of Backstories ☆29 · Updated 8 months ago
- Efficient LLM Inference Acceleration using Prompting ☆48 · Updated 8 months ago
- PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation ☆30 · Updated 8 months ago
- Code for NOLA, an implementation of "NOLA: Compressing LoRA using Linear Combination of Random Basis" ☆56 · Updated 10 months ago
- Samples of good AI-generated CUDA kernels ☆84 · Updated last month
- [ACL 2024] Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models ☆94 · Updated last year
- [NeurIPS 24 Spotlight] MaskLLM: Learnable Semi-structured Sparsity for Large Language Models ☆171 · Updated 6 months ago