facebookresearch / MobileLLM
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024.
★1,404 · Updated 8 months ago
Alternatives and similar repositories for MobileLLM
Users interested in MobileLLM are comparing it to the libraries listed below.
- MINT-1T: A one trillion token multimodal interleaved dataset ★827 · Updated last year
- Everything about the SmolLM and SmolVLM family of models ★3,552 · Updated last month
- [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling ★939 · Updated last month
- TinyChatEngine: On-Device LLM Inference Library ★936 · Updated last year
- DataComp for Language Models ★1,404 · Updated 4 months ago
- Reaching LLaMA2 Performance with 0.1M Dollars ★988 · Updated last year
- Minimalistic large language model 3D-parallelism training ★2,411 · Updated last month
- OLMoE: Open Mixture-of-Experts Language Models ★945 · Updated 3 months ago
- [ICLR-2025-SLLM Spotlight] MobiLlama: Small Language Model tailored for edge devices ★669 · Updated 8 months ago
- TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones ★1,306 · Updated last year
- Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR ★2,080 · Updated last year
- Recipes to scale inference-time compute of open models ★1,123 · Updated 7 months ago
- Official implementation of Half-Quadratic Quantization (HQQ) ★905 · Updated 3 weeks ago
- [NeurIPS'24 Spotlight, ICLR'25, ICML'25] To speed up long-context LLM inference, approximate and dynamic sparse calculation of the attention… ★1,171 · Updated 3 months ago
- Open-weights language model from Google DeepMind, based on Griffin ★660 · Updated 7 months ago
- VPTQ: A flexible and extreme low-bit quantization algorithm ★671 · Updated 8 months ago
- A modern model graph visualizer and debugger ★1,365 · Updated this week
- [MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration ★3,414 · Updated 5 months ago
- [ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding ★1,314 · Updated 10 months ago
- Implementing DeepSeek R1's GRPO algorithm from scratch ★1,740 · Updated 8 months ago
- Official inference library for pre-processing of Mistral models ★846 · Updated last week
- Reference implementation of the Megalodon 7B model ★528 · Updated 7 months ago
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection ★1,637 · Updated last year
- A family of open-sourced Mixture-of-Experts (MoE) Large Language Models ★1,652 · Updated last year
- Serving multiple LoRA-finetuned LLMs as one ★1,131 · Updated last year
- llama3.np: a pure NumPy implementation of the Llama 3 model ★993 · Updated 8 months ago
- Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI ★1,406 · Updated last year
- Scalable data pre-processing and curation toolkit for LLMs ★1,340 · Updated this week
- Sky-T1: Train your own O1-preview model within $450 ★3,367 · Updated 6 months ago
- Muon is Scalable for LLM Training ★1,397 · Updated 5 months ago