UbiquitousLearning / PhoneLM

☆63

Alternatives and similar repositories for PhoneLM

Users who are interested in PhoneLM are comparing it to the repositories listed below.

stevelaskaridis / awesome-mobile-llm
Awesome Mobile LLMs
☆253Updated 2 weeks ago
saic-fi / MobileQuant
[EMNLP Findings 2024] MobileQuant: Mobile-friendly Quantization for On-device Language Models
☆68Updated last year
18907305772 / FuseAI
FuseAI Project
☆87Updated 8 months ago
UbiquitousLearning / SLM_Survey
☆97Updated last year
antimatter15 / reverse-engineering-gemma-3n
Reverse Engineering Gemma 3n: Google's New Edge-Optimized Language Model
☆247Updated 4 months ago
hahnyuan / PB-LLM
PB-LLM: Partially Binarized Large Language Models
☆156Updated last year
hao-ai-lab / Dynasor
[NeurIPS 2025] Simple extension on vLLM to help you speed up reasoning models without training.
☆197Updated 4 months ago
OpenGVLab / EfficientQAT
[ACL 2025 Main] EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
☆306Updated 4 months ago
microsoft / LongRoPE
LongRoPE is a novel method that can extend the context window of pre-trained LLMs to an impressive 2048k tokens.
☆260Updated last year
astramind-ai / Mixture-of-depths
Unofficial implementation for the paper "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"
☆173Updated last year
tiiuae / onebitllms
Lightweight toolkit package to train and fine-tune 1.58bit Language models
☆92Updated 5 months ago
NVlabs / MaskLLM
[NeurIPS 24 Spotlight] MaskLLM: Learnable Semi-structured Sparsity for Large Language Models
☆178Updated 9 months ago
thepowerfuldeez / OLMo
My fork of Allen AI's OLMo for educational purposes.
☆30Updated 10 months ago
xuyuzhuang11 / OneBit
The homepage of OneBit model quantization framework.
☆193Updated 8 months ago
FasterDecoding / BitDelta
☆201Updated 10 months ago
YuchuanTian / RethinkTinyLM
[ICML'24] The official implementation of “Rethinking Optimization and Architecture for Tiny Language Models”
☆123Updated 9 months ago
inclusionAI / Ring
Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI, derived from Ling.
☆105Updated 2 months ago
imagination-research / sot
[ICLR 2024] Skeleton-of-Thought: Prompting LLMs for Efficient Parallel Generation
☆176Updated last year
InternLM / SWE-Fixer
☆121Updated 5 months ago
hetailang / SqueezeAttention
☆38Updated last year
timinar / BabyLlama
Training code for Baby-Llama, our submission to the strict-small track of the BabyLM challenge.
☆84Updated 2 years ago
JackZeng0208 / llama.cpp-android-tutorial
llama.cpp tutorial on Android phone
☆133Updated 5 months ago
sanyalsunny111 / LLM-Inheritune
This is the official repository for Inheritune.
☆115Updated 8 months ago
bigai-nlco / TokenSwift
[ICML 2025] TokenSwift: Lossless Acceleration of Ultra Long Sequence Generation
☆114Updated 5 months ago
dust-tt / llama-ssp
Experiments on speculative sampling with Llama models
☆126Updated 2 years ago
IsaacRe / vllm-kvcompress
KV cache compression for high-throughput LLM inference
☆141Updated 8 months ago
Tencent / AngelSlim
Model compression toolkit engineered for enhanced usability, comprehensiveness, and efficiency.
☆178Updated this week
GAIR-NLP / PC-Agent-E
Efficient Agent Training for Computer Use
☆130Updated last month
Zoeyyao27 / SirLLM
This repository contains the code for the paper: SirLLM: Streaming Infinite Retentive LLM
☆60Updated last year
Tencent / llm.hunyuan.T1
☆84Updated 6 months ago