UbiquitousLearning / PhoneLM
☆57Updated 6 months ago
Alternatives and similar repositories for PhoneLM
Users who are interested in PhoneLM are comparing it to the repositories listed below.
- ☆95Updated 8 months ago
- LlamaTouch: A Faithful and Scalable Testbed for Mobile UI Task Automation☆60Updated 9 months ago
- [ICLR2025] Breaking Throughput-Latency Trade-off for Long Sequences with Speculative Decoding☆116Updated 6 months ago
- [EMNLP Findings 2024] MobileQuant: Mobile-friendly Quantization for On-device Language Models☆63Updated 8 months ago
- Lightweight toolkit package to train and fine-tune 1.58bit Language models☆69Updated 2 weeks ago
- High-speed and easy-use LLM serving framework for local deployment☆109Updated 2 months ago
- KV cache compression for high-throughput LLM inference☆129Updated 4 months ago
- [ICML'24] The official implementation of “Rethinking Optimization and Architecture for Tiny Language Models”☆121Updated 4 months ago
- Simple extension on vLLM to help you speed up reasoning model without training.☆158Updated this week
- ☆214Updated 3 weeks ago
- Awesome Mobile LLMs☆199Updated this week
- ☆48Updated 10 months ago
- ☆33Updated last month
- FuseAI Project☆87Updated 4 months ago
- GPT-4 Level Conversational QA Trained In a Few Hours☆61Updated 9 months ago
- ☆37Updated 7 months ago
- Data preparation code for CrystalCoder 7B LLM☆44Updated last year
- Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main)☆106Updated 2 months ago
- [NeurIPS 24 Spotlight] MaskLLM: Learnable Semi-structured Sparsity for Large Language Models☆166Updated 5 months ago
- [ICLR 2024] Skeleton-of-Thought: Prompting LLMs for Efficient Parallel Generation☆170Updated last year
- EE-LLM is a framework for large-scale training and inference of early-exit (EE) large language models (LLMs).☆63Updated 11 months ago
- [ACL 2025 Main] EfficientQAT: Efficient Quantization-Aware Training for Large Language Models☆272Updated 2 weeks ago
- A Stream-based LLM Agent Framework for Continuous Context Sensing and Sharing☆38Updated 6 months ago
- llama.cpp tutorial on Android phone☆102Updated last month
- Official Repository for Task-Circuit Quantization☆20Updated this week
- ☆72Updated last month
- ☆36Updated 2 years ago
- ☆197Updated 6 months ago
- The official repo for "LLoCo: Learning Long Contexts Offline"☆116Updated 11 months ago
- My fork of Allen AI's OLMo for educational purposes.☆30Updated 6 months ago