google-ai-edge / LiteRT-LM
☆617 · Updated this week
Alternatives and similar repositories for LiteRT-LM
Users interested in LiteRT-LM are comparing it to the libraries listed below.
- LiteRT, successor to TensorFlow Lite, is Google's on-device framework for high-performance ML & GenAI deployment on edge platforms, via e… ☆1,163 · Updated this week
- ☆716 · Updated 3 weeks ago
- Train Large Language Models on MLX. ☆236 · Updated 2 weeks ago
- Sparse inference for transformer-based LLMs ☆215 · Updated 4 months ago
- Welcome to the official repository of SINQ! A novel, fast and high-quality quantization method designed to make any Large Language Model … ☆585 · Updated this week
- Inference, fine-tuning, and many more recipes with the Gemma family of models ☆276 · Updated 5 months ago
- A command-line interface tool for serving LLMs using vLLM. ☆456 · Updated 3 weeks ago
- ☆426 · Updated 3 weeks ago
- ☆165 · Updated last week
- FastMLX is a high-performance, production-ready API to host MLX models. ☆339 · Updated 9 months ago
- WebAssembly binding for llama.cpp - Enabling on-browser LLM inference ☆960 · Updated last week
- ☆153 · Updated 3 weeks ago
- ☆301 · Updated 4 months ago
- Fast Streaming TTS with Orpheus + WebRTC (with FastRTC) ☆346 · Updated 8 months ago
- Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B ☆553 · Updated last month
- Docs for GGUF quantization (unofficial) ☆340 · Updated 5 months ago
- Examples, end-to-end tutorials and apps built using Liquid AI Foundational Models (LFM) and the LEAP SDK ☆654 · Updated this week
- Liquid Audio - Speech-to-Speech audio models by Liquid AI ☆304 · Updated 2 months ago
- 1.58 Bit LLM on Apple Silicon using MLX ☆230 · Updated last year
- Code to accompany the Universal Deep Research paper (https://arxiv.org/abs/2509.00244) ☆451 · Updated 4 months ago
- MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically designed for Apple Silicon (M-series) chips. I… ☆625 · Updated last week
- Lemonade helps users discover and run local AI apps by serving optimized LLMs right from their own GPUs and NPUs. Join our discord: https… ☆1,920 · Updated this week
- Qwen Image models through MPS ☆246 · Updated this week
- Advanced quantization toolkit for LLMs and VLMs. Support for WOQ, MXFP4, NVFP4, GGUF, Adaptive Schemes and seamless integration with Tra… ☆775 · Updated this week
- An implementation of the CSM (Conversation Speech Model) for Apple Silicon using MLX. ☆391 · Updated 4 months ago
- Verify the precision of all Kimi K2 API vendors ☆487 · Updated last month
- Big & Small LLMs working together ☆1,230 · Updated last week
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆202 · Updated 3 months ago
- No-code CLI designed for accelerating ONNX workflows ☆222 · Updated 6 months ago
- A pure MLX-based training pipeline for fine-tuning LLMs using GRPO on Apple Silicon. ☆222 · Updated 2 months ago