Community maintained hardware plugin for vLLM on Apple Silicon
☆1,261 · Jun 5, 2026 · Updated this week
Alternatives and similar repositories for vllm-metal
Users who are interested in vllm-metal are comparing it to the libraries listed below.
- ☆14 · May 26, 2026 · Updated 2 weeks ago
- OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous bat… ☆1,304 · May 31, 2026 · Updated last week
- MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically designed for Apple Silicon (M-series) chips. I… ☆721 · May 9, 2026 · Updated last month
- MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX. ☆4,920 · Updated this week
- Train Large Language Models on MLX. ☆374 · May 8, 2026 · Updated last month
- MLX Transformers is a library that provides model implementation in MLX. It uses a similar model interface as HuggingFace Transformers an… ☆77 · Mar 23, 2026 · Updated 2 months ago
- ollama like cli tool for MLX models on huggingface (pull, rm, list, show, serve etc.) ☆147 · May 20, 2026 · Updated 2 weeks ago
- A repo of useful MLX skills. ☆85 · Jan 25, 2026 · Updated 4 months ago
- ☆48 · Jan 3, 2026 · Updated 5 months ago
- Proof of concept for running moshi/hibiki using webrtc ☆21 · Feb 28, 2025 · Updated last year
- Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS (E2 TTS) in MLX ☆29 · Oct 15, 2024 · Updated last year
- MLX implementation of Hierarchical Reasoning Model (HRM) - Adaptive computation for complex reasoning tasks ☆28 · Aug 27, 2025 · Updated 9 months ago
- Run GEPA on your favorite non-python libraries. ☆35 · Jan 22, 2026 · Updated 4 months ago
- Decision orchestration and reconciliation for AI changes. ☆43 · Mar 30, 2026 · Updated 2 months ago
- An OpenAI API compatible FastAPI server that sits on top of the Anemll repo. Tested with Open WebUI. ☆21 · Jan 21, 2026 · Updated 4 months ago
- A high-performance API server that provides OpenAI-compatible endpoints for MLX models. Developed using Python and powered by the FastAPI… ☆339 · Jun 1, 2026 · Updated last week
- ☆15 · Dec 4, 2024 · Updated last year
- Simplified Data Management and Sharing for Kubernetes ☆18 · Jun 2, 2026 · Updated last week
- Index of Evangelion Clock designs ☆12 · Oct 15, 2023 · Updated 2 years ago
- A TOML provider for the Swift Configuration framework. ☆27 · Feb 1, 2026 · Updated 4 months ago
- A collection of optimizers for MLX ☆57 · Dec 12, 2025 · Updated 5 months ago
- Run LLMs with MLX ☆5,541 · Updated this week
- Benchmark of Apple MLX operations on all Apple Silicon chips (GPU, CPU) + MPS and CUDA. ☆230 · Apr 6, 2026 · Updated 2 months ago
- SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework. ☆285 · Jun 16, 2025 · Updated 11 months ago
- High performance GPT-OSS MLX implementation ☆39 · Aug 6, 2025 · Updated 10 months ago
- Following the same workflows as Kubernetes. Widely used in InftyAI community. ☆13 · May 31, 2026 · Updated last week
- A lightweight library for working with incomplete or streaming JSON in Swift. ☆36 · Jul 24, 2025 · Updated 10 months ago
- MLX: An array framework for Apple silicon ☆26,600 · Updated this week
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ☆23 · Mar 12, 2024 · Updated 2 years ago
- Measuring Thinking Efficiency in Reasoning Models - Research Repository ☆40 · Dec 2, 2025 · Updated 6 months ago
- Shadowbox : A modern no-code AI instrument. UI thin client component. ☆18 · Jan 8, 2026 · Updated 5 months ago
- 360M model running in the browser on WebGPU ☆23 · Aug 20, 2024 · Updated last year
- A toy text-to-image model trained from scratch. ☆19 · Jun 9, 2025 · Updated last year
- A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speec… ☆7,214 · Updated this week
- Community maintained hardware plugin for vLLM on AWS Neuron ☆31 · May 28, 2026 · Updated last week
- ☆17 · Apr 11, 2024 · Updated 2 years ago
- MCP servers for Apple apps (Notes, etc.) on macOS via AppleScript ☆103 · Mar 17, 2026 · Updated 2 months ago
- A pure MLX-based training pipeline for fine-tuning LLMs using GRPO on Apple Silicon. ☆242 · Oct 28, 2025 · Updated 7 months ago
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model ☆45 · Oct 1, 2025 · Updated 8 months ago