OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX backend, 400+ tok/s. Works with Claude Code.
☆1,160 · May 14, 2026 · Updated this week
Alternatives and similar repositories for vllm-mlx
Users that are interested in vllm-mlx are comparing it to the libraries listed below.
- A high-performance API server that provides OpenAI-compatible endpoints for MLX models. Developed using Python and powered by the FastAPI… ☆337 · Updated this week
- Community maintained hardware plugin for vLLM on Apple Silicon ☆1,130 · May 13, 2026 · Updated last week
- Agentic BYOK Browser-Based Website Builder ☆44 · May 12, 2026 · Updated last week
- This repo maintains a 'cheat sheet' for LLMs that are undertrained on mlx ☆33 · Mar 12, 2026 · Updated 2 months ago
- Voxel-based Editor ☆13 · Jul 11, 2018 · Updated 7 years ago
- Train Large Language Models on MLX. ☆370 · May 8, 2026 · Updated last week
- MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX. ☆4,741 · Updated this week
- Load and run Llama from safetensors files in C ☆15 · Oct 24, 2024 · Updated last year
- Tiny evaluation of leading LLMs on competitive programming problems ☆14 · Apr 10, 2026 · Updated last month
- General-purpose planning and execution harness for LLMs: structured phases, critique, gating, and review ☆66 · Updated this week
- Context Query language for Agents ☆62 · Apr 13, 2026 · Updated last month
- AI agent platform for building multi-agent systems with orchestration, memory, RAG, workflows, and enterprise observability. ☆35 · Oct 27, 2025 · Updated 6 months ago
- MLX native implementations of state-of-the-art generative image models ☆2,063 · Apr 10, 2026 · Updated last month
- MLX Studio - Home of JANG_Q - Image Gen/Edit + Chat/Code All in one - + OpenClaw (Anthropic API) ☆721 · Updated this week
- ☆63 · Nov 23, 2025 · Updated 5 months ago
- 🚀 SuperMCP - Create multiple isolated MCP servers using a single connector. Build powerful Model Context Protocol integrations for datab… ☆57 · Jan 26, 2026 · Updated 3 months ago
- ☆20 · Oct 25, 2025 · Updated 6 months ago
- Crosshair guidelines for ComfyUI to help align nodes and groups while moving or resizing. ☆34 · Apr 28, 2026 · Updated 3 weeks ago
- [ICLR 2026] RPG: KL-Regularized Policy Gradient (https://arxiv.org/abs/2505.17508) ☆75 · May 13, 2026 · Updated last week
- ☆15 · Feb 23, 2026 · Updated 2 months ago
- Run LLMs with MLX ☆5,298 · May 13, 2026 · Updated last week
- PolyCouncil is an open-source multi-model deliberation engine for LM Studio. It runs multiple LLMs in parallel, gathers their answers, sc… ☆33 · Mar 24, 2026 · Updated last month
- Lossless DFlash speculative decoding for MLX on Apple Silicon ☆680 · May 12, 2026 · Updated last week
- MLX-GUI MLX Inference Server for Apple Silicon ☆210 · Apr 1, 2026 · Updated last month
- A simple, observable code-writing agent builder in TypeScript. ☆33 · Apr 9, 2025 · Updated last year
- FastMLX is a high performance production ready API to host MLX models. ☆25 · Nov 18, 2024 · Updated last year
- A modern Swift Package for controlling macOS media playback (play/pause, track info). Bypasses sandbox restrictions by bridging through t… ☆42 · Apr 17, 2026 · Updated last month
- LLM inference server with continuous batching & SSD caching for Apple Silicon, managed from the macOS menu bar ☆14,244 · Updated this week
- MLX Implementation of Recursive Reasoning with Tiny Networks ☆78 · Oct 11, 2025 · Updated 7 months ago
- Artificial Neural Engine Machine Learning Library ☆1,599 · Mar 10, 2026 · Updated 2 months ago
- ☆71 · Feb 13, 2026 · Updated 3 months ago
- ☆21 · Oct 9, 2024 · Updated last year
- Implementation of ModernBERT in MLX ☆21 · Jan 7, 2026 · Updated 4 months ago
- An add-on to easily import shots, with their corresponding tracking data and LiDAR scans recorded with the Omniscient iOS app, into Blen… ☆15 · Nov 17, 2025 · Updated 6 months ago
- AudiosPlugin is a Godot iOS Audio Plugin that resolves the audio recording issue in iOS for Godot Engine. ☆10 · Jun 16, 2025 · Updated 11 months ago
- A collection of Summoner clients and agents featuring example implementations and reusable templates ☆24 · Feb 19, 2026 · Updated 3 months ago
- vibevoice real time 0.5B swift port ☆29 · Dec 12, 2025 · Updated 5 months ago
- FastMLX is a high performance production ready API to host MLX models. ☆356 · Mar 18, 2025 · Updated last year
- Diffusion Pipe for Windows For ComfyUI ☆28 · Jan 20, 2026 · Updated 4 months ago