Community maintained hardware plugin for vLLM on Apple Silicon
☆1,045 · Apr 29, 2026 · Updated this week
Alternatives and similar repositories for vllm-metal
Users that are interested in vllm-metal are comparing it to the libraries listed below.
- ☆14 · Apr 8, 2026 · Updated 3 weeks ago
- OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous bat… ☆1,039 · Updated this week
- MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX. ☆4,523 · Updated this week
- Train Large Language Models on MLX. ☆293 · Updated this week
- Windup Quickstarts ☆11 · Aug 7, 2024 · Updated last year
- MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically designed for Apple Silicon (M-series) chips. I… ☆708 · Mar 10, 2026 · Updated last month
- MLX Transformers is a library that provides model implementation in MLX. It uses a similar model interface as HuggingFace Transformers an… ☆76 · Mar 23, 2026 · Updated last month
- This Network-graph based literature review tool uses the open-source version of Neo4j with Jupyter Notebooks written in Python to import … ☆14 · Oct 30, 2023 · Updated 2 years ago
- ☆48 · Jan 3, 2026 · Updated 3 months ago
- Proof of concept for running moshi/hibiki using webrtc ☆21 · Feb 28, 2025 · Updated last year
- ☆10 · May 26, 2025 · Updated 11 months ago
- Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS (E2 TTS) in MLX ☆29 · Oct 15, 2024 · Updated last year
- Run GEPA on your favorite non-python libraries. ☆35 · Jan 22, 2026 · Updated 3 months ago
- EXO Gym is an open-source Python toolkit that facilitates distributed AI research. ☆105 · Dec 1, 2025 · Updated 4 months ago
- A high-performance API server that provides OpenAI-compatible endpoints for MLX models. Developed using Python and powered by the FastAPI… ☆321 · Updated this week
- An OpenAI API compatible FastAPI server that sits on top of the Anemll repo. Tested with Open WebUI. ☆20 · Jan 21, 2026 · Updated 3 months ago
- ☆15 · Dec 4, 2024 · Updated last year
- Train and run transformers directly on Apple's Neural Engine in Swift, bypassing Core ML entirely ☆99 · Apr 18, 2026 · Updated last week
- Exploring and Testing your Toolchain Configuration and System Packages ☆14 · Jan 25, 2025 · Updated last year
- Simplified Data Management and Sharing for Kubernetes ☆18 · Apr 23, 2026 · Updated last week
- A TOML provider for the Swift Configuration framework. ☆24 · Feb 1, 2026 · Updated 2 months ago
- An OpenAI-compatible API that integrates LLM, Embedding, and Reranker. ☆18 · Aug 21, 2025 · Updated 8 months ago
- An extension library to Candle that provides PyTorch functions not currently available in Candle ☆42 · Mar 15, 2024 · Updated 2 years ago
- A collection of optimizers for MLX ☆57 · Dec 12, 2025 · Updated 4 months ago
- Benchmark of Apple MLX operations on all Apple Silicon chips (GPU, CPU) + MPS and CUDA. ☆225 · Apr 6, 2026 · Updated 3 weeks ago
- SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework. ☆286 · Jun 16, 2025 · Updated 10 months ago
- Following the same workflows as Kubernetes. Widely used in InftyAI community. ☆13 · Dec 5, 2025 · Updated 4 months ago
- Community maintained hardware plugin for vLLM on AWS Neuron ☆29 · Mar 20, 2026 · Updated last month
- A lightweight library for working with incomplete or streaming JSON in Swift. ☆36 · Jul 24, 2025 · Updated 9 months ago
- Run LLMs with MLX ☆4,948 · Updated this week
- Measuring Thinking Efficiency in Reasoning Models - Research Repository ☆39 · Dec 2, 2025 · Updated 4 months ago
- Shadowbox: A modern no-code AI instrument. UI thin client component. ☆18 · Jan 8, 2026 · Updated 3 months ago
- MLX: An array framework for Apple silicon ☆25,814 · Updated this week
- Tackle Pathfinder application ☆18 · Oct 21, 2023 · Updated 2 years ago
- FlashTile is a CUDA Tile IR compiler that is compatible with NVIDIA's tileiras, targeting SM70 through SM121 NVIDIA GPUs. ☆60 · Feb 6, 2026 · Updated 2 months ago
- MLX-Embeddings is the best package for running Vision and Language Embedding models locally on your Mac using MLX. ☆355 · Updated this week
- Open Model Engine (OME) — Kubernetes operator for LLM serving, GPU scheduling, and model lifecycle management. Works with SGLang, vLLM, T… ☆435 · Updated this week
- 360M model running in the browser on WebGPU ☆23 · Aug 20, 2024 · Updated last year
- Semantic memory system for Claude Code - provides persistent conversation memory through vector search of session summaries ☆32 · Jul 21, 2025 · Updated 9 months ago