vllm-project / vllm-metal
View external linksLinks

Community maintained hardware plugin for vLLM on Apple Silicon

☆447

Alternatives and similar repositories for vllm-metal

Users that are interested in vllm-metal are comparing it to the libraries listed below

Sorting:

huggingface / feel
View on GitHub
☆14Jun 25, 2025Updated 7 months ago
cubist38 / mlx-openai-server
View on GitHub
A high-performance API server that provides OpenAI-compatible endpoints for MLX models. Developed using Python and powered by the FastAPI…
☆217Updated this week
ToluClassics / mlx-transformers
View on GitHub
MLX Transformers is a library that provides model implementation in MLX. It uses a similar model interface as HuggingFace Transformers an…
☆73Nov 19, 2024Updated last year
modaic-ai / gepa-rpc
View on GitHub
Run GEPA on your favorite non-python libraries.
☆32Jan 22, 2026Updated 3 weeks ago
geronimi73 / Hana
View on GitHub
A toy text-to-image model trained from scratch.
☆19Jun 9, 2025Updated 8 months ago
netsuitephp / netsuite-laravel
View on GitHub
A laravel service provider for the netsuite-php library service
☆11Sep 10, 2025Updated 5 months ago
cg123 / rathe
View on GitHub
Tools for formatting large language model prompts.
☆13Dec 19, 2023Updated 2 years ago
aiming-lab / MIRA
View on GitHub
When Visualizing is the First Step to Reasoning: MIRA, a Benchmark for Visual Chain-of-Thought
☆26Nov 6, 2025Updated 3 months ago
madroidmaq / mlx-omni-server
View on GitHub
MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically designed for Apple Silicon (M-series) chips. I…
☆662Dec 21, 2025Updated last month
kyutai-labs / jax-flash-attn3
View on GitHub
JAX bindings for the flash-attention3 kernels
☆20Jan 2, 2026Updated last month
vinhowe / piston
View on GitHub
Train small sequence models in your browser with WebGPU.
☆32Dec 3, 2025Updated 2 months ago
DaertML / context_distillation
View on GitHub
Framework to achieve context distillation in LLMs
☆15Nov 24, 2023Updated 2 years ago
OthersideAI / vllm
View on GitHub
A high-throughput and memory-efficient inference and serving engine for LLMs
☆13Nov 27, 2023Updated 2 years ago
TIGER-AI-Lab / VideoEval-Pro
View on GitHub
More reliable Video Understanding Evaluation
☆14Sep 23, 2025Updated 4 months ago
batson / mlux
View on GitHub
☆48Jan 3, 2026Updated last month
Blaizzy / mlx-embeddings
View on GitHub
MLX-Embeddings is the best package for running Vision and Language Embedding models locally on your Mac using MLX.
☆273Feb 9, 2026Updated last week
TransferQueue / TransferQueue
View on GitHub
[Archived] For the latest updates and community contribution, please visit: https://github.com/Ascend/TransferQueue or https://gitcode.co…
☆13Jan 16, 2026Updated last month
FanHansen / creditmodel
View on GitHub
creditmodel, 模型，评分卡，scorecard, vintage, automatic modeling
☆11Aug 10, 2024Updated last year
alexgusevski / anemll-server
View on GitHub
An OpenAI API compatible FastAPI server that sits on top of the Anemll repo. Tested with Open WebUI.
☆18Jan 21, 2026Updated 3 weeks ago
N8python / binary-vectors-mlx
View on GitHub
MLX binary vectors and associated algorithms.
☆14Mar 13, 2025Updated 11 months ago
FoundationAgents / VR-Bench
View on GitHub
We introduce Reasoning via Video, a new paradigm that uses maze-solving video generation to probe multimodal reasoning; our VR-Bench show…
☆50Feb 4, 2026Updated last week
jeroen / maketools
View on GitHub
Exploring and Testing your Toolchain Configuration and System Packages
☆14Jan 25, 2025Updated last year
bentoml / sentence-embedding-bento
View on GitHub
Sentence Embedding as a Service
☆15Jun 30, 2025Updated 7 months ago
chenwanqq / candle-llava
View on GitHub
implement llava using candle
☆15Jun 9, 2024Updated last year
pappitti / modernbert-mlx
View on GitHub
Implementation of ModernBERT in MLX
☆20Jan 7, 2026Updated last month
willccbb / mlx_parallm
View on GitHub
Fast parallel LLM inference for MLX
☆247Jul 7, 2024Updated last year
SciPhi-AI / RAG-Performance
View on GitHub
Measuring RAG solutions throughput and latency
☆19Jul 23, 2024Updated last year
the-y-company / many
View on GitHub
Create R packages from many directories
☆18Mar 30, 2024Updated last year
MathDL / MathDLBookCode
View on GitHub
This website contains the python code accompanying the book "Mathematical Foundations of Deep Learning Models and Algorithms" by Konstant…
☆46Nov 24, 2025Updated 2 months ago
sunsetcoder / flightradar24-mcp-server
View on GitHub
Model Context Protocol server for Flight Tracking
☆45Jan 29, 2025Updated last year
transformerlab / transformerlab-api
View on GitHub
API Server for Transformer Lab
☆83Nov 20, 2025Updated 2 months ago
vllm-project / vllm-nccl
View on GitHub
Manages vllm-nccl dependency
☆17Jun 3, 2024Updated last year
OthersideAI / tinyGPT
View on GitHub
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
☆19Apr 26, 2022Updated 3 years ago
arthurcolle / openharmony-mlx
View on GitHub
High performance GPT-OSS MLX implementation
☆32Aug 6, 2025Updated 6 months ago
r-three / RAD
View on GitHub
Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model
☆45Oct 1, 2025Updated 4 months ago
distribute-dev / modelplexer
View on GitHub
Multiplexer for AI protocols, agents, models, and tools
☆24Aug 8, 2025Updated 6 months ago
FoundationAgents / AutoEnv
View on GitHub
Scaling Agentic Environments Automatically.
☆49Jan 22, 2026Updated 3 weeks ago
doomslide / autoloom
View on GitHub
Approximating the joint distribution of language models via MCTS
☆22Nov 3, 2024Updated last year
DevoidSloth / claude-tools
View on GitHub
A simple way to import claude artifacts locally
☆18Jul 12, 2025Updated 7 months ago

vllm-project / vllm-metalView external linksLinks

Alternatives and similar repositories for vllm-metal

vllm-project / vllm-metal
View external linksLinks