vllm-project / vllm-metal
Community maintained hardware plugin for vLLM on Apple Silicon
☆217 Updated this week
Alternatives and similar repositories for vllm-metal
Users who are interested in vllm-metal are comparing it to the libraries listed below.
- Train Large Language Models on MLX. ☆240 Updated this week
- Verify Precision of all Kimi K2 API Vendor ☆494 Updated last week
- This is the documentation repository for SGLang. It is auto-generated from https://github.com/sgl-project/sglang ☆96 Updated this week
- The LLM abstraction layer for modern AI agent applications. ☆496 Updated this week
- Fast parallel LLM inference for MLX ☆241 Updated last year
- A pure MLX-based training pipeline for fine-tuning LLMs using GRPO on Apple Silicon. ☆226 Updated 2 months ago
- Verifiers for LLM Reinforcement Learning ☆80 Updated 4 months ago
- MLX-Embeddings is the best package for running Vision and Language Embedding models locally on your Mac using MLX. ☆246 Updated this week
- ☆236 Updated last month
- Distributed inference for MLX LLMs ☆100 Updated last year
- Benchmarks comparing PyTorch and MLX on Apple Silicon GPUs ☆92 Updated last week
- A modern web interface for managing and interacting with vLLM servers (www.github.com/vllm-project/vllm). Supports both GPU and CPU modes… ☆304 Updated this week
- KAN (Kolmogorov–Arnold Networks) in the MLX framework for Apple Silicon ☆31 Updated 6 months ago
- Harbor is a framework for running agent evaluations and creating and using RL environments. ☆381 Updated this week
- Super basic implementation (gist-like) of RLMs with REPL environments. ☆435 Updated last week
- frozen-in-time version of our Paper Finder agent for reproducing evaluation results ☆219 Updated 4 months ago
- ☆158 Updated 8 months ago
- A command-line interface tool for serving LLM using vLLM. ☆461 Updated last month
- 1.58 Bit LLM on Apple Silicon using MLX ☆237 Updated last year
- This repo maintains a 'cheat sheet' for LLMs that are undertrained on mlx ☆18 Updated 10 months ago
- Train embedding and reranker models for retrieval tasks on Apple Silicon with MLX ☆171 Updated 3 months ago
- Library for model distillation ☆160 Updated 4 months ago
- A simple MLX implementation for pretraining LLMs on Apple Silicon. ☆85 Updated 4 months ago
- Real-Time Detection of Hallucinated Entities in Long-Form Generation ☆275 Updated last month
- Ollama-like CLI tool for MLX models on Hugging Face (pull, rm, list, show, serve etc.) ☆122 Updated last week
- Fused Qwen3 MoE layer for faster training, compatible with HF Transformers, LoRA, 4-bit quant, Unsloth ☆223 Updated this week
- Simple examples using Argilla tools to build AI ☆57 Updated last year
- Simple & Scalable Pretraining for Neural Architecture Research ☆306 Updated last month
- ☆68 Updated 7 months ago
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera… ☆256 Updated this week