raullenchai/Rapid-MLX

The fastest local AI engine for Apple Silicon. 4.2x faster than Ollama, 0.08s cached TTFT, 100% tool calling. 17 tool parsers, prompt cache, reasoning separation, cloud routing. Drop-in OpenAI replacement. Works with Claude Code, Cursor, Aider.

☆3,289

Alternatives and similar repositories for Rapid-MLX

Users who are interested in Rapid-MLX are comparing it to the libraries listed below.

youssofal / MTPLX
3x decode TPS increase On Qwen 3.6 27B @ temp 0.6 | Native MTP Speculative Decoding On Apple Silicon With No External Drafter.
☆1,056 · Updated this week
jundot / omlx
LLM inference server with continuous batching & SSD caching for Apple Silicon — managed from the macOS menu bar
☆17,950 · Updated this week
waybarrios / vllm-mlx
OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous bat…
☆1,446 · Jun 28, 2026 · Updated 3 weeks ago
bstnxbt / dflash-mlx
Lossless DFlash speculative decoding for MLX on Apple Silicon
☆752 · Jun 11, 2026 · Updated last month
samuelfaj / lightning-mlx
🔥 The fastest local AI engine for Apple Silicon. Optimised for agentic use.
☆82 · May 24, 2026 · Updated last month
jjang-ai / mlxstudio
MLX Studio - Home of JANG_Q - Image Gen/Edit + Chat/Code All in one - + OpenClaw (Anthropic API)
☆908 · Updated this week
Aryagm / dflash-mlx
Exact speculative decoding on Apple Silicon, powered by MLX.
☆380 · Apr 20, 2026 · Updated 3 months ago
osaurus-ai / osaurus
Own your AI. The native macOS harness for AI agents -- any model, persistent memory, autonomous execution, cryptographic identity. Built …
☆7,234 · Updated this week
jjang-ai / vmlx
vMLX - JANGTQ Uber Compressed MLX Models - L2 Disk Cache (survives restart) + L1 Paged (super fast ttft) + Hybrid SSM Scheduler + Cont B…
☆774 · Updated this week
SharpAI / SwiftLM
⚡ Native MLX Swift LLM inference server for Apple Silicon. OpenAI-compatible API, SSD streaming for 100B+ MoE models, TurboQuant KV cache…
☆722 · May 19, 2026 · Updated 2 months ago
ARahim3 / mlx-tune
Fine-tune LLMs on your Mac with Apple Silicon. SFT, DPO, GRPO, Vision, TTS, STT, Embedding, and OCR fine-tuning — natively on MLX. Unslot…
☆1,363 · Jun 23, 2026 · Updated 3 weeks ago
Blaizzy / mlx-vlm
MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.
☆5,175 · Updated this week
ml-explore / mlx-lm
Run LLMs with MLX
☆6,335 · Jul 11, 2026 · Updated last week
antirez / ds4
DeepSeek 4 Flash and PRO local inference engine for Metal, CUDA and ROCm
☆18,842 · Jul 3, 2026 · Updated 2 weeks ago
Michael-A-Kuykendall / shimmy
⚡ Pure-Rust WebGPU inference engine — OpenAI-API compatible, GGUF native, runs on any GPU. No Python. No llama.cpp. Single binary.
☆5,647 · Jun 30, 2026 · Updated 2 weeks ago
z-lab / dflash
DFlash: Block Diffusion for Flash Speculative Decoding
☆5,496 · May 10, 2026 · Updated 2 months ago
h4ckf0r0day / obscura
The headless browser for AI agents and web scraping
☆19,433 · Updated this week
apple / container
A tool for creating and running Linux containers using lightweight virtual machines on a Mac. It is written in Swift, and optimized for A…
☆48,021 · Updated this week
nexu-io / open-design
🎨 The open-source Claude Design alternative. 🖥️ Local-first desktop app. 🖼️ Your coding agent becomes the design engine: prototypes, l…
☆79,670 · Updated this week
NousResearch / hermes-agent
The agent that grows with you
☆217,147 · Updated this week
addyosmani / agent-skills
Production-grade engineering skills for AI coding agents.
☆79,265 · Updated this week
multica-ai / multica
The open-source managed agents platform. Turn coding agents into real teammates — assign tasks, track progress, compound skills.
☆41,053 · Updated this week
trycua / cua
Scale computer-use 2.0 with open-source drivers, cross-OS fleets, and benchmarks for training, evaluation, and data generation.
☆20,237 · Updated this week
D4Vinci / Scrapling
🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!
☆70,136 · Updated this week
1weiho / open-slide
A slide framework built for agents.
☆5,875 · Updated this week
tw93 / Mole
🐹 Clean, uninstall, analyze, optimize, and monitor your Mac from the terminal.
☆59,337 · Updated this week
rohitg00 / agentmemory
#1 Persistent memory for AI coding agents based on real-world benchmarks
☆25,354 · Updated this week
microsoft / VibeVoice
Open-Source Frontier Voice AI
☆50,143 · Updated this week
opendataloader-project / opendataloader-pdf
PDF Parser for AI-ready data. Automate PDF accessibility. Open-source.
☆27,475 · Updated this week
Blaizzy / mlx-audio
A text-to-speech (TTS), speech-to-text (STT) and speech-to-speech (STS) library built on Apple's MLX framework, providing efficient speec…
☆7,577 · Jul 10, 2026 · Updated last week
garrytan / gbrain
Garry's Opinionated OpenClaw/Hermes Agent Brain
☆26,603 · Updated this week
OpenBMB / VoxCPM
VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice Design, and True-to-Life Cloning
☆33,775 · Jul 8, 2026 · Updated last week
refactoringhq / tolaria
Desktop app to manage markdown knowledge bases
☆18,740 · Updated this week
tinyhumansai / openhuman
Your Personal AI super intelligence. A brain that builds a local-first memory of your life, a fantastic orchestrator of agent fleets and …
☆35,080 · Updated this week
Egonex-AI / Understand-Anything
Graphs that teach > graphs that impress. Turn any code into an interactive knowledge graph you can explore, search, and ask questions abo…
☆75,148 · Updated this week
earendil-works / pi
AI agent toolkit: unified LLM API, agent loop, TUI, coding agent CLI
☆72,762 · Updated this week
manaflow-ai / cmux
Open source Ghostty-based macOS terminal with vertical tabs and notifications for AI coding agents. Built for multitasking, organization,…
☆24,779 · Updated this week
Thysrael / Horizon
📡 Your own AI-powered news radar. Generates daily briefings in English & Chinese. | 用 AI 构建你专属的新闻雷达
☆8,279 · Updated this week
rtk-ai / rtk
CLI proxy that reduces LLM token consumption by 60-90% on common dev commands. Single Rust binary, zero dependencies
☆71,836 · Updated this week