mzbac / mlx-llm-server

For inferring and serving local LLMs using the MLX framework

☆109

Alternatives and similar repositories for mlx-llm-server

Users who are interested in mlx-llm-server are comparing it to the repositories listed below.

armbues / SiLLM
SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework.
☆278 Updated 4 months ago
da-z / mlx-ui
A simple UI / Web / Frontend for MLX mlx-lm using Streamlit.
☆261 Updated 4 months ago
mark-lord / MLX-text-completion-notebook
A simple Jupyter Notebook for learning MLX text-completion fine-tuning!
☆122 Updated 11 months ago
nath1295 / MLX-Textgen
A Python package for serving LLMs on OpenAI-compatible API endpoints with prompt caching using MLX.
☆97 Updated 3 months ago
chimezie / mlx-tuning-fork
Very basic framework for composable, parameterized large language model (Q)LoRA / (Q)DoRA fine-tuning using mlx, mlx_lm, and OgbujiPT.
☆42 Updated 4 months ago
mustafaaljadery / mlxserver
Start a server from the MLX library.
☆192 Updated last year
Blaizzy / mlx-embeddings
MLX-Embeddings is the best package for running Vision and Language Embedding models locally on your Mac using MLX.
☆215 Updated last month
OoriData / Toolio
GenAI & agent toolkit for Apple Silicon Mac, implementing JSON schema-steered structured output (3SO) and tool-calling in Python. For mor…
☆128 Updated last month
apeatling / simple-guide-to-mlx-finetuning
Generate train.jsonl and valid.jsonl files to use for fine-tuning Mistral and other LLMs.
☆96 Updated last year
arcee-ai / fastmlx
FastMLX is a high-performance, production-ready API to host MLX models.
☆332 Updated 7 months ago
mzbac / mlx_sharding
Distributed inference for MLX LLMs
☆97 Updated last year
JosefAlbers / Phi-3-Vision-MLX
Phi-3.5 for Mac: Locally-run Vision and Language Models for Apple Silicon
☆273 Updated last year
vegaluisjose / mlx-rag
Explore a simple example of using MLX for a RAG application running locally on your Apple Silicon device.
☆177 Updated last year
Jaykef / mlx-rag-gguf
Minimal, clean-code implementation of RAG with MLX using GGUF model weights
☆52 Updated last year
taylorai / mlx_embedding_models
Run embeddings in MLX
☆94 Updated last year
willccbb / mlx_parallm
Fast parallel LLM inference for MLX
☆223 Updated last year
mzbac / mlx-lora
☆38 Updated last year
Goekdeniz-Guelmez / mlx-lm-lora
Train Large Language Models on MLX.
☆196 Updated 3 weeks ago
ivanfioravanti / autogram
Grammar checker with a keyboard shortcut for Ollama and Apple MLX with Automator on macOS.
☆82 Updated last year
otriscon / llm-structured-output
☆89 Updated 9 months ago
mzau / mlx-knife
Ollama-like CLI tool for MLX models on Hugging Face (pull, rm, list, show, serve, etc.)
☆108 Updated last week
epolewski / EricLLM
A fast batching API to serve LLMs
☆188 Updated last year
j-csc / mlx_bark
Port of Suno's Bark TTS transformer in Apple's MLX Framework
☆84 Updated last year
antranapp / awesome-mlx
☆189 Updated 7 months ago
mlx-chat / mlx-chat-app
Chat with MLX is a high-performance macOS application that connects your local documents to a personalized large language model (LLM).
☆175 Updated last year
teknium1 / ShareGPT-Builder
☆116 Updated 10 months ago
mzbac / mlx-chat-ui
Hugging Face chat-ui integration with the mlx-lm server
☆61 Updated last year
adrienbrault / hf-gguf-to-ollama
Dagger functions to import Hugging Face GGUF models into a local ollama instance and optionally push them to ollama.com.
☆118 Updated last year
mzbac / mlx-moe
Scripts to create your own MoE models using MLX
☆90 Updated last year
PicoMLX / PicoMLXServer
The easiest way to run the fastest MLX-based LLMs locally
☆304 Updated 11 months ago