mzbac / mlx-llm-server
For inferring and serving local LLMs using the MLX framework
☆89 · Updated 7 months ago
Related projects
Alternatives and complementary repositories for mlx-llm-server
- A python package for serving LLM on OpenAI-compatible API endpoints with prompt caching using MLX. ☆55 · Updated last week
- SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework. ☆226 · Updated this week
- Generate train.jsonl and valid.jsonl files to use for fine-tuning Mistral and other LLMs. ☆77 · Updated 9 months ago
- Fast parallel LLM inference for MLX ☆149 · Updated 4 months ago
- Port of Suno's Bark TTS transformer in Apple's MLX Framework ☆71 · Updated 9 months ago
- A simple UI / Web / Frontend for MLX mlx-lm using Streamlit. ☆227 · Updated last month
- A fast batching API to serve LLM models ☆172 · Updated 6 months ago
- MLX-Embeddings is the best package for running Vision and Language Embedding models locally on your Mac using MLX. ☆77 · Updated last month
- FastMLX is a high performance production ready API to host MLX models. ☆218 · Updated 3 weeks ago
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI ☆221 · Updated 6 months ago
- ☆38 · Updated 8 months ago
- A simple Jupyter Notebook for learning MLX text-completion fine-tuning! ☆91 · Updated last week
- Start a server from the MLX library. ☆161 · Updated 3 months ago
- Very basic framework for parameterized large language model (Q)LoRa fine-tuning using mlx, mlx_lm, and OgbujiPT. Architecture for system… ☆35 · Updated last week
- Scripts to create your own moe models using mlx ☆86 · Updated 8 months ago
- ☆112 · Updated this week
- ☆149 · Updated 4 months ago
- run embeddings in MLX ☆73 · Updated last month
- Easily view and modify JSON datasets for large language models ☆62 · Updated last month
- ☆104 · Updated 8 months ago
- Distributed Inference for mlx LLm ☆70 · Updated 3 months ago
- AI API implementation for Mac which supports tool-calling & other structured LLM response generation (e.g. conform to JSON schema) ☆93 · Updated 2 weeks ago
- ☆64 · Updated 5 months ago
- ☆99 · Updated 3 months ago
- MLX Swift implementation of Andrej Karpathy's Let's build GPT video ☆54 · Updated 7 months ago
- Simple examples using Argilla tools to build AI ☆40 · Updated this week
- Minimal, clean code implementation of RAG with mlx using gguf model weights ☆43 · Updated 6 months ago
- A stable, fast and easy-to-use inference library with a focus on a sync-to-async API ☆45 · Updated last month
- Explore a simple example of utilizing MLX for RAG application running locally on your Apple Silicon device. ☆145 · Updated 9 months ago
- Phi-3.5 for Mac: Locally-run Vision and Language Models for Apple Silicon ☆237 · Updated 2 months ago