mzau / mlx-knife
Ollama-like CLI tool for MLX models on Hugging Face (pull, rm, list, show, serve, etc.)
☆126 · Updated last week
Alternatives and similar repositories for mlx-knife
Users interested in mlx-knife are comparing it to the libraries listed below.
- MLX-GUI MLX Inference Server for Apple Silicon ☆176 · Updated 2 weeks ago
- Instant Perfect Native MacOS Transcription ☆50 · Updated 6 months ago
- A command-line utility to manage MLX models between your Hugging Face cache and LM Studio. ☆77 · Updated 2 months ago
- Train Large Language Models on MLX. ☆245 · Updated this week
- Powerful and fast tool-calling agents ☆80 · Updated 10 months ago
- MLX-Embeddings is the best package for running Vision and Language Embedding models locally on your Mac using MLX. ☆263 · Updated 2 weeks ago
- GenAI & agent toolkit for Apple Silicon Mac, implementing JSON schema-steered structured output (3SO) and tool-calling in Python. For mor… ☆132 · Updated last month
- This repo maintains a 'cheat sheet' for LLMs that are undertrained on mlx ☆18 · Updated 10 months ago
- A Python package for serving LLMs on OpenAI-compatible API endpoints with prompt caching using MLX. ☆100 · Updated 7 months ago
- ☆107 · Updated 3 months ago
- Thoughtful Lightning AI Assistant - Dual-engine system with DeepSeek reasoning and Groq inference, featuring Gradio UI, secure API manage… ☆20 · Updated last year
- ☆79 · Updated last year
- ☆85 · Updated 4 months ago
- Distributed inference for MLX LLMs ☆100 · Updated last year
- Examples on how to use various LLM providers with a Wine Classification problem ☆129 · Updated 3 months ago
- Metadspy: The framework for specifying—not programming—language models ☆88 · Updated 7 months ago
- Personal project, Generative AI, Streamlit, Python ☆54 · Updated 9 months ago
- Grammar checker with a keyboard shortcut for Ollama and Apple MLX with Automator on macOS. ☆82 · Updated last year
- For LLMs to better code with Jina API ☆175 · Updated last month
- A pure MLX-based training pipeline for fine-tuning LLMs using GRPO on Apple Silicon. ☆226 · Updated 3 months ago
- Deep research agents using MiniMax M2.1 interleaved thinking ☆194 · Updated last month
- FastMLX is a high-performance, production-ready API to host MLX models. ☆341 · Updated 10 months ago
- ☆194 · Updated 6 months ago
- For inferring and serving local LLMs using the MLX framework ☆110 · Updated last year
- The easiest way to run the fastest MLX-based LLMs locally ☆308 · Updated last year
- ☆37 · Updated 11 months ago
- Start a server from the MLX library. ☆196 · Updated last year
- Dabarqus is incredibly fast RAG that runs everywhere. ☆59 · Updated last year
- ☆114 · Updated 7 months ago
- ☆50 · Updated 5 months ago