☆38Mar 12, 2024Updated 2 years ago
Alternatives and similar repositories for mlx-lora
Users that are interested in mlx-lora are comparing it to the libraries listed below.
- MLX Image Models☆24Mar 14, 2024Updated 2 years ago
- run embeddings in MLX☆98Sep 27, 2024Updated last year
- Simple Implementation of a Transformer in the new framework MLX by Apple☆19Nov 18, 2024Updated last year
- Gradio chat interface for FastMLX☆12Sep 22, 2024Updated last year
- For inferring and serving local LLMs using the MLX framework☆114Mar 24, 2024Updated 2 years ago
- Karpathy's llama2.c transpiled to MLX for Apple Silicon☆14Dec 28, 2023Updated 2 years ago
- A python package for serving LLM on OpenAI-compatible API endpoints with prompt caching using MLX.☆102Jun 29, 2025Updated 9 months ago
- A CLI in Rust to generate synthetic data for MLX friendly training☆25Jan 13, 2024Updated 2 years ago
- Large Language Models (LLMs) applications and tools running on Apple Silicon in real-time with Apple MLX.☆462Jan 29, 2025Updated last year
- 🧠 Retrieval Augmented Generation (RAG) example☆19Feb 19, 2026Updated last month
- A tiny server to run local inference on MLX model in the style of OpenAI☆13Jan 31, 2024Updated 2 years ago
- huggingface chat-ui integration with mlx-lm server☆62Feb 13, 2024Updated 2 years ago
- ☆17Nov 15, 2025Updated 4 months ago
- Run and train GPT-2 on Apple silicon☆36Feb 6, 2024Updated 2 years ago
- Multi-agent banking assistant with Python and Microsoft Agent Framework☆32Updated this week
- Generate train.jsonl and valid.jsonl files to use for fine-tuning Mistral and other LLMs.☆97Feb 5, 2024Updated 2 years ago
- Multi-threading, Concurrency, Asynchrony, and various Execution Methods implemented in a Rust backend for bleeding edge performance.☆20Nov 11, 2024Updated last year
- Announcing Instructor Cloud☆38Aug 14, 2024Updated last year
- Minimal, clean code implementation of RAG with mlx using gguf model weights☆53Apr 27, 2024Updated last year
- A collection of optimizers for MLX☆57Dec 12, 2025Updated 4 months ago
- Sentence Embedding as a Service☆15Jun 30, 2025Updated 9 months ago
- Distributed inference for MLX LLMs☆100Aug 1, 2024Updated last year
- A chat interface that implements real-time audio-to-audio translation using the gpt-4o realtime API☆22Jan 28, 2026Updated 2 months ago
- Examples for using the SiLLM framework for training and running Large Language Models (LLMs) on Apple Silicon☆16May 8, 2025Updated 11 months ago
- Minimal Claude Code alternative powered by MLX☆46Jan 11, 2026Updated 3 months ago
- ☆19Dec 9, 2023Updated 2 years ago
- Explore a simple example of utilizing MLX for RAG application running locally on your Apple Silicon device.☆180Jan 31, 2024Updated 2 years ago
- ☆36Mar 11, 2026Updated last month
- Chat²GPT is a ChatGPT (and DALL·E 2/3, and ElevenLabs) chat bot for Google Chat. 🤖💬☆11Feb 2, 2026Updated 2 months ago
- ☆67Mar 6, 2026Updated last month
- A simple GitHub Actions script that builds a llamafile and uploads it to Hugging Face☆17Jan 11, 2024Updated 2 years ago
- A simple script to enhance text editing across your Mac, leveraging the power of MLX. Designed for seamless integration, it offers real-t…☆109Mar 4, 2024Updated 2 years ago
- A simple Jupyter Notebook for learning MLX text-completion fine-tuning!☆124Nov 10, 2024Updated last year
- A client-side web API for making requests directly to users.☆21Feb 5, 2024Updated 2 years ago
- MLX implementation of GCN, with benchmark on MPS, CUDA and CPU (M1 Pro, M2 Ultra, M3 Max).☆25Dec 16, 2023Updated 2 years ago
- A tool to learn how your gpu compares to others when using ollama☆13Jan 2, 2024Updated 2 years ago
- This code implements a Local LLM Selector from the list of Local Installed Ollama LLMs for your specific user Query☆105Nov 26, 2023Updated 2 years ago
- A reinforcement learning framework based on MLX.☆254Dec 1, 2025Updated 4 months ago
- Finetune Your Local LLM☆18Sep 23, 2023Updated 2 years ago