ml-explore / mlx
MLX: An array framework for Apple silicon
☆20,457 Updated this week
Alternatives and similar repositories for mlx:
Users who are interested in mlx are comparing it to the libraries listed below:
- Examples in the MLX framework ☆7,363 Updated last week
- LLM inference in C/C++ ☆79,335 Updated this week
- Inference Llama 2 in one file of pure C ☆18,345 Updated 9 months ago
- Python bindings for llama.cpp ☆9,063 Updated 3 weeks ago
- Tensor library for machine learning ☆12,445 Updated this week
- Official inference library for Mistral models ☆10,203 Updated last month
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆41,081 Updated 4 months ago
- A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API ☆11,783 Updated 9 months ago
- A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training ☆21,842 Updated 8 months ago
- DSPy: The framework for programming—not prompting—language models ☆24,061 Updated this week
- Universal LLM Deployment Engine with ML Compilation ☆20,548 Updated last week
- LLM training in simple, raw C/CUDA ☆26,518 Updated last week
- High-speed Large Language Model Serving for Local Deployment ☆8,191 Updated 2 months ago
- 🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading ☆9,605 Updated 8 months ago
- Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sag… ☆21,842 Updated this week
- Inference code for CodeLlama models ☆16,293 Updated 8 months ago
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens. ☆8,463 Updated last year
- Distribute and run LLMs with a single file. ☆22,335 Updated 2 weeks ago
- Development repository for the Triton language and compiler ☆15,447 Updated this week
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization. ☆9,620 Updated 10 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆46,848 Updated this week
- Inference code for Llama models ☆58,197 Updated 3 months ago
- Modeling, training, eval, and inference code for OLMo ☆5,560 Updated last week
- Finetune Qwen3, Llama 4, TTS, DeepSeek-R1 & Gemma 3 LLMs 2x faster with 70% less memory! 🦥 ☆38,242 Updated this week
- Fast and memory-efficient exact attention ☆17,259 Updated last week
- A modular graph-based Retrieval-Augmented Generation (RAG) system ☆24,976 Updated last week
- ☆8,612 Updated 7 months ago
- LLMs built upon Evol Instruct: WizardLM, WizardCoder, WizardMath ☆9,391 Updated 9 months ago
- Instruct-tune LLaMA on consumer hardware ☆18,902 Updated 9 months ago
- An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. ☆38,535 Updated 3 weeks ago