taylorai / mlx_embedding_models
run embeddings in MLX
☆72 · Updated last month
Related projects
Alternatives and complementary repositories for mlx_embedding_models
- MLX-Embeddings is the best package for running Vision and Language Embedding models locally on your Mac using MLX. ☆75 · Updated 3 weeks ago
- ☆38 · Updated 7 months ago
- Explore a simple example of using MLX for a RAG application running locally on your Apple Silicon device. ☆146 · Updated 9 months ago
- MLX Swift implementation of Andrej Karpathy's "Let's build GPT" video ☆53 · Updated 6 months ago
- Fast parallel LLM inference for MLX ☆146 · Updated 4 months ago
- Scripts to create your own MoE models using MLX ☆86 · Updated 8 months ago
- MLX Transformers is a library that provides model implementation in MLX. It uses a similar model interface as HuggingFace Transformers an… ☆51 · Updated 2 months ago
- ☆103 · Updated 7 months ago
- look how they massacred my boy ☆53 · Updated 3 weeks ago
- Port of Suno's Bark TTS transformer in Apple's MLX Framework ☆71 · Updated 8 months ago
- Simple examples using Argilla tools to build AI ☆38 · Updated this week
- Very basic framework for parameterized large language model (Q)LoRA fine-tuning using mlx, mlx_lm, and OgbujiPT. Architecture for system… ☆34 · Updated this week
- MLX port for xjdr's entropix sampler (mimics JAX implementation) ☆53 · Updated this week
- ☆48 · Updated last year
- Start a server from the MLX library. ☆159 · Updated 3 months ago
- ☆64 · Updated 5 months ago
- MLX implementations of various transformers, speedups, training ☆34 · Updated 10 months ago
- Port of Andrej Karpathy's nanoGPT to Apple MLX framework. ☆97 · Updated 8 months ago
- For inferring and serving local LLMs using the MLX framework ☆89 · Updated 7 months ago
- MLX image models for Apple Silicon machines ☆66 · Updated 6 months ago
- AI API implementation for Mac which supports tool-calling & other structured LLM response generation (e.g. conform to JSON schema) ☆93 · Updated this week
- FastMLX is a high-performance, production-ready API to host MLX models. ☆212 · Updated last week
- A Python package for serving LLMs on OpenAI-compatible API endpoints with prompt caching using MLX. ☆51 · Updated last week
- RAG example using DSPy, Gradio, FastAPI ☆64 · Updated 6 months ago
- An implementation of Self-Extend, to expand the context window via grouped attention ☆118 · Updated 10 months ago
- Minimal, clean code implementation of RAG with MLX using GGUF model weights ☆43 · Updated 6 months ago
- Build a Streamlit Chatbot using Langchain, ColBERT, Ragatouille, and ChromaDB ☆116 · Updated 9 months ago
- Using modal.com to process FineWeb-edu data ☆19 · Updated 2 months ago
- ☆96 · Updated 2 months ago
- 🤖 Headless IDE for AI agents ☆128 · Updated this week