ziozzang / Mac_mlx_phi-2_server
Test server code for the Phi-2 model. Supports the OpenAI API spec.
☆17 · Updated last year
Alternatives and similar repositories for Mac_mlx_phi-2_server:
Users that are interested in Mac_mlx_phi-2_server are comparing it to the libraries listed below
- Minimal, clean code implementation of RAG with mlx using gguf model weights ☆48 · Updated 9 months ago
- Run large models from the terminal using Apple MLX. ☆28 · Updated 11 months ago
- A version of BabyAGI using DSPy and typed predictors ☆17 · Updated 11 months ago
- Roberta Question Answering using MLX. ☆24 · Updated last year
- A function to do all ☆35 · Updated 10 months ago
- Examples for using the SiLLM framework for training and running Large Language Models (LLMs) on Apple Silicon ☆17 · Updated this week
- Very basic framework for parameterized large language model (Q)LoRA / (Q)DoRA fine-tuning using mlx, mlx_lm, and OgbujiPT. Architecture … ☆37 · Updated this week
- A simple MLX implementation for pretraining LLMs on Apple Silicon. ☆28 · Updated last month
- A Python command-line tool to download and manage MLX AI models from Hugging Face. ☆17 · Updated 5 months ago
- Shared personal notes created while working with the Apple MLX machine learning framework ☆21 · Updated 8 months ago
- Developer showcase of projects built on Cartesia ☆16 · Updated 5 months ago
- ☆15 · Updated last year
- ☆15 · Updated 11 months ago
- Transcribe and summarize videos using Whisper and LLMs on the Apple MLX framework ☆73 · Updated last year
- Embedding models from Jina AI ☆58 · Updated last year
- Cog wrapper for collabora/WhisperSpeech ☆25 · Updated 11 months ago
- the small distributed language model toolkit; fine-tune state-of-the-art LLMs anywhere, rapidly ☆29 · Updated 4 months ago
- A Python package for serving LLMs on OpenAI-compatible API endpoints with prompt caching using MLX. ☆72 · Updated 2 months ago
- Median is an open-source flashcard application that leverages the power of spaced repetition and artificial intelligence to transform the… ☆22 · Updated 3 months ago
- Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" from PyTorch and Zeta ☆13 · Updated 3 months ago
- Falcon40B and 7B (Instruct) with streaming, top-k, and beam search ☆40 · Updated last year
- MLX Swift implementation of Andrej Karpathy's Let's build GPT video ☆57 · Updated 10 months ago
- Grammar checker with a keyboard shortcut for Ollama and Apple MLX with Automator on macOS. ☆77 · Updated last year
- Experimenting with the text-embeddings-inference server on both CPU and GPU ☆18 · Updated last year
- A simple create-llama template using llama-index v0.10 and integrated with Ollama ☆10 · Updated 9 months ago
- ☆1 · Updated 7 months ago
- An example implementation of RLHF (or, more accurately, RLAIF) built on MLX and HuggingFace. ☆24 · Updated 8 months ago