akx / ggify
Tool to download models from Huggingface Hub and convert them to GGML/GGUF for llama.cpp
☆159 · Updated 5 months ago
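As a rough illustration of the download-then-convert workflow a tool like ggify automates, the sketch below fetches a model repo from the Hugging Face Hub and invokes llama.cpp's `convert_hf_to_gguf.py` on it. This is a minimal sketch, not ggify's actual implementation: the helper names `build_convert_cmd` and `download_and_convert` are hypothetical, and it assumes `convert_hf_to_gguf.py` from a llama.cpp checkout is available in the working directory.

```python
# Hedged sketch of the download-then-convert workflow (hypothetical helpers;
# assumes llama.cpp's convert_hf_to_gguf.py is present in the working dir).
import subprocess
from pathlib import Path

def build_convert_cmd(model_dir: str, outfile: str, outtype: str = "f16") -> list[str]:
    """Assemble the llama.cpp HF-to-GGUF conversion command line."""
    return [
        "python", "convert_hf_to_gguf.py", str(Path(model_dir)),
        "--outfile", outfile,
        "--outtype", outtype,
    ]

def download_and_convert(repo_id: str, outfile: str) -> None:
    # snapshot_download fetches the full model repo into the local
    # Hugging Face cache and returns the directory it landed in.
    from huggingface_hub import snapshot_download
    model_dir = snapshot_download(repo_id=repo_id)
    subprocess.run(build_convert_cmd(model_dir, outfile), check=True)
```

Splitting command construction from execution keeps the pure part testable without touching the network.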
Alternatives and similar repositories for ggify
Users interested in ggify are comparing it to the repositories listed below.
- Download models from the Ollama library, without Ollama ☆101 · Updated 10 months ago
- Automatically quantize GGUF models ☆210 · Updated last week
- ☆102 · Updated last month
- LLM inference in C/C++ ☆102 · Updated last month
- Phi-3.5 for Mac: Locally-run Vision and Language Models for Apple Silicon ☆272 · Updated last year
- Falcon LLM ggml framework with CPU and GPU support ☆247 · Updated last year
- A simple Jupyter Notebook for learning MLX text-completion fine-tuning! ☆122 · Updated 11 months ago
- Dagger functions to import Hugging Face GGUF models into a local Ollama instance and optionally push them to ollama.com ☆117 · Updated last year
- Distributed inference for MLX LLMs ☆96 · Updated last year
- Smart proxy for LLM APIs that enables model-specific parameter control, automatic mode switching (like Qwen3's /think and /no_think), and… ☆50 · Updated 4 months ago
- A Python package for serving LLMs on OpenAI-compatible API endpoints with prompt caching, using MLX ☆96 · Updated 3 months ago
- A more memory-efficient rewrite of the HF Transformers implementation of Llama for use with quantized weights ☆63 · Updated last year
- For inferring and serving local LLMs using the MLX framework ☆109 · Updated last year
- An OpenAI-compatible API for chat with image input and questions about the images, i.e. multimodal ☆261 · Updated 7 months ago
- An endpoint server for efficiently serving quantized open-source LLMs for code ☆57 · Updated last year
- Unsloth Studio ☆110 · Updated 6 months ago
- Inference of large multimodal models in C/C++ (LLaVA and others) ☆48 · Updated 2 years ago
- ☆162 · Updated 2 months ago
- A fast batching API for serving LLMs ☆187 · Updated last year
- Deploy your GGML models to Hugging Face Spaces with Docker and Gradio ☆37 · Updated 2 years ago
- 1.58-bit LLM on Apple Silicon using MLX ☆223 · Updated last year
- LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI ☆130 · Updated 2 years ago
- Examples of models deployable with Truss ☆205 · Updated last week
- Let's create synthetic textbooks together :) ☆75 · Updated last year
- AirLLM 70B inference with a single 4GB GPU ☆14 · Updated 3 months ago
- LLaVA server (llama.cpp) ☆183 · Updated last year
- Scripts to create your own MoE models using MLX ☆90 · Updated last year
- Experimental LLM inference UX to aid in creative writing ☆122 · Updated 9 months ago
- Low-rank adapter extraction for fine-tuned Transformers models ☆178 · Updated last year
- Generate train.jsonl and valid.jsonl files for fine-tuning Mistral and other LLMs ☆97 · Updated last year