akx / ggify
Tool to download models from Huggingface Hub and convert them to GGML/GGUF for llama.cpp
☆161 · Updated 6 months ago
Alternatives and similar repositories for ggify
Users interested in ggify are comparing it to the libraries listed below.
- Download models from the Ollama library, without Ollama ☆104 · Updated 11 months ago
- Automatically quantize GGUF models ☆214 · Updated last week
- ☆104 · Updated 2 months ago
- A more memory-efficient rewrite of the HF Transformers implementation of Llama for use with quantized weights. ☆63 · Updated 2 years ago
- LLaVA server (llama.cpp). ☆183 · Updated 2 years ago
- Some simple scripts that I use day-to-day when working with LLMs and the Huggingface Hub ☆160 · Updated 2 years ago
- Falcon LLM ggml framework with CPU and GPU support ☆247 · Updated last year
- ☆162 · Updated 2 months ago
- A fast batching API to serve LLM models ☆188 · Updated last year
- Dagger functions to import Hugging Face GGUF models into a local Ollama instance and optionally push them to ollama.com. ☆119 · Updated last year
- Low-rank adapter extraction for fine-tuned transformer models ☆178 · Updated last year
- LLM inference in C/C++ ☆103 · Updated 2 months ago
- An OpenAI-API-compatible LLM inference server based on ExLlamaV2. ☆25 · Updated last year
- For inferring and serving local LLMs using the MLX framework ☆109 · Updated last year
- An OpenAI-compatible API for chat with image input and questions about the images, aka multimodal. ☆264 · Updated 7 months ago
- Unsloth Studio ☆113 · Updated 6 months ago
- Gradio-based tool to run open-source LLM models directly from Huggingface ☆96 · Updated last year
- Easily convert HuggingFace models to GGUF format for llama.cpp ☆23 · Updated last year
- Phi-3.5 for Mac: locally-run vision and language models for Apple Silicon ☆273 · Updated last year
- A simple Jupyter notebook for learning MLX text-completion fine-tuning! ☆122 · Updated 11 months ago
- Distributed inference for MLX LLMs ☆97 · Updated last year
- Run Ollama & GGUF models easily with a single command ☆52 · Updated last year
- Experimental LLM inference UX to aid in creative writing ☆123 · Updated 10 months ago
- Smart proxy for LLM APIs that enables model-specific parameter control, automatic mode switching (like Qwen3's /think and /no_think), and… ☆51 · Updated 5 months ago
- Inference of large multimodal models in C/C++: LLaVA and others ☆48 · Updated 2 years ago
- Gemma 2 optimized for your local machine. ☆377 · Updated last year
- Examples of models deployable with Truss ☆207 · Updated this week
- Python package wrapping llama.cpp for on-device LLM inference ☆92 · Updated 3 weeks ago
- The RunPod worker template for serving our large language model endpoints. Powered by vLLM. ☆375 · Updated this week
- AnyModal is a flexible multimodal language model framework for PyTorch ☆102 · Updated 10 months ago