akx / ggify
Tool to download models from Huggingface Hub and convert them to GGML/GGUF for llama.cpp
☆158Updated 4 months ago
Alternatives and similar repositories for ggify
Users that are interested in ggify are comparing it to the libraries listed below
- Download models from the Ollama library, without Ollama☆93Updated 9 months ago
- Unsloth Studio☆101Updated 4 months ago
- LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI.☆129Updated 2 years ago
- A fast batching API to serve LLM models☆185Updated last year
- GRDN.AI app for garden optimization☆70Updated last year
- Falcon LLM ggml framework with CPU and GPU support☆247Updated last year
- LLaVA server (llama.cpp).☆181Updated last year
- LLM inference in C/C++☆101Updated this week
- 1.58 Bit LLM on Apple Silicon using MLX☆221Updated last year
- An OpenAI API compatible API for chat with image input and questions about the images. aka Multimodal.☆259Updated 5 months ago
- Phi-3.5 for Mac: Locally-run Vision and Language Models for Apple Silicon☆272Updated 11 months ago
- Something similar to Apple Intelligence?☆61Updated last year
- Distributed Inference for MLX LLM☆93Updated last year
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights.☆64Updated last year
- For inferring and serving local LLMs using the MLX framework☆109Updated last year
- A Python package for serving LLMs on OpenAI-compatible API endpoints with prompt caching using MLX.☆92Updated 2 months ago
- ☆95Updated last week
- Automatically quantize GGUF models☆196Updated this week
- Practical and advanced guide to LLMOps. It provides a solid understanding of large language models’ general concepts, deployment techniqu…☆74Updated last year
- Smart proxy for LLM APIs that enables model-specific parameter control, automatic mode switching (like Qwen3's /think and /no_think), and…☆49Updated 3 months ago
- ☆161Updated 3 weeks ago
- A high-throughput and memory-efficient inference and serving engine for LLMs☆52Updated last year
- API Server for Transformer Lab☆72Updated this week
- ☆116Updated 8 months ago
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch☆102Updated 8 months ago
- Some simple scripts that I use day-to-day when working with LLMs and Huggingface Hub☆162Updated last year
- ☆38Updated last year
- Gemma 2 optimized for your local machine.☆376Updated last year
- Experimental LLM Inference UX to aid in creative writing☆120Updated 8 months ago
- Easily convert HuggingFace models to GGUF-format for llama.cpp☆22Updated last year