akx / ggify
Tool to download models from Huggingface Hub and convert them to GGML/GGUF for llama.cpp
☆161 · Updated 6 months ago
Alternatives and similar repositories for ggify
Users who are interested in ggify are comparing it to the repositories listed below.
- Download models from the Ollama library, without Ollama ☆104 · Updated 11 months ago
- Automatically quant GGUF models ☆214 · Updated last week
- Falcon LLM ggml framework with CPU and GPU support ☆247 · Updated last year
- Maybe the new state of the art vision model? We'll see 🤷‍♂️ ☆165 · Updated last year
- ☆162 · Updated 2 months ago
- LLM inference in C/C++ ☆103 · Updated 2 months ago
- An endpoint server for efficiently serving quantized open-source LLMs for code. ☆57 · Updated 2 years ago
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights. ☆63 · Updated 2 years ago
- LLaVA server (llama.cpp). ☆183 · Updated 2 years ago
- Phi-3.5 for Mac: Locally-run Vision and Language Models for Apple Silicon ☆273 · Updated last year
- The RunPod worker template for serving our large language model endpoints. Powered by vLLM. ☆375 · Updated this week
- ☆104 · Updated 2 months ago
- Gemma 2 optimized for your local machine. ☆377 · Updated last year
- A fast batching API to serve LLM models ☆188 · Updated last year
- For inferring and serving local LLMs using the MLX framework ☆109 · Updated last year
- Some simple scripts that I use day-to-day when working with LLMs and Huggingface Hub ☆160 · Updated 2 years ago
- A simple UI / Web / Frontend for MLX mlx-lm using Streamlit. ☆261 · Updated last week
- Fast parallel LLM inference for MLX ☆224 · Updated last year
- LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI. ☆130 · Updated 2 years ago
- ☆116 · Updated 10 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆52 · Updated last year
- Extract structured data from local or remote LLM models ☆49 · Updated last year
- Distributed inference for MLX LLMs ☆97 · Updated last year
- GRDN.AI app for garden optimization ☆70 · Updated last year
- An OpenAI API compatible API for chat with image input and questions about the images. aka Multimodal. ☆264 · Updated 7 months ago
- Python bindings for ggml ☆146 · Updated last year
- Start a server from the MLX library. ☆192 · Updated last year
- ☆208 · Updated last year
- Low-Rank adapter extraction for fine-tuned transformers models ☆178 · Updated last year
- 1.58 Bit LLM on Apple Silicon using MLX ☆225 · Updated last year