akx / ggify
Tool to download models from the Hugging Face Hub and convert them to GGML/GGUF for llama.cpp
☆158 · Updated 3 months ago
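The listing itself carries no code, but the workflow ggify automates can be sketched in a few lines of Python: pull a model snapshot from the Hugging Face Hub, then hand it to llama.cpp's conversion script. This is a minimal sketch of that general workflow, not ggify's actual implementation; the repo id, directory paths, and the converter script name are assumptions (the script has been renamed across llama.cpp versions, with `convert_hf_to_gguf.py` being the current name).

```python
# Minimal sketch of the download-then-convert workflow that ggify automates.
# Assumptions: huggingface_hub is installed, and a llama.cpp checkout exists
# at LLAMA_CPP_DIR with its conversion script (convert_hf_to_gguf.py in
# recent versions; older trees shipped it under other names).
import subprocess
from pathlib import Path

from huggingface_hub import snapshot_download

LLAMA_CPP_DIR = Path("llama.cpp")               # assumed checkout location
REPO_ID = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # example model repo

# 1. Download the full model snapshot (config, tokenizer, weights).
local_dir = snapshot_download(repo_id=REPO_ID)

# 2. Convert the HF checkpoint to GGUF with llama.cpp's converter.
out_file = REPO_ID.split("/")[-1] + ".gguf"
subprocess.run(
    [
        "python",
        str(LLAMA_CPP_DIR / "convert_hf_to_gguf.py"),
        local_dir,
        "--outfile", out_file,
    ],
    check=True,
)
print(f"Wrote {out_file}")
```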
Alternatives and similar repositories for ggify
Users interested in ggify are comparing it to the libraries listed below.
- Download models from the Ollama library, without Ollama ☆90 · Updated 8 months ago
- LLM inference in C/C++ ☆98 · Updated last week
- Automatically quantize GGUF models (see the quantization sketch after this list) ☆190 · Updated this week
- Unsloth Studio ☆98 · Updated 4 months ago
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights. ☆64 · Updated last year
- ☆95 · Updated 7 months ago
- Phi-3.5 for Mac: Locally-run Vision and Language Models for Apple Silicon ☆271 · Updated 11 months ago
- 1.58 Bit LLM on Apple Silicon using MLX ☆217 · Updated last year
- Wheels for llama-cpp-python compiled with cuBLAS support ☆97 · Updated last year
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆53 · Updated last year
- An endpoint server for efficiently serving quantized open-source LLMs for code. ☆56 · Updated last year
- An OpenAI-compatible chat API that accepts image input and answers questions about the images, i.e. multimodal. ☆260 · Updated 5 months ago
- Distributed inference for MLX LLMs ☆94 · Updated last year
- Gemma 2 optimized for your local machine. ☆376 · Updated last year
- Inference of Large Multimodal Models in C/C++. LLaVA and others ☆47 · Updated last year
- GRDN.AI app for garden optimization ☆70 · Updated last year
- Self-host LLMs with vLLM and BentoML ☆139 · Updated last week
- Falcon LLM ggml framework with CPU and GPU support ☆246 · Updated last year
- LLaVA server (llama.cpp). ☆181 · Updated last year
- The RunPod worker template for serving our large language model endpoints. Powered by vLLM. ☆342 · Updated this week
- Minimal, clean-code implementation of RAG with MLX using GGUF model weights ☆52 · Updated last year
- ☆157 · Updated last year
- LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI. ☆128 · Updated 2 years ago
- Simple, fast, parallel Hugging Face GGML model downloader written in Python ☆24 · Updated 2 years ago
- ⚡️ A fast and flexible PyTorch inference server that runs locally, on any cloud or AI HW. ☆144 · Updated last year
- Fast parallel LLM inference for MLX ☆204 · Updated last year
- Python bindings for ggml ☆143 · Updated 11 months ago
- Dagger functions to import Hugging Face GGUF models into a local ollama instance and optionally push them to ollama.com. ☆116 · Updated last year
- Port of Facebook's LLaMA model in C/C++ ☆22 · Updated last year
- Extract structured data from local or remote LLM models ☆44 · Updated last year
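As a companion to the auto-quantization entry above: quantizing a GGUF file by hand comes down to one call to llama.cpp's quantize tool. A minimal sketch under assumed paths, not that project's code; note the binary is named `llama-quantize` in recent llama.cpp builds (older builds called it `quantize`).

```python
# Minimal sketch of quantizing a GGUF model with llama.cpp's quantize tool,
# the per-model step that bulk auto-quantization projects wrap.
# Assumptions: a built llama.cpp tree at LLAMA_CPP_DIR, and an existing
# f16 GGUF such as one produced by the conversion sketch earlier.
import subprocess
from pathlib import Path

LLAMA_CPP_DIR = Path("llama.cpp")   # assumed build location
src = "model-f16.gguf"              # assumed input file
quant_type = "Q4_K_M"               # a common quality/size trade-off
dst = f"model-{quant_type}.gguf"

# Usage: llama-quantize <input.gguf> <output.gguf> <type>
subprocess.run(
    [str(LLAMA_CPP_DIR / "llama-quantize"), src, dst, quant_type],
    check=True,
)
print(f"Wrote {dst}")
```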