akx / ggify
Tool to download models from Hugging Face Hub and convert them to GGML/GGUF for llama.cpp
☆152 · Updated last month
Alternatives and similar repositories for ggify
Users interested in ggify are comparing it to the repositories listed below.
- Download models from the Ollama library, without Ollama ☆86 · Updated 7 months ago
- ☆157 · Updated 11 months ago
- Smart proxy for LLM APIs that enables model-specific parameter control, automatic mode switching (like Qwen3's /think and /no_think), and… ☆48 · Updated last month
- LLM inference in C/C++ ☆77 · Updated this week
- LLaVA server (llama.cpp). ☆180 · Updated last year
- For inferring and serving local LLMs using the MLX framework ☆104 · Updated last year
- ☆114 · Updated 6 months ago
- 1.58 Bit LLM on Apple Silicon using MLX ☆214 · Updated last year
- A python package for serving LLM on OpenAI-compatible API endpoints with prompt caching using MLX. ☆86 · Updated this week
- This is our own implementation of 'Layer Selective Rank Reduction' ☆239 · Updated last year
- Phi-3.5 for Mac: Locally-run Vision and Language Models for Apple Silicon ☆268 · Updated 9 months ago
- ☆95 · Updated 6 months ago
- Falcon LLM ggml framework with CPU and GPU support ☆246 · Updated last year
- Python bindings for ggml ☆141 · Updated 9 months ago
- Fast parallel LLM inference for MLX ☆193 · Updated 11 months ago
- A fast batching API to serve LLM models ☆183 · Updated last year
- A simple UI / Web / Frontend for MLX mlx-lm using Streamlit. ☆256 · Updated 2 weeks ago
- ☆66 · Updated last year
- automatically quant GGUF models ☆184 · Updated last week
- FastMLX is a high performance production ready API to host MLX models. ☆308 · Updated 3 months ago
- These are performance benchmarks we did to prepare for our own privacy-preserving and NDA-compliant in-house AI coding assistant. If by a… ☆25 · Updated 2 months ago
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI ☆222 · Updated last year
- A fast minimalistic implementation of guided generation on Apple Silicon using Outlines and MLX ☆55 · Updated last year
- GRDN.AI app for garden optimization ☆70 · Updated last year
- ☆101 · Updated 9 months ago
- Maybe the new state of the art vision model? we'll see 🤷‍♂️ ☆165 · Updated last year
- Extend the original llama.cpp repo to support redpajama model. ☆118 · Updated 9 months ago
- Start a server from the MLX library. ☆187 · Updated 11 months ago
- MLX-Embeddings is the best package for running Vision and Language Embedding models locally on your Mac using MLX. ☆174 · Updated 3 weeks ago
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights. ☆64 · Updated last year