akx / ggify
A tool to download models from the Hugging Face Hub and convert them to GGML/GGUF for llama.cpp
☆165 · Updated 7 months ago
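The download-and-convert pipeline that ggify automates can be sketched manually with the Hugging Face CLI and llama.cpp's conversion tools. This is an illustrative sketch, not ggify's actual invocation: the repo id is a placeholder, and the exact converter script and quantizer names (`convert_hf_to_gguf.py`, `llama-quantize`) depend on your llama.cpp checkout.

```shell
# Illustrative model id -- substitute any HF repo you want to convert
repo="TinyLlama/TinyLlama-1.1B-Chat-v1.0"

# 1. Download the checkpoint from the Hugging Face Hub (commented out so
#    this sketch runs without network access):
#    huggingface-cli download "$repo" --local-dir "./models/${repo##*/}"

# 2. Convert the HF checkpoint to GGUF with llama.cpp's converter:
#    python convert_hf_to_gguf.py "./models/${repo##*/}" \
#        --outtype f16 --outfile "${repo##*/}-f16.gguf"

# 3. Optionally quantize the result for a smaller memory footprint:
#    llama-quantize "${repo##*/}-f16.gguf" "${repo##*/}-Q4_K_M.gguf" Q4_K_M

# Derive the output filename from the repo id (strip the org prefix)
echo "${repo##*/}-f16.gguf"
```

ggify wraps these steps into a single command, which is why most of the repositories listed below revolve around the same HF-to-GGUF/llama.cpp workflow.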
Alternatives and similar repositories for ggify
Users who are interested in ggify are comparing it to the repositories listed below
- Download models from the Ollama library, without Ollama ☆115 · Updated last year
- A more memory-efficient rewrite of the HF Transformers implementation of Llama for use with quantized weights ☆64 · Updated 2 years ago
- LLaVA server (llama.cpp) ☆183 · Updated 2 years ago
- ☆108 · Updated 3 months ago
- LLM inference in C/C++ ☆103 · Updated this week
- Unsloth Studio ☆120 · Updated 8 months ago
- Some simple scripts that I use day-to-day when working with LLMs and the Hugging Face Hub ☆161 · Updated 2 years ago
- A Python package for serving LLMs on OpenAI-compatible API endpoints with prompt caching, using MLX ☆99 · Updated 5 months ago
- Minimal, clean implementation of RAG with MLX using GGUF model weights ☆53 · Updated last year
- Phi-3.5 for Mac: locally run vision and language models for Apple Silicon ☆275 · Updated last month
- Gemma 2 optimized for your local machine ☆378 · Updated last year
- Maybe the new state-of-the-art vision model? We'll see 🤷‍♂️ ☆167 · Updated last year
- An OpenAI-compatible API for chat with image input and questions about the images, a.k.a. multimodal ☆266 · Updated 9 months ago
- GRDN.AI app for garden optimization ☆69 · Updated 3 weeks ago
- Falcon LLM ggml framework with CPU and GPU support ☆248 · Updated last year
- ☆68 · Updated last year
- Automatically quantize GGUF models ☆218 · Updated last month
- Dagger functions to import Hugging Face GGUF models into a local Ollama instance and optionally push them to ollama.com ☆119 · Updated last year
- Unofficial Python bindings for the Rust llm library 🐍❤️🦀 ☆76 · Updated 2 years ago
- A simple Jupyter notebook for learning MLX text-completion fine-tuning! ☆122 · Updated last year
- ☆164 · Updated 4 months ago
- ☆101 · Updated last year
- Distributed inference for MLX LLMs ☆99 · Updated last year
- Pressure-testing the context window of open LLMs ☆25 · Updated last year
- Gradio-based tool to run open-source LLMs directly from Hugging Face ☆96 · Updated last year
- Easy-to-use, high-performance knowledge distillation for LLMs ☆97 · Updated 7 months ago
- For inferring and serving local LLMs using the MLX framework ☆109 · Updated last year
- LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI ☆130 · Updated 2 years ago
- FRP fork ☆176 · Updated 8 months ago
- Low-rank adapter extraction for fine-tuned transformer models ☆179 · Updated last year