akx / ggify
Tool to download models from Huggingface Hub and convert them to GGML/GGUF for llama.cpp
☆160 · Updated 4 months ago
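For context, the workflow that ggify automates looks roughly like the sketch below: pull a model repository from the Hugging Face Hub, then run llama.cpp's conversion script on the downloaded weights. This is a minimal illustration only, not ggify's actual code; the example model, output filename, and the converter script's name (it has appeared as `convert.py`, `convert-hf-to-gguf.py`, and `convert_hf_to_gguf.py` across llama.cpp releases) are assumptions you may need to adjust.

```python
# Minimal sketch of a Hugging Face -> GGUF conversion (not ggify's actual implementation).
# Assumes huggingface_hub is installed and a llama.cpp checkout exists at ./llama.cpp.
import subprocess
from huggingface_hub import snapshot_download

repo_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # hypothetical example model
model_dir = snapshot_download(repo_id=repo_id)  # download the full HF repo locally

# Convert the downloaded weights to GGUF with llama.cpp's converter script.
# Script name and flags vary between llama.cpp versions; adjust for your checkout.
subprocess.run(
    [
        "python", "llama.cpp/convert_hf_to_gguf.py",
        model_dir,
        "--outfile", "tinyllama-1.1b-chat.gguf",
        "--outtype", "f16",
    ],
    check=True,
)
```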
Alternatives and similar repositories for ggify
Users that are interested in ggify are comparing it to the libraries listed below
- Download models from the Ollama library, without Ollama ☆99 · Updated 10 months ago
- Falcon LLM ggml framework with CPU and GPU support ☆247 · Updated last year
- ☆161 · Updated last month
- LLM inference in C/C++ ☆102 · Updated 3 weeks ago
- Maybe the new state-of-the-art vision model? We'll see 🤷‍♂️ ☆166 · Updated last year
- A simple Jupyter Notebook for learning MLX text-completion fine-tuning! ☆121 · Updated 10 months ago
- Automatically quantize GGUF models ☆202 · Updated this week
- Phi-3.5 for Mac: Locally-run Vision and Language Models for Apple Silicon ☆272 · Updated last year
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights ☆64 · Updated last year
- For inferring and serving local LLMs using the MLX framework ☆109 · Updated last year
- LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI. ☆129 · Updated 2 years ago
- ☆116 · Updated 9 months ago
- SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework. ☆280 · Updated 3 months ago
- Scripts to create your own MoE models using MLX ☆90 · Updated last year
- GRDN.AI app for garden optimization ☆70 · Updated last year
- LLaVA server (llama.cpp). ☆182 · Updated last year
- Distributed inference for MLX LLMs ☆95 · Updated last year
- Python bindings for ggml ☆146 · Updated last year
- A fast batching API to serve LLMs ☆187 · Updated last year
- Some simple scripts that I use day-to-day when working with LLMs and Huggingface Hub ☆161 · Updated last year
- ☆67 · Updated last year
- Unofficial Python bindings for the Rust llm library. 🐍❤️🦀 ☆76 · Updated 2 years ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆52 · Updated last year
- A simple UI / Web / Frontend for MLX mlx-lm using Streamlit. ☆261 · Updated 3 months ago
- Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first app… ☆170 · Updated last year
- This is our own implementation of 'Layer Selective Rank Reduction' ☆240 · Updated last year
- Inference of Large Multimodal Models in C/C++. LLaVA and others ☆48 · Updated last year
- Gemma 2 optimized for your local machine. ☆376 · Updated last year
- Fast parallel LLM inference for MLX ☆217 · Updated last year
- Dagger functions to import Hugging Face GGUF models into a local Ollama instance and optionally push them to ollama.com. ☆117 · Updated last year