akx / ggify
Tool to download models from the Hugging Face Hub and convert them to GGML/GGUF for llama.cpp
☆121 · Updated 4 months ago
Alternatives and similar repositories for ggify:
Users interested in ggify are comparing it to the libraries listed below.
- Dagger functions to import Hugging Face GGUF models into a local Ollama instance and optionally push them to ollama.com. ☆114 · Updated 8 months ago
- Download models from the Ollama library, without Ollama ☆56 · Updated 3 months ago
- Automatically quantize GGUF models ☆154 · Updated this week
- ☆75 · Updated last month
- A simple UI / web frontend for MLX mlx-lm using Streamlit. ☆241 · Updated 2 weeks ago
- A fast batching API to serve LLM models ☆180 · Updated 9 months ago
- Function calling-based LLM agents ☆282 · Updated 5 months ago
- Extract structured data from local or remote LLM models ☆41 · Updated 7 months ago
- Falcon LLM ggml framework with CPU and GPU support ☆246 · Updated last year
- Distributed inference for MLX LLMs ☆82 · Updated 6 months ago
- Pressure testing the context window of open LLMs ☆22 · Updated 5 months ago
- Inference and serving for local LLMs using the MLX framework ☆93 · Updated 10 months ago
- ☆38 · Updated 11 months ago
- Run Ollama & GGUF models easily with a single command ☆49 · Updated 9 months ago
- Open-source Perplexity-like RAG app ☆104 · Updated 2 months ago
- Unsloth Studio ☆56 · Updated 3 months ago
- Gradio-based tool to run open-source LLM models directly from Hugging Face ☆90 · Updated 7 months ago
- ☆28 · Updated 4 months ago
- The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs), allowing users to chat with LLM … ☆527 · Updated 2 months ago
- This is our own implementation of 'Layer Selective Rank Reduction' ☆232 · Updated 8 months ago
- Experimental LLM inference UX to aid in creative writing ☆111 · Updated 2 months ago
- LLaMA Server combines the power of LLaMA C++ with the beauty of Chatbot UI. ☆117 · Updated last year
- AI management tool ☆113 · Updated 3 months ago
- Serving LLMs in the HF Transformers format via a PyFlask API ☆69 · Updated 5 months ago
- A Python package for serving LLMs on OpenAI-compatible API endpoints with prompt caching, using MLX. ☆70 · Updated 2 months ago
- ☆197 · Updated 8 months ago
- klmbr: a prompt pre-processing technique to break through the barrier of entropy while generating text with LLMs ☆67 · Updated 4 months ago
- A simple speech-to-text and text-to-speech AI chatbot that can be run fully offline ☆44 · Updated last year
- Transparent proxy server for llama.cpp's server to provide automatic model swapping ☆175 · Updated this week