Review/Check GGUF files and estimate the memory usage and maximum tokens per second.
☆270Jun 5, 2026Updated 2 weeks ago
Alternatives and similar repositories for gguf-parser-go
Users that are interested in gguf-parser-go are comparing it to the libraries listed below.
- A GPU cluster manager that configures and orchestrates inference engines like vLLM and SGLang for high-performance AI model deployment.☆5,173Updated this week
- LLM inference in C/C++☆23Oct 4, 2024Updated last year
- Scripts and tools for optimizing quantizations in llama.cpp with GGUF imatrices.☆19Jan 10, 2025Updated last year
- automatically quant GGUF models☆227Dec 23, 2025Updated 5 months ago
- llama.cpp fork with additional SOTA quants and improved performance☆2,737Updated this week
- Kubernetes operator for local LLM inference with llama.cpp, vLLM, TGI, and mlx-server — multi-GPU NVIDIA + Apple Silicon Metal, autoscali…☆129Updated this week
- DuckDuckGo Web Search MCP Server - A simple web search implementation for Claude Desktop using DuckDuckGo API☆12Dec 1, 2024Updated last year
- A reference of text models that can be used in the AI Horde☆12May 31, 2026Updated 2 weeks ago
- a cli/mcp server tool for managing mcp server json config file with version control, profiles and multi-client support☆10Feb 24, 2025Updated last year
- AI Assistant☆21Feb 21, 2026Updated 3 months ago
- Simple Web Extension to open a page with minimal browser ui☆21Mar 26, 2022Updated 4 years ago
- Open-source LLM/VLM load balancer and serving platform for self-hosting LLMs (and VLMs) at scale 🏓🦙 Alternative to projects like llm-d,…☆1,603Updated this week
- SPLAA is an AI assistant framework that utilizes voice recognition, text-to-speech, and tool-calling capabilities to provide a conversati…☆29May 6, 2025Updated last year
- An unofficial collection of precompiled WebP binaries for all of Apple's current platforms.☆20Apr 19, 2022Updated 4 years ago
- C# Generative AI SDK - Text, Image, Video and Audio supported☆26Updated this week
- llama.cpp gguf file parser for javascript☆50Dec 11, 2024Updated last year
- A wrapper shard for llama.cpp that acts as a client to work directly with AI models through llama.cpp from within Crystal applications☆19Updated this week
- Testing KAN-based text generation GPT models☆19May 6, 2024Updated 2 years ago
- Standalone, local-runnable binaries of popular linux distributions☆10Dec 15, 2021Updated 4 years ago
- Jinja2 chat templates for popular LLM models☆49Jun 21, 2024Updated last year
- Azuki is a free text editor engine written in C# 2.0. This is an extended version created by forking the original on GitHub.☆11Feb 26, 2023Updated 3 years ago
- Package alg provides access to Linux AF_ALG sockets for communication with the Linux kernel crypto API. MIT Licensed.☆17May 11, 2021Updated 5 years ago
- Tiny Llama model trained to play chess☆30Jul 22, 2025Updated 10 months ago
- Serving LLMs in the HF-Transformers format via a PyFlask API☆72Sep 10, 2024Updated last year
- GGUF implementation in C as a library and a tools CLI program☆336May 16, 2026Updated last month
- OpenPipe Reinforcement Learning Experiments☆33Mar 14, 2025Updated last year
- WiP: Traefik plugin on fail2ban middleware using XDP eBPF to drop packets☆11Sep 3, 2022Updated 3 years ago
- Structured, temporal memory for AI agents.☆85May 18, 2026Updated last month
- Pure C++ implementation of several models for real-time chatting on your computer (CPU & GPU)☆896Jun 10, 2026Updated last week
- ☆24Jan 22, 2025Updated last year
- Produce your own Dynamic 3.0 Quants and achieve optimum accuracy & SOTA quantization performance! Input a target size and the toolchain w…☆139Updated this week
- An MCP Server to enable global access to Rememberizer☆35Apr 17, 2026Updated 2 months ago
- Browse, search, and visualize ONNX models.☆35May 6, 2025Updated last year
- MindWork AI Studio is a free, independent cross-platform desktop app for local and cloud LLMs across providers, built to democratize AI a…☆497Jun 11, 2026Updated last week
- Llama.cpp runner/swapper and proxy that emulates LMStudio / Ollama backends☆59Aug 21, 2025Updated 9 months ago
- Get the Highest Android UI performance! XmlByPass is an annotationProcessor library for Android which auto generates the java code of you…☆14Nov 26, 2025Updated 6 months ago
- A TTS model capable of generating ultra-realistic dialogue in one pass.☆32May 1, 2025Updated last year
- Create text chunks which end at natural stopping points without using a tokenizer☆26Nov 26, 2025Updated 6 months ago
- WebAssembly binding for llama.cpp - Enabling on-browser LLM inference☆1,120Updated this week