Review/Check GGUF files and estimate the memory usage and maximum tokens per second.
☆269Mar 25, 2026Updated 2 months ago
Alternatives and similar repositories for gguf-parser-go
Users who are interested in gguf-parser-go are comparing it to the libraries listed below.
- LM inference server implementation based on *.cpp.☆294Nov 24, 2025Updated 6 months ago
- A GPU cluster manager that configures and orchestrates inference engines like vLLM and SGLang for high-performance AI model deployment.☆5,052Updated this week
- LLM inference in C/C++☆23Oct 4, 2024Updated last year
- Scripts and tools for optimizing quantizations in llama.cpp with GGUF imatrices.☆19Jan 10, 2025Updated last year
- Automatically quantize GGUF models☆227Dec 23, 2025Updated 5 months ago
- A Go implementation of the HOTP (RFC 4226) and TOTP (RFC 6238) algorithms.☆21May 21, 2026Updated last week
- Accepts a Hugging Face model URL, automatically downloads and quantizes it using Bits and Bytes.☆38Mar 12, 2024Updated 2 years ago
- llama.cpp fork with additional SOTA quants and improved performance☆2,554May 23, 2026Updated last week
- Go as a shader language: converts Go code to SPIR-V via HLSL☆11Jan 9, 2024Updated 2 years ago
- A reference of text models that can be used in the AI Horde☆12May 22, 2026Updated last week
- A Go wrapper around the rwkv.cpp library☆20Mar 4, 2024Updated 2 years ago
- Docker/podman container for llama.cpp/vllm/exllamav{2,3} orchestrated using llama-swap☆18Apr 10, 2026Updated last month
- A CLI/MCP server tool for managing MCP server JSON config files with version control, profiles, and multi-client support☆10Feb 24, 2025Updated last year
- SPLAA is an AI assistant framework that utilizes voice recognition, text-to-speech, and tool-calling capabilities to provide a conversati…☆29May 6, 2025Updated last year
- An unofficial collection of precompiled WebP binaries for all of Apple's current platforms.☆20Apr 19, 2022Updated 4 years ago
- C# ONNX Runtime SDK - Text, Image, Video and Audio supported☆20May 15, 2026Updated 2 weeks ago
- AirLLM 70B inference with single 4GB GPU☆20Jun 27, 2025Updated 11 months ago
- A fuzzy finder with GUI (inspired by fzf)☆14Oct 29, 2020Updated 5 years ago
- llama.cpp gguf file parser for javascript☆50Dec 11, 2024Updated last year
- Local transcription and speaker diarization with pyannote and parakeet☆31May 23, 2026Updated last week
- A wrapper shard for llama.cpp that acts as a client to work directly with AI models through llama.cpp from within Crystal applications☆19Jan 23, 2025Updated last year
- Standalone, local-runnable binaries of popular linux distributions☆10Dec 15, 2021Updated 4 years ago
- Jinja2 chat templates for popular LLM models☆48Jun 21, 2024Updated last year
- Vaak is an AI-enabled dictation keyboard. In Punjabi, Vaak refers to utterance or speech.☆24Oct 9, 2025Updated 7 months ago
- Tiny Llama model trained to play chess☆30Jul 22, 2025Updated 10 months ago
- A simple no-install web UI for Ollama and OAI-Compatible APIs!☆31Jan 30, 2025Updated last year
- Inference RWKV v7 in pure C.☆44Oct 10, 2025Updated 7 months ago
- Serving LLMs in the HF-Transformers format via a PyFlask API☆72Sep 10, 2024Updated last year
- GGUF implementation in C as a library and a tools CLI program☆327May 16, 2026Updated last week
- ☆30May 11, 2026Updated 2 weeks ago
- Structured, temporal memory for AI agents.☆79May 18, 2026Updated last week
- SubFinder is an open-source tool designed to combat internet censorship☆20Jan 3, 2026Updated 4 months ago
- Pure C++ implementation of several models for real-time chatting on your computer (CPU & GPU)☆882May 21, 2026Updated last week
- ☆24Jan 22, 2025Updated last year
- A text-to-speech and speech-to-text server compatible with the OpenAI API, supporting Whisper, FunASR, Bark, and CosyVoice backends.☆211Dec 23, 2025Updated 5 months ago
- This is the repo with the code to conduct a comparative analysis of different audio representation models.☆12Aug 31, 2023Updated 2 years ago
- Produce your own Dynamic 3.0 Quants and achieve optimum accuracy & SOTA quantization performance! Input a target size and the toolchain w…☆132Updated this week
- ☆20Sep 22, 2025Updated 8 months ago
- MindWork AI Studio is a free, independent cross-platform desktop app for local and cloud LLMs across providers, built to democratize AI a…☆484May 21, 2026Updated last week