GGUF implementation in C as a library and a tools CLI program
☆311 · Aug 28, 2025 · Updated 7 months ago
Alternatives and similar repositories for gguf-tools
Users that are interested in gguf-tools are comparing it to the libraries listed below.
- A small utility library for parsing GGUF file info ☆29 · Jan 27, 2025 · Updated last year
- Some random tools for working with the GGUF file format ☆31 · Nov 24, 2023 · Updated 2 years ago
- A fork of llama3.c used to do some R&D on inferencing ☆22 · Dec 20, 2024 · Updated last year
- ggml implementation of BERT ☆500 · Feb 23, 2024 · Updated 2 years ago
- GGUF parser in Python ☆28 · Aug 15, 2024 · Updated last year
- Inference of Mamba, Mamba2 and Mamba3 models in pure C ☆199 · Mar 18, 2026 · Updated 3 weeks ago
- CLIP inference in plain C/C++ with no extra dependencies ☆557 · Jun 19, 2025 · Updated 9 months ago
- Fast neural codec compression and generation for audio waveforms ☆230 · Dec 4, 2024 · Updated last year
- Port of Microsoft's BioGPT in C/C++ using ggml ☆86 · Feb 21, 2024 · Updated 2 years ago
- Suno AI's Bark model in C/C++ for fast text-to-speech generation ☆854 · Nov 16, 2024 · Updated last year
- Recreation of the BBC News Map that allows for quick selection of counties and towns ☆23 · Oct 19, 2011 · Updated 14 years ago
- Local ML voice chat using high-end models. ☆185 · Apr 3, 2026 · Updated last week
- Tensor library for machine learning ☆14,394 · Updated this week
- HC-256 Stream cipher in x86 assembly ☆19 · Nov 14, 2017 · Updated 8 years ago
- GGML implementation of BERT model with Python bindings and quantization. ☆57 · Feb 19, 2024 · Updated 2 years ago
- iterate quickly with llama.cpp hot reloading. use the llama.cpp bindings with bun.sh ☆50 · Oct 30, 2023 · Updated 2 years ago
- Inference Vision Transformer (ViT) in plain C/C++ with ggml ☆313 · Apr 11, 2024 · Updated 2 years ago
- INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model ☆1,567 · Mar 23, 2025 · Updated last year
- Diffusion model(SD,Flux,Wan,Qwen Image,Z-Image,...) inference in pure C/C++ ☆5,726 · Updated this week
- Cross-platform binary launcher with Cosmopolitan libc ☆34 · Apr 12, 2025 · Updated last year
- A collection of some lockfree datastructures ☆80 · Apr 20, 2023 · Updated 2 years ago
- Fixed-point scalar and matrix multiplication library for SectorLISP ☆15 · Jan 23, 2022 · Updated 4 years ago
- ☆131 · Nov 9, 2024 · Updated last year
- run ollama & gguf easily with a single command ☆52 · May 15, 2024 · Updated last year
- A collection of experiments related to LLM inference with llama.cpp/mlx ☆40 · Updated this week
- A new city of code on a cosmopolitan foundation. ☆21 · Mar 19, 2021 · Updated 5 years ago
- lightweight, standalone C++ inference engine for Google's Gemma models. ☆6,846 · Updated this week
- WebAssembly binding for llama.cpp - Enabling on-browser LLM inference ☆1,029 · Dec 17, 2025 · Updated 3 months ago
- Inference Llama 2 in one file of pure C ☆19,379 · Aug 6, 2024 · Updated last year
- FlashAttention (Metal Port) ☆598 · Sep 22, 2024 · Updated last year
- tsellm: LLMs in SQLite and DuckDB ☆24 · Apr 21, 2025 · Updated 11 months ago
- ☆65 · Aug 19, 2024 · Updated last year
- A simple MLX implementation for pretraining LLMs on Apple Silicon. ☆85 · Aug 20, 2025 · Updated 7 months ago
- Implementation of ModernBERT in MLX ☆20 · Jan 7, 2026 · Updated 3 months ago
- Temporary mail - Keep your real mailbox clean and secure. Temp Mail provides temporary, secure, anonymous, free, disposable email address… ☆13 · Mar 17, 2023 · Updated 3 years ago
- Pure C++ implementation of several models for real-time chatting on your computer (CPU & GPU) ☆855 · Apr 3, 2026 · Updated last week
- Port of Andrej Karpathy's nanoGPT to Apple MLX framework. ☆120 · Feb 12, 2024 · Updated 2 years ago
- Using modal.com to process FineWeb-edu data ☆20 · Apr 6, 2026 · Updated last week
- Yet Another (LLM) Web UI, made with Gemini ☆12 · Dec 25, 2024 · Updated last year