antirez / gguf-toolsLinks

GGUF implementation in C as a library and a tools CLI program

☆291

Alternatives and similar repositories for gguf-tools

Users that are interested in gguf-tools are comparing it to the libraries listed below

Sorting:

ggml-org / p1
LLM-based code completion engine
☆190Updated 9 months ago
kroggen / mamba.c
Inference of Mamba models in pure C
☆192Updated last year
google / minja
A minimalistic C++ Jinja templating engine for LLM chat templates
☆190Updated last month
abetlen / ggml-python
Python bindings for ggml
☆146Updated last year
kolinko / effort
An implementation of bucketMul LLM inference
☆223Updated last year
a1k0n / a1gpt
throwaway GPT inference
☆140Updated last year
trzy / llava-cpp-server
LLaVA server (llama.cpp).
☆183Updated 2 years ago
skeskinen / bert.cpp
ggml implementation of BERT
☆494Updated last year
Maknee / minigpt4.cpp
Port of MiniGPT4 in C++ (4bit, 5bit, 6bit, 8bit, 16bit CPU inference with GGML)
☆568Updated 2 years ago
ml-explore / mlx-c
C API for MLX
☆144Updated last month
monatis / clip.cpp
CLIP inference in plain C/C++ with no extra dependencies
☆523Updated 4 months ago
cmp-nct / ggllm.cpp
Falcon LLM ggml framework with CPU and GPU support
☆247Updated last year
cjpais / whisperfile
☆62Updated last year
danielgross / ggml-k8s
Run GGML models with Kubernetes.
☆173Updated last year
togethercomputer / redpajama.cpp
Extend the original llama.cpp repo to support redpajama model.
☆118Updated last year
ScalingIntelligence / tokasaurus
☆443Updated 2 months ago
kayvr / token-hawk
WebGPU LLM inference tuned by hand
☆150Updated 2 years ago
philipturner / metal-flash-attention
FlashAttention (Metal Port)
☆545Updated last year
IntrinsicLabsAI / gbnfgen
TypeScript generator for llama.cpp Grammar directly from TypeScript interfaces
☆140Updated last year
PABannier / biogpt.cpp
Port of Microsoft's BioGPT in C/C++ using ggml
☆85Updated last year
iamlemec / bert.cpp
GGML implementation of BERT model with Python bindings and quantization.
☆55Updated last year
Artefact2 / llm-eval
A super simple web interface to perform blind tests on LLM outputs.
☆28Updated last year
exo-explore / mlx-bitnet
1.58 Bit LLM on Apple Silicon using MLX
☆224Updated last year
umuthopeyildirim / DOOM-Mistral
Mistral7B playing DOOM
☆138Updated last year
xyzhang626 / embeddings.cpp
ggml implementation of embedding models including SentenceTransformer and BGE
☆59Updated last year
hscspring / llama.np
Inference Llama/Llama2/Llama3 Modes in NumPy
☆21Updated last year
Cerebras / gigaGPT
a small code base for training large models
☆310Updated 5 months ago
ggerganov / vit.cpp
Inference Vision Transformer (ViT) in plain C/C++ with ggml
☆30Updated last year
willccbb / mlx_parallm
Fast parallel LLM inference for MLX
☆223Updated last year
furiousteabag / vram-calculator
Transformer GPU VRAM estimator
☆67Updated last year