ngxson / ggml-easy
Thin wrapper around GGML to make life easier
☆40 · Updated last month
Alternatives and similar repositories for ggml-easy
Users interested in ggml-easy are comparing it to the libraries listed below.
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆202 · Updated 3 months ago
- Python bindings for ggml ☆146 · Updated last year
- Video+code lecture on building nanoGPT from scratch ☆68 · Updated last year
- Local Qwen3 LLM inference. One easy-to-understand file of C source with no dependencies. ☆150 · Updated 5 months ago
- High-performance FlashAttention-2 for AMD, Intel, and Apple GPUs. Drop-in replacement for PyTorch SDPA. Triton backend for ROCm (MI300X, … ☆122 · Updated this week
- Implementation of https://arxiv.org/pdf/2312.09299 ☆21 · Updated last year
- ☆34 · Updated 9 months ago
- Efficient non-uniform quantization with GPTQ for GGUF ☆57 · Updated 3 months ago
- GGML implementation of BERT model with Python bindings and quantization. ☆58 · Updated last year
- Transplants vocabulary between language models, enabling the creation of draft models for speculative decoding WITHOUT retraining. ☆47 · Updated last month
- Browse, search, and visualize ONNX models. ☆34 · Updated 7 months ago
- Experiments with BitNet inference on CPU ☆55 · Updated last year
- Course project for COMP4471 on RWKV ☆17 · Updated last year
- Simple high-throughput inference library ☆153 · Updated 7 months ago
- llama.cpp to PyTorch converter ☆34 · Updated last year
- High-throughput tensor loading for PyTorch ☆213 · Updated 3 weeks ago
- Easy-to-use, high-performance knowledge distillation for LLMs ☆97 · Updated 7 months ago
- Train your own small BitNet model ☆76 · Updated last year
- ☆62 · Updated 5 months ago
- Lightweight continuous-batching, OpenAI-compatible serving using HuggingFace Transformers, including T5 and Whisper ☆29 · Updated 9 months ago
- Lightweight toolkit to train and fine-tune 1.58-bit language models ☆104 · Updated 7 months ago
- Nexusflow function call, tool use, and agent benchmarks. ☆30 · Updated last year
- Yet Another (LLM) Web UI, made with Gemini ☆12 · Updated last year
- Inference of Mamba models in pure C ☆195 · Updated last year
- Use safetensors with ONNX 🤗 ☆78 · Updated 2 months ago
- Prepare for DeepSeek R1 inference: benchmark CPU, DRAM, SSD, iGPU, GPU, ... with efficient code ☆73 · Updated 10 months ago
- TTS support with GGML ☆202 · Updated 2 months ago
- Inference of Large Multimodal Models in C/C++. LLaVA and others ☆48 · Updated 2 years ago
- Benchmark your GPU with ease ☆28 · Updated 7 months ago
- ☆66 · Updated 6 months ago