ngxson / ggml-easy
Thin wrapper around GGML to make life easier
☆40 · Updated 2 months ago
Alternatives and similar repositories for ggml-easy
Users that are interested in ggml-easy are comparing it to the libraries listed below
- Python bindings for ggml ☆146 · Updated 11 months ago
- Experiments with BitNet inference on CPU ☆54 · Updated last year
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆170 · Updated 3 weeks ago
- GGML implementation of BERT model with Python bindings and quantization. ☆57 · Updated last year
- Inference of Mamba models in pure C ☆190 · Updated last year
- Use safetensors with ONNX 🤗 ☆69 · Updated last month
- Course Project for COMP4471 on RWKV ☆17 · Updated last year
- Video+code lecture on building nanoGPT from scratch ☆69 · Updated last year
- Simple high-throughput inference library ☆127 · Updated 3 months ago
- Prepare for DeepSeek R1 inference: Benchmark CPU, DRAM, SSD, iGPU, GPU, ... with efficient code. ☆73 · Updated 6 months ago
- TTS support with GGML ☆147 · Updated last week
- Lightweight toolkit package to train and fine-tune 1.58bit Language models ☆83 · Updated 3 months ago
- AirLLM 70B inference with single 4GB GPU ☆14 · Updated 2 months ago
- RWKV-7: Surpassing GPT ☆94 · Updated 9 months ago
- ☆59 · Updated last month
- Train your own small bitnet model ☆75 · Updated 10 months ago
- implementation of https://arxiv.org/pdf/2312.09299 ☆21 · Updated last year
- Easy to use, High Performant Knowledge Distillation for LLMs ☆92 · Updated 3 months ago
- Inference RWKV v7 in pure C. ☆38 · Updated this week
- ☆30 · Updated 5 months ago
- Query-agnostic KV cache eviction: 3–4× reduction in memory and 2× decrease in latency (Qwen3/2.5, Gemma3, LLaMA3) ☆98 · Updated last month
- Port of Microsoft's BioGPT in C/C++ using ggml ☆86 · Updated last year
- Nexusflow function call, tool use, and agent benchmarks. ☆29 · Updated 8 months ago
- cortex.llamacpp is a high-efficiency C++ inference engine for edge computing. It is a dynamic library that can be loaded by any server a… ☆42 · Updated last month
- 1.58 Bit LLM on Apple Silicon using MLX ☆221 · Updated last year
- Yet Another (LLM) Web UI, made with Gemini ☆12 · Updated 8 months ago
- Inference of Large Multimodal Models in C/C++. LLaVA and others ☆48 · Updated last year
- Modified Mamba code to run on CPU ☆31 · Updated last year
- Profile your CoreML models directly from Python 🐍 ☆28 · Updated 10 months ago
- A ggml (C++) re-implementation of tortoise-tts ☆187 · Updated last year