ngxson / ggml-easy
Thin wrapper around GGML to make life easier
☆39 · Updated 3 months ago
Alternatives and similar repositories for ggml-easy
Users who are interested in ggml-easy are comparing it to the libraries listed below.
- GGML implementation of BERT model with Python bindings and quantization. ☆55 · Updated last year
- Experiments with BitNet inference on CPU ☆54 · Updated last year
- Python bindings for ggml ☆146 · Updated last year
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆187 · Updated 2 weeks ago
- Inference of Mamba models in pure C ☆191 · Updated last year
- Video+code lecture on building nanoGPT from scratch ☆68 · Updated last year
- TTS support with GGML ☆180 · Updated this week
- Use safetensors with ONNX 🤗 ☆69 · Updated last week
- 🤗 Optimum ONNX: Export your model to ONNX and run inference with ONNX Runtime ☆53 · Updated this week
- Simple high-throughput inference library ☆142 · Updated 4 months ago
- Course Project for COMP4471 on RWKV ☆17 · Updated last year
- Local Qwen3 LLM inference. One easy-to-understand file of C source with no dependencies. ☆130 · Updated 3 months ago
- GPTQ and efficient search for GGUF ☆50 · Updated 3 weeks ago
- ☆30 · Updated 6 months ago
- Transplants vocabulary between language models, enabling the creation of draft models for speculative decoding WITHOUT retraining. ☆42 · Updated last month
- Yet Another (LLM) Web UI, made with Gemini ☆12 · Updated 9 months ago
- ☆62 · Updated 2 months ago
- Nexusflow function call, tool use, and agent benchmarks. ☆29 · Updated 9 months ago
- A ggml (C++) re-implementation of tortoise-tts ☆190 · Updated last year
- Browse, search, and visualize ONNX models. ☆34 · Updated 5 months ago
- Lightweight toolkit package to train and fine-tune 1.58bit Language models ☆90 · Updated 4 months ago
- webgpu autograd library ☆32 · Updated 4 months ago
- AirLLM 70B inference with single 4GB GPU ☆14 · Updated 3 months ago
- A python package for serving LLM on OpenAI-compatible API endpoints with prompt caching using MLX. ☆95 · Updated 3 months ago
- General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). … ☆52 · Updated 7 months ago
- Inference of Large Multimodal Models in C/C++. LLaVA and others ☆48 · Updated 2 years ago
- Port of Microsoft's BioGPT in C/C++ using ggml ☆85 · Updated last year
- ☆20 · Updated last week
- ☆43 · Updated last week
- C API for MLX ☆134 · Updated last week