ngxson / ggml-easy
Thin wrapper around GGML to make life easier
☆40 Updated 4 months ago
Alternatives and similar repositories for ggml-easy
Users who are interested in ggml-easy are comparing it to the libraries listed below.
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆193 Updated last month
- Python bindings for ggml ☆146 Updated last year
- Experiments with BitNet inference on CPU ☆54 Updated last year
- Use safetensors with ONNX 🤗 ☆73 Updated 3 weeks ago
- Local Qwen3 LLM inference. One easy-to-understand file of C source with no dependencies. ☆140 Updated 3 months ago
- Video+code lecture on building nanoGPT from scratch ☆68 Updated last year
- Course Project for COMP4471 on RWKV ☆17 Updated last year
- GGML implementation of BERT model with Python bindings and quantization. ☆55 Updated last year
- ☆62 Updated 3 months ago
- Efficient non-uniform quantization with GPTQ for GGUF ☆52 Updated last month
- Inference of Mamba models in pure C ☆192 Updated last year
- Browse, search, and visualize ONNX models. ☆35 Updated 5 months ago
- Yet Another (LLM) Web UI, made with Gemini ☆12 Updated 10 months ago
- Simple high-throughput inference library ☆149 Updated 5 months ago
- Find out why your CoreML model isn't running on the Neural Engine! ☆26 Updated last year
- Transplants vocabulary between language models, enabling the creation of draft models for speculative decoding WITHOUT retraining. ☆43 Updated this week
- Implementation of https://arxiv.org/pdf/2312.09299 ☆21 Updated last year
- TTS support with GGML ☆184 Updated 3 weeks ago
- A Python package for serving LLMs on OpenAI-compatible API endpoints with prompt caching using MLX. ☆97 Updated 4 months ago
- Input your VRAM and RAM and the toolchain will produce a GGUF model tuned to your system within seconds — flexible model sizing and lowes… ☆62 Updated this week
- ☆25 Updated 10 months ago
- Profile your CoreML models directly from Python 🐍 ☆29 Updated last month
- Easy-to-use, high-performance knowledge distillation for LLMs ☆94 Updated 5 months ago
- Lightweight toolkit package to train and fine-tune 1.58-bit language models ☆95 Updated 5 months ago
- ☆17 Updated 10 months ago
- AnyModal is a flexible multimodal language model framework for PyTorch ☆102 Updated 10 months ago
- 🤗 Optimum ONNX: Export your model to ONNX and run inference with ONNX Runtime ☆70 Updated last week
- AirLLM 70B inference with single 4GB GPU ☆14 Updated 4 months ago
- High-throughput tensor loading for PyTorch ☆105 Updated this week
- ☆136 Updated last year