ngxson / ggml-easy
Thin wrapper around GGML to make life easier
☆42 Updated 2 months ago
Alternatives and similar repositories for ggml-easy
Users who are interested in ggml-easy are comparing it to the libraries listed below.
- Python bindings for ggml ☆146 Updated last year
- Experiments with BitNet inference on CPU ☆55 Updated last year
- Simple high-throughput inference library ☆155 Updated 8 months ago
- GGML implementation of BERT model with Python bindings and quantization. ☆58 Updated last year
- Inference of Mamba models in pure C ☆196 Updated last year
- Efficient non-uniform quantization with GPTQ for GGUF ☆57 Updated 4 months ago
- Video+code lecture on building nanoGPT from scratch ☆68 Updated last year
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆200 Updated 3 months ago
- Local Qwen3 LLM inference. One easy-to-understand file of C source with no dependencies. ☆154 Updated 6 months ago
- Lightweight toolkit package to train and fine-tune 1.58bit Language models ☆106 Updated 7 months ago
- Course Project for COMP4471 on RWKV ☆17 Updated last year
- Profile your CoreML models directly from Python 🐍 ☆29 Updated 4 months ago
- cortex.llamacpp is a high-efficiency C++ inference engine for edge computing. It is a dynamic library that can be loaded by any server a… ☆41 Updated 6 months ago
- C API for MLX ☆159 Updated last week
- TTS support with GGML ☆215 Updated 3 months ago
- implementation of https://arxiv.org/pdf/2312.09299 ☆21 Updated last year
- Use safetensors with ONNX 🤗 ☆81 Updated this week
- Browse, search, and visualize ONNX models. ☆34 Updated 8 months ago
- High-performance FlashAttention-2 for AMD, Intel, and Apple GPUs. Drop-in replacement for PyTorch SDPA. Triton backend for ROCm (MI300X, … ☆134 Updated 2 weeks ago
- ☆46 Updated 3 months ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆59 Updated 2 months ago
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch ☆103 Updated last year
- Find out why your CoreML model isn't running on the Neural Engine! ☆30 Updated last year
- Yet Another (LLM) Web UI, made with Gemini ☆12 Updated last year
- ☆34 Updated 9 months ago
- ☆18 Updated last year
- FMS Model Optimizer is a framework for developing reduced precision neural network models. ☆20 Updated last week
- ☆62 Updated 6 months ago
- Nexusflow function call, tool use, and agent benchmarks. ☆30 Updated last year
- A fast RWKV Tokenizer written in Rust ☆54 Updated 5 months ago