vitoplantamura / OnnxStream
Lightweight inference library for ONNX files, written in C++. It can run Stable Diffusion XL 1.0 on a Raspberry Pi Zero 2 (or in 298 MB of RAM), as well as Mistral 7B on desktops and servers. ARM, x86, WASM, and RISC-V are supported. Accelerated by XNNPACK.
☆1,943 · Updated 2 weeks ago
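For context, here is a minimal sketch of what loading and running an ONNX model with a small C++ inference library of this kind might look like. The `onnxstream.hpp` header, the `Model` and `Tensor` types, and every member name below are assumptions made for illustration only, not OnnxStream's documented API:

```cpp
// Hypothetical usage sketch: all identifiers below (header, types, members)
// are assumed for illustration and do NOT reflect OnnxStream's real API.
#include <cstdio>
#include <utility>
#include <vector>
#include "onnxstream.hpp" // assumed header name

int main()
{
    onnxstream::Model model;            // assumed type
    model.read_file("model/");          // assumed: load the converted model

    // Dummy NCHW float input, e.g. a 224x224 RGB image.
    std::vector<float> input(1 * 3 * 224 * 224, 0.0f);

    onnxstream::Tensor t;               // assumed type
    t.name = "input";                   // assumed: ONNX graph input name
    t.shape = {1, 3, 224, 224};         // assumed: input shape
    t.data = std::move(input);          // assumed: attach the buffer
    model.push_tensor(std::move(t));    // assumed: bind the input tensor

    model.run();                        // assumed: execute the graph on the CPU

    // assumed: outputs exposed as tensors after run()
    for (const auto& out : model.outputs())
        std::printf("output '%s': %zu values\n", out.name.c_str(), out.data.size());

    return 0;
}
```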
Alternatives and similar repositories for OnnxStream
Users interested in OnnxStream are comparing it to the libraries listed below.
- Llama 2 Everywhere (L2E) ☆1,517 · Updated 4 months ago
- Stable Diffusion and Flux in pure C/C++ ☆4,112 · Updated 2 months ago
- ☆1,273 · Updated last year
- Port of MiniGPT4 in C++ (4bit, 5bit, 6bit, 8bit, 16bit CPU inference with GGML) ☆567 · Updated last year
- This repository contains a pure C++ ONNX implementation of multiple offline AI models, such as StableDiffusion (1.5 and XL), ControlNet, … ☆614 · Updated last year
- Suno AI's Bark model in C/C++ for fast text-to-speech generation ☆817 · Updated 6 months ago
- C++ implementation for BLOOM ☆809 · Updated 2 years ago
- Bringing stable diffusion models to web browsers. Everything runs inside the browser with no server support. ☆3,660 · Updated last year
- ☆1,025 · Updated last year
- INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model ☆1,523 · Updated 2 months ago
- Fork of Facebook's LLaMA model to run on CPU ☆772 · Updated 2 years ago
- Fast stable diffusion on CPU ☆1,700 · Updated 2 weeks ago
- Simple UI for LLM Model Finetuning ☆2,060 · Updated last year
- CLIP inference in plain C/C++ with no extra dependencies ☆498 · Updated 9 months ago
- An extensible, easy-to-use, and portable diffusion web UI 👨‍🎨 ☆1,669 · Updated last year
- Python bindings for the Transformer models implemented in C/C++ using the GGML library. ☆1,864 · Updated last year
- ggml implementation of BERT ☆491 · Updated last year
- Quantized inference code for LLaMA models ☆1,048 · Updated 2 years ago
- A more memory-efficient rewrite of the HF transformers implementation of Llama for use with quantized weights. ☆2,874 · Updated last year
- A fast inference library for running LLMs locally on modern consumer-class GPUs ☆4,187 · Updated 2 weeks ago
- How to run Stable Diffusion on Raspberry Pi 4 ☆89 · Updated 2 years ago
- Tiny Dream - An embedded, header-only Stable Diffusion C++ implementation ☆261 · Updated last year
- SoTA Transformers with C-backend for fast inference on your CPU. ☆308 · Updated last year
- Tensor computation with WebGPU acceleration ☆618 · Updated 10 months ago
- A diffusion model to colorize black and white images ☆775 · Updated last year
- Inference code and configs for the ReplitLM model family ☆971 · Updated last year
- Reference implementation of the Transformer architecture optimized for Apple Neural Engine (ANE) ☆2,634 · Updated 2 years ago
- Tensor library for machine learning ☆12,591 · Updated last week
- Use Code Llama with Visual Studio Code and the Continue extension. A local LLM alternative to GitHub Copilot. ☆571 · Updated 10 months ago
- C++ implementation for 💫StarCoder ☆452 · Updated last year