WebAssembly (Wasm) Build and Bindings for llama.cpp
☆285 · Jul 23, 2024 · Updated last year
Alternatives and similar repositories for llama-cpp-wasm
Users that are interested in llama-cpp-wasm are comparing it to the libraries listed below
- WebAssembly binding for llama.cpp - Enabling on-browser LLM inference ☆1,003 · Dec 17, 2025 · Updated 2 months ago
- Cleanai (https://github.com/willmil11/cleanai) except I'm making it in C now. Fast and clean from the start this time :) ☆17 · Feb 5, 2026 · Updated 3 weeks ago
- ☆19 · Feb 7, 2024 · Updated 2 years ago
- Run AI models locally on your machine with node.js bindings for llama.cpp. Enforce a JSON schema on the model output on the generation le… ☆1,904 · Updated this week
- asynchronous/distributed speculative evaluation for llama3 ☆40 · Aug 8, 2024 · Updated last year
- Tensor library for machine learning ☆273 · Apr 23, 2023 · Updated 2 years ago
- GGML implementation of BERT model with Python bindings and quantization. ☆57 · Feb 19, 2024 · Updated 2 years ago
- Run Large-Language Models (LLMs) 🚀 directly in your browser! ☆226 · Sep 8, 2024 · Updated last year
- Simple Tool Caller for llama.cpp ☆11 · Aug 12, 2024 · Updated last year
- High-level, optionally asynchronous Rust bindings to llama.cpp ☆243 · Jun 5, 2024 · Updated last year
- React Native binding of llama.cpp ☆811 · Updated this week
- ☆12 · Jun 27, 2024 · Updated last year
- Inference of Mamba and Mamba2 models in pure C ☆197 · Jan 22, 2026 · Updated last month
- Demos for AI assistants using NLUX, Next.js, React, and Node.js ☆17 · Jun 24, 2024 · Updated last year
- Examples for using the SiLLM framework for training and running Large Language Models (LLMs) on Apple Silicon ☆16 · May 8, 2025 · Updated 9 months ago
- A JavaScript library (with TypeScript types) to parse metadata of GGML based GGUF files. ☆51 · Jul 30, 2024 · Updated last year
- A live multiplayer trivia game where users can bid for the subject of the next question ☆29 · Jan 9, 2026 · Updated last month
- Lightweight C inference for Qwen3 GGUF. Multiturn prefix caching & batch processing. ☆23 · Sep 1, 2025 · Updated 5 months ago
- Pure C++ implementation of several models for real-time chatting on your computer (CPU & GPU) ☆816 · Feb 21, 2026 · Updated last week
- High-performance In-browser LLM Inference Engine ☆17,398 · Feb 18, 2026 · Updated last week
- Minimalist stable-diffusion desktop application with only one executable file written with golang (no Python). ☆18 · Apr 16, 2025 · Updated 10 months ago
- Suno AI's Bark model in C/C++ for fast text-to-speech generation ☆857 · Nov 16, 2024 · Updated last year
- GGUF implementation in C as a library and a tools CLI program ☆310 · Aug 28, 2025 · Updated 6 months ago
- Client-side vector database implementation in TypeScript ☆12 · Jun 30, 2023 · Updated 2 years ago
- Vercel and web-llm template to run wasm models directly in the browser. ☆169 · Nov 21, 2023 · Updated 2 years ago
- Diffusion model (SD, Flux, Wan, Qwen Image, Z-Image, ...) inference in pure C/C++ ☆5,442 · Feb 19, 2026 · Updated last week
- Distributed LLM inference. Connect home devices into a powerful cluster to accelerate LLM inference. More devices means faster inference. ☆2,839 · Feb 10, 2026 · Updated 2 weeks ago
- A notebook interface that makes working with AI agents easier. ☆15 · Jul 12, 2025 · Updated 7 months ago
- ☆11 · Sep 18, 2023 · Updated 2 years ago
- The rag pipeline for optimizing dynamic data editing. ☆20 · Oct 30, 2025 · Updated 4 months ago
- Yet another `llama.cpp` Rust wrapper ☆12 · Jun 19, 2024 · Updated last year
- Implementation of YOLO (You Only Look Once) computer vision algorithm in a React UI, for the subject Intelligent Systems (ULL) ☆10 · Jan 27, 2019 · Updated 7 years ago
- a pseudo-repo for discussion on Unix-like software in JS+Wasm ... and also about *browser* Python, Lua, Tcl. ☆11 · Jan 25, 2023 · Updated 3 years ago
- ☆15 · Apr 9, 2025 · Updated 10 months ago
- ☆11 · Jun 25, 2024 · Updated last year
- Fast Python Vowpal Wabbit wrapper ☆13 · Mar 31, 2021 · Updated 4 years ago
- llama.cpp Rust bindings ☆414 · Jun 27, 2024 · Updated last year
- A super simple web interface to perform blind tests on LLM outputs. ☆29 · Mar 9, 2024 · Updated last year
- ☆49 · Mar 9, 2025 · Updated 11 months ago