tangledgroup / llama-cpp-wasm
WebAssembly (Wasm) Build and Bindings for llama.cpp
☆246 · Updated 8 months ago
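The listing gives no usage snippet, so here is a minimal sketch of what on-browser inference with these bindings can look like. The `LlamaCpp` class name, its callback-style constructor, and the `run()` options are recalled from the project's examples and should be treated as assumptions; check the repository's README for the current API. The model URL is a placeholder.

```js
// Minimal sketch of browser-side inference with llama-cpp-wasm.
// ASSUMPTION: the constructor signature and run() options follow the
// project's published examples and may differ between versions.
import { LlamaCpp } from "llama-cpp-wasm";

// Placeholder URL: any small GGUF model reachable from the page.
const modelUrl = "https://example.com/models/tinyllama-q4_0.gguf";

const app = new LlamaCpp(
  modelUrl,
  // onModelLoaded: start generating once the model is in memory.
  () => app.run({ prompt: "What is WebAssembly?", ctx_size: 2048 }),
  // onMessageChunk: stream each generated chunk of text into the page.
  (text) => { document.getElementById("out").textContent += text; },
  // onComplete: generation finished.
  () => console.log("done"),
);
```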
Alternatives and similar repositories for llama-cpp-wasm:
Users interested in llama-cpp-wasm are comparing it to the libraries listed below
- WebAssembly binding for llama.cpp - Enabling on-browser LLM inference ☆635 · Updated 2 weeks ago
- Vercel and web-llm template to run wasm models directly in the browser. ☆144 · Updated last year
- Run Large-Language Models (LLMs) directly in your browser! ☆189 · Updated 6 months ago
- TypeScript generator for llama.cpp Grammar directly from TypeScript interfaces ☆135 · Updated 8 months ago
- On-device LLM Inference Powered by X-Bit Quantization ☆224 · Updated last week
- JS tokenizer for LLaMA 3 and LLaMA 3.1 ☆105 · Updated 2 weeks ago
- Add local LLMs to your Web or Electron apps! Powered by Rust + WebGPU ☆102 · Updated last year
- A JavaScript library that brings vector search and RAG to your browser! ☆105 · Updated 7 months ago
- A cross-platform browser ML framework. ☆669 · Updated 4 months ago
- JS tokenizer for LLaMA 1 and 2 ☆351 · Updated 9 months ago
- LLM-based code completion engine ☆181 · Updated 2 months ago
- ML-powered speech synthesis directly in your browser ☆137 · Updated last month
- Simple repo that compiles and runs llama2.c on the Web ☆53 · Updated last year
- A client-side vector search library that can embed, store, search, and cache vectors. Works in the browser and Node. It outperforms OpenA… ☆191 · Updated 9 months ago
- llama.cpp fork with additional SOTA quants and improved performance ☆222 · Updated this week
- A multimodal, function-calling-powered LLM web UI. ☆215 · Updated 6 months ago
- A pure-Rust LLM inference engine (supporting any LLM-based MLLM such as Spark-TTS), powered by the Candle framework. ☆74 · Updated this week
- Open-source LLM UI, compatible with all local LLM providers. ☆173 · Updated 6 months ago
- Distributed inference for MLX LLMs ☆87 · Updated 7 months ago
- Port of Suno AI's Bark in C/C++ for fast inference ☆53 · Updated 11 months ago
- Train your own small bitnet model ☆65 · Updated 5 months ago
- Fast parallel LLM inference for MLX ☆174 · Updated 8 months ago
- LLM-powered lossless compression tool ☆274 · Updated 7 months ago
- In-browser LLM website generator ☆46 · Updated last month
- WebGPU LLM inference tuned by hand ☆149 · Updated last year
- Inference Llama 2 in one file of pure JavaScript (HTML) ☆33 · Updated 8 months ago
- Segment Anything 2, 100% in the browser (with WebGPU!) ☆123 · Updated 3 months ago
- EntityDB is an in-browser vector database wrapping IndexedDB and Transformers.js over WebAssembly ☆138 · Updated 2 months ago
- Visual Studio Code extension for WizardCoder ☆147 · Updated last year
- 1.58-bit LLaMA model ☆82 · Updated 11 months ago