tangledgroup / llama-cpp-wasm
WebAssembly (Wasm) Build and Bindings for llama.cpp
☆214 · Updated 3 months ago
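For orientation, below is a minimal sketch of what in-browser inference with a Wasm build of llama.cpp typically looks like. The `LlamaCpp` class name, import path, `run` method, and callback order are illustrative assumptions based on common binding designs, not the project's verified API; consult the repository's README for the exact interface.

```js
// Hypothetical usage sketch: class name, import path, and callback
// signatures are assumptions, not the verified llama-cpp-wasm API.
import { LlamaCpp } from "llama-cpp-wasm";

const output = document.getElementById("output");

// The model (a GGUF file) is fetched over HTTP and runs entirely client-side.
const app = new LlamaCpp(
  "https://example.com/models/model-q4_k_m.gguf",            // assumed model URL
  () => app.run({ prompt: "Hello, Wasm!", n_predict: 64 }),  // start once weights load (assumed method/params)
  (token) => { output.textContent += token; },               // streamed-token callback
  () => console.log("generation complete"),                  // completion callback
);
```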
Related projects
Alternatives and complementary repositories for llama-cpp-wasm
- WebAssembly binding for llama.cpp - Enabling in-browser LLM inference ☆441 · Updated 3 weeks ago
- Vercel and web-llm template to run Wasm models directly in the browser. ☆124 · Updated last year
- Run Large Language Models (LLMs) 🚀 directly in your browser! ☆167 · Updated 2 months ago
- TypeScript generator for llama.cpp grammars directly from TypeScript interfaces ☆131 · Updated 4 months ago
- Add local LLMs to your Web or Electron apps! Powered by Rust + WebGPU ☆102 · Updated last year
- SemanticFinder - frontend-only live semantic search with transformers.js ☆233 · Updated 2 months ago
- llama.cpp fork with additional SOTA quants and improved performance ☆93 · Updated this week
- JS tokenizer for LLaMA 3 and LLaMA 3.1 ☆91 · Updated 3 months ago
- Inference Llama 2 in one file of pure JavaScript (HTML) ☆30 · Updated 4 months ago
- Vector Storage is a vector database that enables semantic similarity searches on text documents in the browser's local storage. It uses O… ☆201 · Updated last year
- WebGPU LLM inference tuned by hand ☆147 · Updated last year
- JS tokenizer for LLaMA 1 and 2 ☆343 · Updated 4 months ago
- A client-side vector search library that can embed, store, search, and cache vectors. Works in the browser and Node. It outperforms OpenA… ☆170 · Updated 5 months ago
- Generate synthetic data using OpenAI, MistralAI, or AnthropicAI ☆221 · Updated 6 months ago
- A cross-platform browser ML framework. ☆625 · Updated this week
- GPU-accelerated client-side embeddings for vector search, RAG, etc. ☆63 · Updated 11 months ago
- LLM-based code completion engine ☆175 · Updated last year
- Infrastructure for AI code interpreting that's powering E2B. ☆221 · Updated this week
- Python bindings for ggml ☆132 · Updated 2 months ago
- A multimodal, function-calling-powered LLM web UI. ☆208 · Updated last month
- Web-optimized vector database (written in Rust). ☆189 · Updated 4 months ago
- 1.58-bit LLaMA model ☆79 · Updated 7 months ago
- LLM-powered lossless compression tool ☆252 · Updated 3 months ago
- FastMLX is a high-performance, production-ready API for hosting MLX models. ☆219 · Updated this week
- Fast parallel LLM inference for MLX ☆149 · Updated 4 months ago
- Simple repo that compiles and runs llama2.c on the Web ☆53 · Updated last year