ngxson / wllama
WebAssembly binding for llama.cpp - Enabling on-browser LLM inference
☆732 · Updated last month
Alternatives and similar repositories for wllama
Users interested in wllama are comparing it to the libraries listed below.
- WebAssembly (Wasm) Build and Bindings for llama.cpp ☆267 · Updated 10 months ago
- A cross-platform browser ML framework. ☆696 · Updated 6 months ago
- Stateful load balancer custom-tailored for llama.cpp 🏓🦙 ☆764 · Updated this week
- VS Code extension for LLM-assisted code/text completion ☆774 · Updated this week
- On-device LLM Inference Powered by X-Bit Quantization ☆241 · Updated 2 weeks ago
- 🕸️🦀 A WASM vector similarity search written in Rust ☆963 · Updated last year
- The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs), allowing users to chat with LLM … ☆566 · Updated 3 months ago
- FastMLX is a high-performance, production-ready API to host MLX models. ☆305 · Updated 2 months ago
- Chat with AI large language models running natively in your browser. Enjoy private, server-free, seamless AI conversations. ☆753 · Updated 3 weeks ago
- An application for running LLMs locally on your device, with your documents, facilitating detailed citations in generated responses. ☆592 · Updated 7 months ago
- Big & Small LLMs working together ☆814 · Updated this week
- Run Large Language Models (LLMs) 🚀 directly in your browser! ☆207 · Updated 8 months ago
- The easiest & fastest way to run customized and fine-tuned LLMs locally or on the edge ☆1,402 · Updated this week
- Replace OpenAI with Llama.cpp Automagically. ☆318 · Updated 11 months ago
- Apple MLX engine for LM Studio ☆564 · Updated 2 weeks ago
- Sidecar is the AI brains for the Aide editor and works alongside it, locally on your machine ☆558 · Updated 2 weeks ago
- A multi-platform desktop application to evaluate and compare LLM models, written in Rust and React. ☆740 · Updated last month
- MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX. ☆1,293 · Updated this week
- An extremely fast implementation of Whisper optimized for Apple Silicon using MLX. ☆706 · Updated last year
- Vercel and web-llm template to run wasm models directly in the browser. ☆149 · Updated last year
- EntityDB is an in-browser vector database wrapping IndexedDB and Transformers.js over WebAssembly ☆168 · Updated 3 weeks ago
- A JavaScript library that brings vector search and RAG to your browser! ☆119 · Updated 9 months ago
- Connect home devices into a powerful cluster to accelerate LLM inference. More devices mean faster inference. ☆2,074 · Updated last month
- Large-scale LLM inference engine ☆1,435 · Updated this week
- MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically designed for Apple Silicon (M-series) chips. I… ☆399 · Updated this week
- Minimal LLM inference in Rust ☆986 · Updated 7 months ago
- 🔥🔥 Kokoro in Rust (https://huggingface.co/hexgrad/Kokoro-82M). Insanely fast, real-time, high-quality TTS. ☆520 · Updated 3 weeks ago
- Run LLMs with MLX ☆877 · Updated this week
- Suno AI's Bark model in C/C++ for fast text-to-speech generation ☆817 · Updated 6 months ago
- Web-optimized vector database (written in Rust). ☆238 · Updated 3 months ago