ngxson / wllama
WebAssembly binding for llama.cpp - Enabling on-browser LLM inference
⭐785 · Updated last week
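wllama is distributed as an npm package. A minimal browser-side completion might look like the sketch below, based on the project's documented `Wllama` API; the wasm asset paths, model URL, and sampling options here are illustrative placeholders, not the project's recommended settings.

```ts
import { Wllama } from '@wllama/wllama';

// Map the wasm binaries bundled with the package to URLs your bundler serves.
// The exact asset list depends on the package version; these paths are assumptions.
const CONFIG_PATHS = {
  'single-thread/wllama.wasm': '/esm/single-thread/wllama.wasm',
  'multi-thread/wllama.wasm': '/esm/multi-thread/wllama.wasm',
};

const wllama = new Wllama(CONFIG_PATHS);

// Any GGUF model reachable over HTTP should work; this URL is a placeholder.
await wllama.loadModelFromUrl(
  'https://huggingface.co/ggml-org/models/resolve/main/tinyllamas/stories260K.gguf',
);

// Generate a short completion entirely in the browser.
const completion = await wllama.createCompletion('Once upon a time,', {
  nPredict: 50,
  sampling: { temp: 0.7, top_k: 40 },
});
console.log(completion);
```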
Alternatives and similar repositories for wllama
Users interested in wllama are comparing it to the repositories listed below.
- WebAssembly (Wasm) Build and Bindings for llama.cpp (⭐273 · Updated last year)
- Stateful load balancer custom-tailored for llama.cpp 🏓🦙 (⭐800 · Updated this week)
- Vercel and web-llm template to run wasm models directly in the browser. (⭐160 · Updated last year)
- EntityDB is an in-browser vector database wrapping IndexedDB and Transformers.js over WebAssembly (⭐192 · Updated 2 months ago)
- VS Code extension for LLM-assisted code/text completion (⭐873 · Updated this week)
- Chat with AI large language models running natively in your browser. Enjoy private, server-free, seamless AI conversations. (⭐807 · Updated 2 months ago)
- The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs), allowing users to chat with LLM… (⭐579 · Updated 5 months ago)
- Apple MLX engine for LM Studio (⭐708 · Updated 2 weeks ago)
- ⭐286 · Updated this week
- Big & Small LLMs working together (⭐1,088 · Updated this week)
- FastMLX is a high-performance, production-ready API to host MLX models. (⭐319 · Updated 4 months ago)
- 🕸️🦀 A WASM vector similarity search written in Rust (⭐986 · Updated last year)
- On-device LLM Inference Powered by X-Bit Quantization (⭐260 · Updated last week)
- Run Large Language Models (LLMs) directly in your browser! (⭐212 · Updated 10 months ago)
- MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX. (⭐1,551 · Updated last week)
- A collection of 🤗 Transformers.js demos and example applications; see the sketch after this list (⭐1,684 · Updated last week)
- Suno AI's Bark model in C/C++ for fast text-to-speech generation (⭐834 · Updated 8 months ago)
- Vectra is a local vector database for Node.js with features similar to Pinecone but built using local files. (⭐500 · Updated 2 months ago)
- An application for running LLMs locally on your device, with your documents, facilitating detailed citations in generated responses. (⭐603 · Updated 9 months ago)
- The easiest & fastest way to run customized and fine-tuned LLMs locally or on the edge (⭐1,466 · Updated last week)
- An extremely fast implementation of Whisper optimized for Apple Silicon using MLX. (⭐755 · Updated last year)
- Pure C++ implementation of several models for real-time chatting on your computer (CPU & GPU) (⭐665 · Updated this week)
- SemanticFinder: frontend-only live semantic search with transformers.js (⭐287 · Updated 4 months ago)
- Large Language Model (LLM) applications and tools running on Apple Silicon in real time with Apple MLX. (⭐451 · Updated 6 months ago)
- TypeScript generator for llama.cpp grammars directly from TypeScript interfaces (⭐139 · Updated last year)
- LLM-based code completion engine (⭐193 · Updated 6 months ago)
- JS tokenizer for LLaMA 3 and LLaMA 3.1 (⭐114 · Updated 4 months ago)
- MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically designed for Apple Silicon (M-series) chips. I… (⭐454 · Updated 3 weeks ago)
- JS tokenizer for LLaMA 1 and 2 (⭐355 · Updated last year)
- Web-optimized vector database (written in Rust). (⭐249 · Updated 5 months ago)
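For comparison with the Transformers.js demo collection above, a minimal in-browser embedding pipeline could look like the following sketch. It assumes the `@huggingface/transformers` npm package and the `Xenova/all-MiniLM-L6-v2` model; both are assumptions here, so check the demo collection for current usage.

```ts
import { pipeline } from '@huggingface/transformers';

// Load a small embedding model; weights are fetched over HTTP and cached by the browser.
// The model id is an assumption; other ONNX-converted embedding models should also work.
const extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');

// Mean-pool and L2-normalize to get one embedding vector per input string.
const embeddings = await extractor(
  ['on-browser LLM inference', 'in-browser vector search'],
  { pooling: 'mean', normalize: true },
);

console.log(embeddings.dims); // e.g. [2, 384] for this model
```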