A simple "Be My Eyes" web app with a llama.cpp/llava backend
☆493 · Nov 28, 2023 · Updated 2 years ago
Alternatives and similar repositories for llavavision
Users who are interested in llavavision are comparing it to the libraries listed below.
- llama.cpp with BakLLaVA model describes what it sees ☆379 · Nov 8, 2023 · Updated 2 years ago
- Demo python script app to interact with llama.cpp server using whisper API, microphone and webcam devices. ☆46 · Nov 6, 2023 · Updated 2 years ago
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ☆6,187 · Aug 10, 2024 · Updated last year
- Browse the web with GPT-4V and Vimium ☆2,667 · Sep 25, 2024 · Updated last year
- Easily take an entire YouTube playlist and turn it into high quality transcripts using Whisper. ☆658 · Feb 27, 2025 · Updated last year
- OpenCV+YOLO+LLAVA powered video surveillance system ☆785 · Oct 21, 2025 · Updated 4 months ago
- An AI assistant built with PHP, Solr and LLM backend of choice. Proof of concept mostly. ☆64 · Nov 5, 2023 · Updated 2 years ago
- Finetune llama2-70b and codellama on MacBook Air without quantization ☆450 · Mar 28, 2024 · Updated last year
- Lightweight inference library for ONNX files, written in C++. It can run Stable Diffusion XL 1.0 on a RPI Zero 2 (or in 298MB of RAM) but… ☆2,026 · Jan 20, 2026 · Updated last month
- An open source approach to locally record and enable searching everything you view on your Mac. ☆2,471 · May 30, 2024 · Updated last year
- Llama 2 Everywhere (L2E) ☆1,529 · Aug 27, 2025 · Updated 6 months ago
- An Open Source text-to-speech system built by inverting Whisper. ☆4,567 · Dec 14, 2025 · Updated 2 months ago
- 【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection ☆3,452 · Dec 3, 2024 · Updated last year
- LLaVA server (llama.cpp). ☆183 · Oct 20, 2023 · Updated 2 years ago
- TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones ☆1,307 · Feb 5, 2026 · Updated last month
- ☆718 · Mar 6, 2024 · Updated last year
- An extensible, easy-to-use, and portable diffusion web UI 👨‍🎨 ☆1,672 · Aug 18, 2023 · Updated 2 years ago
- ☆25 · Dec 22, 2023 · Updated 2 years ago
- LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills ☆763 · Feb 1, 2024 · Updated 2 years ago
- Neum AI is a best-in-class framework to manage the creation and synchronization of vector embeddings at large scale. ☆866 · Jan 15, 2024 · Updated 2 years ago
- 3D to Photo is an open-source package by Dabble, that combines threeJS and Stable diffusion to build a virtual photo studio for product p… ☆449 · Jan 10, 2024 · Updated 2 years ago
- Voice + Vision powered AI assistant that answers questions about any application, in context and in audio. ☆1,158 · Dec 21, 2023 · Updated 2 years ago
- Local realtime voice AI ☆2,434 · Nov 26, 2025 · Updated 3 months ago
- ☆3,372 · Feb 25, 2024 · Updated 2 years ago
- Agents Capable of Self-Editing Their Prompts / Python Code ☆803 · Mar 15, 2024 · Updated last year
- 👀🧠 GPT-4 Vision x 💪⌨️ Vimium = Autonomous Web Agent ☆168 · Nov 16, 2023 · Updated 2 years ago
- A RAG LLM co-pilot for browsing the web, powered by local LLMs ☆1,515 · Jan 26, 2025 · Updated last year
- 🔒 Enterprise-grade API gateway that helps you monitor and impose cost or rate limits per API key. Get fine-grained access control and mo… ☆1,162 · Jan 5, 2025 · Updated last year
- This project collects GPU benchmarks from various cloud providers and compares them to fixed per token costs. Use our tool for efficient … ☆224 · Dec 16, 2024 · Updated last year
- An LLM-based autonomous agent controlling real-world applications via RESTful APIs ☆1,390 · Jun 7, 2024 · Updated last year
- A voice chat app ☆1,194 · May 21, 2025 · Updated 9 months ago
- Generative AutoML for Tabular Data ☆446 · Feb 3, 2025 · Updated last year
- WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI. ☆1,644 · Jul 31, 2024 · Updated last year
- ☆1,279 · Oct 24, 2023 · Updated 2 years ago
- A fast inference library for running LLMs locally on modern consumer-class GPUs ☆4,444 · Dec 9, 2025 · Updated 2 months ago
- Vision utilities for web interaction agents 👀 ☆1,755 · Nov 25, 2024 · Updated last year
- ChatData 🔍 📖 brings RAG to real applications with FREE✨ knowledge bases. Now enjoy your chat with 6 million wikipedia pages and 2 milli… ☆177 · Nov 8, 2024 · Updated last year
- Port of MiniGPT4 in C++ (4bit, 5bit, 6bit, 8bit, 16bit CPU inference with GGML) ☆568 · Aug 8, 2023 · Updated 2 years ago
- JS tokenizer for LLaMA 1 and 2 ☆363 · Jun 27, 2024 · Updated last year