A simple "Be My Eyes" web app with a llama.cpp/llava backend
☆493 · Nov 28, 2023 · Updated 2 years ago
Alternatives and similar repositories for llavavision
Users that are interested in llavavision are comparing it to the libraries listed below.
- llama.cpp with BakLLaVA model describes what does it see ☆379 · Nov 8, 2023 · Updated 2 years ago
- Demo python script app to interact with llama.cpp server using whisper API, microphone and webcam devices. ☆47 · Nov 6, 2023 · Updated 2 years ago
- StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models ☆6,222 · Aug 10, 2024 · Updated last year
- LLaVA server (llama.cpp). ☆183 · Oct 20, 2023 · Updated 2 years ago
- Finetune llama2-70b and codellama on MacBook Air without quantization ☆450 · Mar 28, 2024 · Updated last year
- ☆719 · Mar 6, 2024 · Updated 2 years ago
- 👀🧠 GPT-4 Vision x 💪⌨️ Vimium = Autonomous Web Agent ☆168 · Nov 16, 2023 · Updated 2 years ago
- Browse the web with GPT-4V and Vimium ☆2,664 · Sep 25, 2024 · Updated last year
- Lightweight inference library for ONNX files, written in C++. It can run Stable Diffusion XL 1.0 on a RPI Zero 2 (or in 298MB of RAM) but… ☆2,034 · Jan 20, 2026 · Updated 2 months ago
- OpenCV+YOLO+LLAVA powered video surveillance system ☆788 · Oct 21, 2025 · Updated 5 months ago
- Easily take an entire YouTube playlist and turn it into high quality transcripts using Whisper. ☆660 · Feb 27, 2025 · Updated last year
- LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills ☆764 · Feb 1, 2024 · Updated 2 years ago
- 【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection ☆3,466 · Dec 3, 2024 · Updated last year
- Mobile web app for audio "push-to-talk" + TTS chat interface with OpenAI-like APIs ☆43 · Jan 11, 2024 · Updated 2 years ago
- An Open Source text-to-speech system built by inverting Whisper. ☆4,576 · Dec 14, 2025 · Updated 3 months ago
- An extensible, easy-to-use, and portable diffusion web UI 👨🎨 ☆1,673 · Aug 18, 2023 · Updated 2 years ago
- An open source approach to locally record and enable searching everything you view on your Mac. ☆2,473 · May 30, 2024 · Updated last year
- Llama 2 Everywhere (L2E) ☆1,529 · Aug 27, 2025 · Updated 6 months ago
- An AI assistant built with PHP, Solr and LLM backend of choice. Proof of concept mostly. ☆64 · Nov 5, 2023 · Updated 2 years ago
- TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones ☆1,310 · Feb 5, 2026 · Updated last month
- Neum AI is a best-in-class framework to manage the creation and synchronization of vector embeddings at large scale. ☆867 · Jan 15, 2024 · Updated 2 years ago
- 3D to Photo is an open-source package by Dabble, that combines threeJS and Stable diffusion to build a virtual photo studio for product p… ☆449 · Jan 10, 2024 · Updated 2 years ago
- Deepmark AI enables a unique testing environment for language models (LLM) assessment on task-specific metrics and on your own data so yo… ☆104 · Nov 24, 2023 · Updated 2 years ago
- Port of MiniGPT4 in C++ (4bit, 5bit, 6bit, 8bit, 16bit CPU inference with GGML) ☆570 · Aug 8, 2023 · Updated 2 years ago
- A voice chat app ☆1,199 · May 21, 2025 · Updated 10 months ago
- Vision utilities for web interaction agents 👀 ☆1,757 · Nov 25, 2024 · Updated last year
- ☆124 · Jan 1, 2024 · Updated 2 years ago
- WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide a seamless conversations with an AI. ☆1,645 · Jul 31, 2024 · Updated last year
- Local realtime voice AI ☆2,440 · Nov 26, 2025 · Updated 3 months ago
- Voice + Vision powered AI assistant that answers questions about any application, in context and in audio. ☆1,158 · Dec 21, 2023 · Updated 2 years ago
- ☆3,370 · Feb 25, 2024 · Updated 2 years ago
- ☆1,274 · Oct 24, 2023 · Updated 2 years ago
- iterate quickly with llama.cpp hot reloading. use the llama.cpp bindings with bun.sh ☆50 · Oct 30, 2023 · Updated 2 years ago
- ☆25 · Dec 22, 2023 · Updated 2 years ago
- A fast inference library for running LLMs locally on modern consumer-class GPUs ☆4,468 · Mar 4, 2026 · Updated 3 weeks ago
- Simple UI for LLM Model Finetuning ☆2,061 · Dec 21, 2023 · Updated 2 years ago
- An LLM-based autonomous agent controlling real-world applications via RESTful APIs ☆1,395 · Jun 7, 2024 · Updated last year
- A transformer-based network model for pitch detection ☆166 · Jul 29, 2025 · Updated 7 months ago
- Agents Capable of Self-Editing Their Prompts / Python Code ☆802 · Mar 15, 2024 · Updated 2 years ago