Jiayi-Pan / GPT-V-on-Web
👀🧠 GPT-4 Vision x 💪⌨️ Vimium = Autonomous Web Agent
★168 · Nov 16, 2023 · Updated 2 years ago
Alternatives and similar repositories for GPT-V-on-Web
Users who are interested in GPT-V-on-Web are comparing it to the repositories listed below.
- AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI · ★1,066 · Dec 9, 2024 · Updated last year
- Browse the web with GPT-4V and Vimium · ★2,668 · Sep 25, 2024 · Updated last year
- GPT-4 Vision Chrome Extension · ★108 · Nov 12, 2023 · Updated 2 years ago
- A web-based tool that utilizes GPT-4's vision capabilities to analyze and describe system architecture diagrams, providing instant insigh… · ★16 · Nov 9, 2023 · Updated 2 years ago
- Desktop AI Data Scraper · ★156 · Oct 10, 2023 · Updated 2 years ago
- Web Scraping with GPT-4 Vision API and Puppeteer · ★563 · Jan 31, 2024 · Updated 2 years ago
- Website for the Open Interpreter project · ★32 · Mar 22, 2024 · Updated last year
- Interact privately with your documents using the power of GPT, 100% privately, no data leaks · ★10 · May 22, 2023 · Updated 2 years ago
- A chat implementation for FastHTML · ★11 · Sep 14, 2025 · Updated 5 months ago
- Contains the model patches and the eval logs from the passing swe-bench-lite run. · ★10 · Jun 28, 2024 · Updated last year
- This AI Agent retrieves the latest news articles based on a multi keyword using the Serp API. It processes the results and returns struct… · ★11 · Jan 31, 2025 · Updated last year
- Globot is an agent that controls your browser using playwright and GPT-4V. · ★134 · Jan 4, 2024 · Updated 2 years ago
- A simple "Be My Eyes" web app with a llama.cpp/llava backend · ★492 · Nov 28, 2023 · Updated 2 years ago
- Local & private voice controlled notepad using whisper.cpp · ★26 · Jan 21, 2024 · Updated 2 years ago
- ★12 · Jul 2, 2024 · Updated last year
- Various agents from all of the top agent frameworks to integrate into swarms! Langchain, Griptape, CrewAI, and more! · ★18 · Dec 22, 2025 · Updated last month
- An unofficial implementation of Tensor4D with support for the D-NeRF dataset · ★13 · Nov 8, 2023 · Updated 2 years ago
- Spacedrive native dependencies · ★13 · Apr 8, 2025 · Updated 10 months ago
- Stream-of-consciousness experience of an AI's thinking process, complete with creative tangents and unexpected connections. · ★14 · Jan 29, 2025 · Updated last year
- ★18 · Apr 3, 2023 · Updated 2 years ago
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult… · ★822 · Feb 3, 2025 · Updated last year
- Allows dictating anywhere in Windows using AutoHotKey and OpenAI's Whisper speech-to-text engine. · ★13 · Feb 21, 2024 · Updated last year
- The decentralized storage application for accelerating AI innovation · ★17 · Apr 9, 2024 · Updated last year
- Markdown to ANSI in TypeScript based on Micro-Mark, with support for URLs, tables, lists and more. · ★35 · Jan 10, 2026 · Updated last month
- Anthropic MCP client for macOS · ★16 · Jan 5, 2025 · Updated last year
- ★14 · Feb 17, 2024 · Updated last year
- Official Code for GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation · ★144 · Oct 30, 2024 · Updated last year
- [NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web" -- the first LLM-based web agent and benchmark for generalist w… · ★946 · Nov 5, 2025 · Updated 3 months ago
- pam-authramp | The AuthRamp PAM module provides an account lockout mechanism based on the number of authentication failures. · ★36 · Jul 27, 2024 · Updated last year
- ★63 · Sep 23, 2024 · Updated last year
- Chrome extension that enables users to leverage the OpenAI Chat Completions endpoint on any YouTube video. · ★34 · Mar 23, 2024 · Updated last year
- Allows issuing voice commands in Windows via AutoHotKey scripts generated by ChatGPT. · ★14 · Jan 5, 2025 · Updated last year
- A Robo Lawyer Slack bot, powered by ChatGPT · ★19 · Apr 4, 2023 · Updated 2 years ago
- A collection of utilities for FastHTML projects. · ★14 · Oct 23, 2024 · Updated last year
- ★13 · Jul 18, 2023 · Updated 2 years ago
- Ask The Code is a demo project that uses Azure OpenAI to talk with your source code. · ★12 · May 17, 2024 · Updated last year
- A2A MCP Server is a lightweight Python bridge that lets Claude Desktop or any MCP client talk to A2A agents. It provides three tools: reg… · ★21 · May 4, 2025 · Updated 9 months ago
- YoutubeGPT is a web application powered by OpenAI's Whisper model for speech recognition and GPT-3 for text summarization. It extracts tr… · ★17 · May 17, 2023 · Updated 2 years ago
- My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't rel… · ★12 · Jan 29, 2024 · Updated 2 years ago