👀🧠 GPT-4 Vision x 💪⌨️ Vimium = Autonomous Web Agent
☆168 · Nov 16, 2023 · Updated 2 years ago
Alternatives and similar repositories for GPT-V-on-Web
Users that are interested in GPT-V-on-Web are comparing it to the libraries listed below.
- AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI ☆1,063 · Dec 9, 2024 · Updated last year
- GPT-4 Vision Chrome Extension ☆108 · Nov 12, 2023 · Updated 2 years ago
- Browse the web with GPT-4V and Vimium ☆2,664 · Sep 25, 2024 · Updated last year
- Example use cases for the GPT-4 Vision API ☆19 · Nov 26, 2023 · Updated 2 years ago
- [ICML 2024] Self-Infilling Code Generation ☆18 · May 5, 2024 · Updated last year
- Web Scraping with GPT-4 Vision API and Puppeteer ☆562 · Jan 31, 2024 · Updated 2 years ago
- Interact privately with your documents using the power of GPT, 100% privately, no data leaks ☆10 · May 22, 2023 · Updated 2 years ago
- A CLI speech recognition tool, using OpenAI Whisper, supports audio file transcription and near-realtime microphone input. ☆22 · Mar 14, 2026 · Updated 2 weeks ago
- A simple "Be My Eyes" web app with a llama.cpp/llava backend ☆493 · Nov 28, 2023 · Updated 2 years ago
- GPT-4V in Wonderland: LMMs as Smartphone Agents ☆134 · Jul 17, 2024 · Updated last year
- Allows issuing voice commands in Windows via AutoHotKey scripts generated by ChatGPT. ☆14 · Jan 5, 2025 · Updated last year
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult… ☆839 · Feb 3, 2025 · Updated last year
- Manage your ever-growing list of research papers ☆13 · Nov 19, 2023 · Updated 2 years ago
- Exporting youtube videos using whisper ☆17 · Sep 27, 2022 · Updated 3 years ago
- PDF to Digital Form using GPT4 Vision API ☆17 · May 17, 2025 · Updated 10 months ago
- Globot is an agent that controls your browser using playwright and GPT-4V. ☆134 · Jan 4, 2024 · Updated 2 years ago
- OpenAI-Assistant API integration with Speech Recognition and Eleven Labs TTS. User can choose name, description, model of assistant and … ☆18 · Nov 7, 2023 · Updated 2 years ago
- Implementation of AutoRT: "AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents" ☆43 · Nov 11, 2024 · Updated last year
- ☆22 · May 23, 2025 · Updated 10 months ago
- A chat implementation for FastHTML ☆12 · Sep 14, 2025 · Updated 6 months ago
- [arXiv 2023] Set-of-Mark Prompting for GPT-4V and LMMs ☆1,522 · Aug 19, 2024 · Updated last year
- [NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web" -- the first LLM-based web agent and benchmark for generalist w… ☆963 · Nov 5, 2025 · Updated 4 months ago
- Chrome extension that enables users to leverage the OpenAI Chat Completions endpoint on any YouTube video. ☆34 · Mar 23, 2024 · Updated 2 years ago
- ☆63 · Sep 23, 2024 · Updated last year
- Vision utilities for web interaction agents 👀 ☆1,758 · Nov 25, 2024 · Updated last year
- ☆17 · Jun 16, 2025 · Updated 9 months ago
- Command your browser with GPT ☆422 · Feb 3, 2026 · Updated last month
- Create browser automation as if you were teaching a human using GPT-4 Vision. ☆586 · Feb 19, 2024 · Updated 2 years ago
- [NeurIPS 2023 D&B] Code repository for InterCode benchmark https://arxiv.org/abs/2306.14898 ☆243 · May 5, 2024 · Updated last year
- localization ☆11 · Jan 20, 2019 · Updated 7 years ago
- ☆73 · Nov 16, 2023 · Updated 2 years ago
- The decentralized storage application for accelerating AI innovation ☆17 · Apr 9, 2024 · Updated last year
- ☆13 · Mar 5, 2025 · Updated last year
- Chrome Extension to connect ChatGPT with gmail, and write reply email for incoming emails ☆10 · Feb 4, 2023 · Updated 3 years ago
- Benchmarking Mobile Device Control Agents across Diverse Configurations (ICLR 2024 workshop GenAI4DM spotlight presentation; CoLLAs 2025) ☆35 · Jul 21, 2025 · Updated 8 months ago
- A semi and fully autonomous AI assistant using ChatGPT 3.5 turbo and gpt4. ☆44 · May 17, 2023 · Updated 2 years ago
- stream-of-consciousness experience of an AI's thinking process, complete with creative tangents and unexpected connections. ☆14 · Jan 29, 2025 · Updated last year
- Website for the Open Interpreter project ☆32 · Mar 22, 2024 · Updated 2 years ago
- Evaluate if a task requires human intervention ☆16 · Jan 1, 2025 · Updated last year