GPT-4 Vision x Vimium = Autonomous Web Agent
⭐ 166 · Nov 16, 2023 · Updated 2 years ago
Alternatives and similar repositories for GPT-V-on-Web
Users that are interested in GPT-V-on-Web are comparing it to the libraries listed below.
- AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI · ⭐ 1,059 · Dec 9, 2024 · Updated last year
- GPT-4 Vision Chrome Extension · ⭐ 108 · Nov 12, 2023 · Updated 2 years ago
- Browse the web with GPT-4V and Vimium · ⭐ 2,654 · Sep 25, 2024 · Updated last year
- Example use cases for the GPT-4 Vision API · ⭐ 19 · Nov 26, 2023 · Updated 2 years ago
- Web Scraping with GPT-4 Vision API and Puppeteer · ⭐ 563 · Jan 31, 2024 · Updated 2 years ago
- Interact privately with your documents using the power of GPT, 100% privately, no data leaks · ⭐ 10 · May 22, 2023 · Updated 3 years ago
- A CLI speech recognition tool, using OpenAI Whisper, supports audio file transcription and near-realtime microphone input. · ⭐ 22 · Updated this week
- A simple "Be My Eyes" web app with a llama.cpp/llava backend · ⭐ 495 · Nov 28, 2023 · Updated 2 years ago
- Allows dictating anywhere in Windows using AutoHotKey and OpenAI's Whisper speech-to-text engine. · ⭐ 14 · Feb 21, 2024 · Updated 2 years ago
- ⭐ 15 · Jul 9, 2025 · Updated 11 months ago
- GPT-4V in Wonderland: LMMs as Smartphone Agents · ⭐ 134 · Jul 17, 2024 · Updated last year
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult… · ⭐ 845 · Feb 3, 2025 · Updated last year
- Code for paper "Language Versatilists vs. Specialists: An Empirical Revisiting on Multilingual Transfer Ability" · ⭐ 15 · Jun 13, 2023 · Updated 2 years ago
- Manage your ever-growing list of research papers · ⭐ 14 · Nov 19, 2023 · Updated 2 years ago
- PDF to Digital Form using GPT4 Vision API · ⭐ 17 · Apr 2, 2026 · Updated 2 months ago
- Globot is an agent that controls your browser using playwright and GPT-4V. · ⭐ 134 · Jan 4, 2024 · Updated 2 years ago
- OpenAI-Assistant API integration with Speech Recognition and Eleven Labs TTS. User can choose name, description, model of assistant and … · ⭐ 18 · Nov 7, 2023 · Updated 2 years ago
- Implementation of AutoRT: "AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents" · ⭐ 44 · Nov 11, 2024 · Updated last year
- Next.js application for chatting with PDF files, enhancing document interaction and productivity. Tailwind CSS for styling. · ⭐ 10 · Apr 6, 2024 · Updated 2 years ago
- Build modern UIs in Jupyter with Python · ⭐ 12 · Dec 28, 2022 · Updated 3 years ago
- ⭐ 22 · May 23, 2025 · Updated last year
- A chat implementation for FastHTML · ⭐ 12 · Sep 14, 2025 · Updated 8 months ago
- [arXiv 2023] Set-of-Mark Prompting for GPT-4V and LMMs · ⭐ 1,540 · Aug 19, 2024 · Updated last year
- Chrome extension that enables users to leverage the OpenAI Chat Completions endpoint on any YouTube video. · ⭐ 34 · Mar 23, 2024 · Updated 2 years ago
- ⭐ 20 · Oct 3, 2022 · Updated 3 years ago
- ⭐ 63 · Sep 23, 2024 · Updated last year
- ⭐ 20 · Mar 20, 2025 · Updated last year
- A Python-based chat application utilizing a Local LLM to generate complex thought chains for various use cases such as product developmen… · ⭐ 20 · Feb 18, 2026 · Updated 3 months ago
- Command your browser with GPT · ⭐ 421 · Feb 3, 2026 · Updated 4 months ago
- Course repository for the Spring 2023 COMP664 course "Deep Learning" at UNC · ⭐ 14 · Apr 17, 2023 · Updated 3 years ago
- Create browser automation as if you were teaching a human using GPT-4 Vision. · ⭐ 584 · Feb 19, 2024 · Updated 2 years ago
- [NeurIPS 2023 D&B] Code repository for InterCode benchmark https://arxiv.org/abs/2306.14898 · ⭐ 248 · May 5, 2024 · Updated 2 years ago
- localization · ⭐ 11 · Jan 20, 2019 · Updated 7 years ago
- Vision utilities for web interaction agents · ⭐ 1,763 · Nov 25, 2024 · Updated last year
- ⭐ 72 · Nov 16, 2023 · Updated 2 years ago
- The decentralized storage application for accelerating AI innovation · ⭐ 16 · Apr 9, 2024 · Updated 2 years ago
- Benchmarking Mobile Device Control Agents across Diverse Configurations (ICLR 2024 workshop GenAI4DM spotlight presentation; CoLLAs 2025) · ⭐ 35 · Jul 21, 2025 · Updated 10 months ago
- A semi- and fully autonomous AI assistant using ChatGPT 3.5 Turbo and GPT-4. · ⭐ 44 · May 17, 2023 · Updated 3 years ago
- stream-of-consciousness experience of an AI's thinking process, complete with creative tangents and unexpected connections. · ⭐ 14 · Jan 29, 2025 · Updated last year