👀🧠 GPT-4 Vision x 💪⌨️ Vimium = Autonomous Web Agent
★166 · Nov 16, 2023 · Updated 2 years ago
Alternatives and similar repositories for GPT-V-on-Web
Users that are interested in GPT-V-on-Web are comparing it to the libraries listed below.
- AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI ★1,061 · Dec 9, 2024 · Updated last year
- GPT-4 Vision Chrome Extension ★108 · Nov 12, 2023 · Updated 2 years ago
- Browse the web with GPT-4V and Vimium ★2,659 · Sep 25, 2024 · Updated last year
- Example use cases for the GPT-4 Vision API ★19 · Nov 26, 2023 · Updated 2 years ago
- [ICML 2024] Self-Infilling Code Generation ★18 · May 5, 2024 · Updated 2 years ago
- Web Scraping with GPT-4 Vision API and Puppeteer ★563 · Jan 31, 2024 · Updated 2 years ago
- Interact privately with your documents using the power of GPT, 100% privately, no data leaks ★10 · May 22, 2023 · Updated 2 years ago
- A CLI speech recognition tool, using OpenAI Whisper, supports audio file transcription and near-realtime microphone input. ★22 · Updated this week
- Allows dictating anywhere in Windows using AutoHotKey and OpenAI's Whisper speech-to-text engine. ★14 · Feb 21, 2024 · Updated 2 years ago
- GPT-4V in Wonderland: LMMs as Smartphone Agents ★134 · Jul 17, 2024 · Updated last year
- Allows issuing voice commands in Windows via AutoHotKey scripts generated by ChatGPT. ★14 · Jan 5, 2025 · Updated last year
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult… ★846 · Feb 3, 2025 · Updated last year
- Code for paper “Language Versatilists vs. Specialists: An Empirical Revisiting on Multilingual Transfer Ability” ★15 · Jun 13, 2023 · Updated 2 years ago
- David Attenborough stops you from slouching ★51 · Nov 15, 2023 · Updated 2 years ago
- Exporting YouTube videos using Whisper ★17 · Sep 27, 2022 · Updated 3 years ago
- Desktop AI Data Scraper ★155 · Oct 10, 2023 · Updated 2 years ago
- PDF to Digital Form using GPT4 Vision API ★17 · Apr 2, 2026 · Updated last month
- Globot is an agent that controls your browser using playwright and GPT-4V. ★134 · Jan 4, 2024 · Updated 2 years ago
- OpenAI-Assistant API integration with Speech Recognition and Eleven Labs TTS. User can choose name, description, model of assistant and … ★18 · Nov 7, 2023 · Updated 2 years ago
- Implementation of AutoRT: "AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents" ★43 · Nov 11, 2024 · Updated last year
- Build modern UIs in Jupyter with Python ★12 · Dec 28, 2022 · Updated 3 years ago
- ★22 · May 23, 2025 · Updated 11 months ago
- [arXiv 2023] Set-of-Mark Prompting for GPT-4V and LMMs ★1,533 · Aug 19, 2024 · Updated last year
- [NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web" -- the first LLM-based web agent and benchmark for generalist w… ★991 · Nov 5, 2025 · Updated 6 months ago
- Chrome extension that enables users to leverage the OpenAI Chat Completions endpoint on any YouTube video. ★34 · Mar 23, 2024 · Updated 2 years ago
- ★20 · Oct 3, 2022 · Updated 3 years ago
- Using multiple LLMs for ensemble Forecasting ★16 · Jan 17, 2024 · Updated 2 years ago
- A Python-based chat application utilizing a Local LLM to generate complex thought chains for various use cases such as product developmen… ★20 · Feb 18, 2026 · Updated 3 months ago
- Create browser automation as if you were teaching a human using GPT-4 Vision. ★584 · Feb 19, 2024 · Updated 2 years ago
- Summaries of machine learning papers ★12 · Aug 19, 2022 · Updated 3 years ago
- ★18 · Apr 3, 2023 · Updated 3 years ago
- Vision utilities for web interaction agents ★1,761 · Nov 25, 2024 · Updated last year
- ★72 · Nov 16, 2023 · Updated 2 years ago
- ★13 · Mar 5, 2025 · Updated last year
- The decentralized storage application for accelerating AI innovation ★16 · Apr 9, 2024 · Updated 2 years ago
- Benchmarking Mobile Device Control Agents across Diverse Configurations (ICLR 2024 workshop GenAI4DM spotlight presentation; CoLLAs 2025) ★35 · Jul 21, 2025 · Updated 9 months ago
- A semi- and fully autonomous AI assistant using ChatGPT 3.5 Turbo and GPT-4. ★44 · May 17, 2023 · Updated 3 years ago
- Stream-of-consciousness experience of an AI's thinking process, complete with creative tangents and unexpected connections. ★14 · Jan 29, 2025 · Updated last year
- A daemon that makes a desktop OS accessible to AI agents ★40 · May 29, 2025 · Updated 11 months ago