GPT-4 Vision x Vimium = Autonomous Web Agent
★165 · Nov 16, 2023 · Updated 2 years ago
Alternatives and similar repositories for GPT-V-on-Web
Users that are interested in GPT-V-on-Web are comparing it to the libraries listed below.
- GPT-4 Vision Chrome Extension · ★108 · Nov 12, 2023 · Updated 2 years ago
- Browse the web with GPT-4V and Vimium · ★2,662 · Sep 25, 2024 · Updated last year
- Example use cases for the GPT-4 Vision API · ★19 · Nov 26, 2023 · Updated 2 years ago
- [ICML 2024] Self-Infilling Code Generation · ★18 · May 5, 2024 · Updated last year
- Web Scraping with GPT-4 Vision API and Puppeteer · ★560 · Jan 31, 2024 · Updated 2 years ago
- Interact privately with your documents using the power of GPT, 100% privately, no data leaks · ★10 · May 22, 2023 · Updated 2 years ago
- A CLI speech recognition tool, using OpenAI Whisper, supports audio file transcription and near-realtime microphone input. · ★22 · Updated this week
- A simple "Be My Eyes" web app with a llama.cpp/llava backend · ★494 · Nov 28, 2023 · Updated 2 years ago
- ★15 · Feb 17, 2024 · Updated 2 years ago
- Allows dictating anywhere in Windows using AutoHotKey and OpenAI's Whisper speech-to-text engine. · ★13 · Feb 21, 2024 · Updated 2 years ago
- Anthropic MCP client for macOS · ★16 · Jan 5, 2025 · Updated last year
- GPT-4V in Wonderland: LMMs as Smartphone Agents · ★134 · Jul 17, 2024 · Updated last year
- Desktop AI Data Scraper · ★155 · Oct 10, 2023 · Updated 2 years ago
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult… · ★844 · Feb 3, 2025 · Updated last year
- Code for paper "Language Versatilists vs. Specialists: An Empirical Revisiting on Multilingual Transfer Ability" · ★15 · Jun 13, 2023 · Updated 2 years ago
- David Attenborough stops you from slouching · ★51 · Nov 15, 2023 · Updated 2 years ago
- Exporting youtube videos using whisper · ★17 · Sep 27, 2022 · Updated 3 years ago
- PDF to Digital Form using GPT4 Vision API · ★17 · Apr 2, 2026 · Updated 2 weeks ago
- Globot is an agent that controls your browser using playwright and GPT-4V. · ★134 · Jan 4, 2024 · Updated 2 years ago
- OpenAI-Assistant API integration with Speech Recognition and Eleven Labs TTS. User can choose name, description, model of assistant and … · ★18 · Nov 7, 2023 · Updated 2 years ago
- Implementation of AutoRT: "AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents" · ★43 · Nov 11, 2024 · Updated last year
- Next.js application for chatting with PDF files, enhancing document interaction and productivity. Tailwind CSS for styling. · ★10 · Apr 6, 2024 · Updated 2 years ago
- Build modern UIs in Jupyter with Python · ★12 · Dec 28, 2022 · Updated 3 years ago
- [arXiv 2023] Set-of-Mark Prompting for GPT-4V and LMMs · ★1,525 · Aug 19, 2024 · Updated last year
- [NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web" -- the first LLM-based web agent and benchmark for generalist w… · ★975 · Nov 5, 2025 · Updated 5 months ago
- Chrome extension that enables users to leverage the OpenAI Chat Completions endpoint on any YouTube video. · ★34 · Mar 23, 2024 · Updated 2 years ago
- A working Speech to Speech AI assistant that can interact with you, manage your system, and more! · ★14 · May 1, 2024 · Updated last year
- ★20 · Oct 3, 2022 · Updated 3 years ago
- ★63 · Sep 23, 2024 · Updated last year
- A Python-based chat application utilizing a Local LLM to generate complex thought chains for various use cases such as product developmen… · ★20 · Feb 18, 2026 · Updated 2 months ago
- Using multiple LLMs for ensemble Forecasting · ★16 · Jan 17, 2024 · Updated 2 years ago
- Vision utilities for web interaction agents · ★1,759 · Nov 25, 2024 · Updated last year
- Command your browser with GPT · ★422 · Feb 3, 2026 · Updated 2 months ago
- Create browser automation as if you were teaching a human using GPT-4 Vision. · ★586 · Feb 19, 2024 · Updated 2 years ago
- [NeurIPS 2023 D&B] Code repository for InterCode benchmark https://arxiv.org/abs/2306.14898 · ★246 · May 5, 2024 · Updated last year
- ★18 · Apr 3, 2023 · Updated 3 years ago
- ★72 · Nov 16, 2023 · Updated 2 years ago
- The decentralized storage application for accelerating AI innovation · ★15 · Apr 9, 2024 · Updated 2 years ago
- Benchmarking Mobile Device Control Agents across Diverse Configurations (ICLR 2024 workshop GenAI4DM spotlight presentation; CoLLAs 2025) · ★35 · Jul 21, 2025 · Updated 8 months ago