GPT-4 Vision x Vimium = Autonomous Web Agent
☆166 | Nov 16, 2023 | Updated 2 years ago
Alternatives and similar repositories for GPT-V-on-Web
Users that are interested in GPT-V-on-Web are comparing it to the libraries listed below.
- AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI ☆1,059 | Dec 9, 2024 | Updated last year
- GPT-4 Vision Chrome Extension ☆108 | Nov 12, 2023 | Updated 2 years ago
- Browse the web with GPT-4V and Vimium ☆2,652 | Sep 25, 2024 | Updated last year
- Example use cases for the GPT-4 Vision API ☆19 | Nov 26, 2023 | Updated 2 years ago
- Web Scraping with GPT-4 Vision API and Puppeteer ☆563 | Jan 31, 2024 | Updated 2 years ago
- Interact privately with your documents using the power of GPT, 100% privately, no data leaks ☆10 | May 22, 2023 | Updated 3 years ago
- A simple "Be My Eyes" web app with a llama.cpp/llava backend ☆496 | Nov 28, 2023 | Updated 2 years ago
- Anthropic MCP client for macOS ☆16 | Jan 5, 2025 | Updated last year
- GPT-4V in Wonderland: LMMs as Smartphone Agents ☆134 | Jul 17, 2024 | Updated last year
- Allows issuing voice commands in Windows via AutoHotKey scripts generated by ChatGPT. ☆14 | Jan 5, 2025 | Updated last year
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult… ☆849 | Feb 3, 2025 | Updated last year
- Exporting youtube videos using whisper ☆17 | Sep 27, 2022 | Updated 3 years ago
- Desktop AI Data Scraper ☆155 | Oct 10, 2023 | Updated 2 years ago
- PDF to Digital Form using GPT4 Vision API ☆17 | Apr 2, 2026 | Updated 2 months ago
- Globot is an agent that controls your browser using playwright and GPT-4V. ☆134 | Jan 4, 2024 | Updated 2 years ago
- Implementation of AutoRT: "AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents" ☆44 | Nov 11, 2024 | Updated last year
- ☆22 | May 23, 2025 | Updated last year
- A chat implementation for FastHTML ☆12 | Sep 14, 2025 | Updated 9 months ago
- [arXiv 2023] Set-of-Mark Prompting for GPT-4V and LMMs ☆1,544 | Aug 19, 2024 | Updated last year
- [NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web" -- the first LLM-based web agent and benchmark for generalist w… ☆1,003 | Nov 5, 2025 | Updated 7 months ago
- Object recognition with Pepper using a deep learning model ☆10 | Sep 16, 2021 | Updated 4 years ago
- A working Speech to Speech AI assistant that can interact with you, manage your system, and more! ☆14 | May 1, 2024 | Updated 2 years ago
- ☆20 | Oct 3, 2022 | Updated 3 years ago
- ☆63 | Sep 23, 2024 | Updated last year
- ☆20 | Mar 20, 2025 | Updated last year
- Using multiple LLMs for ensemble Forecasting ☆16 | Jan 17, 2024 | Updated 2 years ago
- A Python-based chat application utilizing a Local LLM to generate complex thought chains for various use cases such as product developmen… ☆20 | Feb 18, 2026 | Updated 4 months ago
- Command your browser with GPT ☆420 | Feb 3, 2026 | Updated 4 months ago
- Create browser automation as if you were teaching a human using GPT-4 Vision. ☆585 | Feb 19, 2024 | Updated 2 years ago
- [NeurIPS 2023 D&B] Code repository for InterCode benchmark https://arxiv.org/abs/2306.14898 ☆250 | May 5, 2024 | Updated 2 years ago
- Summaries of machine learning papers ☆12 | Aug 19, 2022 | Updated 3 years ago
- localization ☆11 | Jan 20, 2019 | Updated 7 years ago
- ☆18 | Apr 3, 2023 | Updated 3 years ago
- Vision utilities for web interaction agents ☆1,763 | Nov 25, 2024 | Updated last year
- ☆72 | Nov 16, 2023 | Updated 2 years ago
- ☆13 | Mar 5, 2025 | Updated last year
- Benchmarking Mobile Device Control Agents across Diverse Configurations (ICLR 2024 workshop GenAI4DM spotlight presentation; CoLLAs 2025) ☆35 | Jul 21, 2025 | Updated 11 months ago
- stream-of-consciousness experience of an AI's thinking process, complete with creative tangents and unexpected connections. ☆14 | Jan 29, 2025 | Updated last year
- A web-based tool that utilizes GPT-4's vision capabilities to analyze and describe system architecture diagrams, providing instant insigh… ☆17 | Nov 9, 2023 | Updated 2 years ago