An Illusion of Progress? Assessing the Current State of Web Agents
☆147 Jan 2, 2026 Updated 2 months ago
Alternatives and similar repositories for Online-Mind2Web
Users who are interested in Online-Mind2Web are comparing it to the repositories listed below
- [NeurIPS'25 D&B] Mind2Web-2 Benchmark: Evaluating Agentic Search with Agent-as-a-Judge ☆102 Feb 28, 2026 Updated last week
- [ACL'25 (Findings)] Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents ☆26 Feb 17, 2026 Updated 3 weeks ago
- [NAACL'25] "Revealing the Barriers of Language Agents in Planning" ☆13 Jun 22, 2025 Updated 8 months ago
- Building a comprehensive and handy list of papers for GUI agents ☆642 Oct 27, 2025 Updated 4 months ago
- ☆32 Aug 17, 2025 Updated 6 months ago
- [ICLR2025 Spotlight] Agent Trajectory Synthesis via Guiding Replay with Web Tutorials ☆53 Feb 21, 2025 Updated last year
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator" ☆54 Feb 23, 2024 Updated 2 years ago
- [ICLR'25 Oral] UGround: Universal GUI Visual Grounding for GUI Agents ☆302 Jul 18, 2025 Updated 7 months ago
- ☆18 Jan 3, 2025 Updated last year
- [ACL2025 Findings] Benchmarking Multihop Multimodal Internet Agents ☆48 Feb 27, 2025 Updated last year
- Code repository for the AISTATS 2021 paper "Towards Understanding the Optimal Behaviors of Deep Active Learning Algorithms" ☆15 Mar 20, 2021 Updated 4 years ago
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories ☆40 Aug 7, 2025 Updated 7 months ago
- This is the repository for paper EscapeBench: Pushing Language Models to Think Outside the Box ☆18 Dec 19, 2024 Updated last year
- Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024) ☆37 Dec 29, 2024 Updated last year
- [ACL 2025] GUI-explorer: Autonomous Exploration and Mining of Transition-aware Knowledge for GUI Agent ☆59 May 28, 2025 Updated 9 months ago
- 🌎💪 BrowserGym, a Gym environment for web task automation ☆1,140 Feb 10, 2026 Updated 3 weeks ago
- [TMLR'25] "Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents" ☆95 Oct 5, 2025 Updated 5 months ago
- Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents" ☆1,372 Nov 26, 2025 Updated 3 months ago
- ☆25 May 28, 2025 Updated 9 months ago
- ☆178 Oct 31, 2025 Updated 4 months ago
- True Few-Shot BioIE: Benchmarking GPT-3 In-Context and Small PLM Fine-Tuning ☆12 Jul 6, 2022 Updated 3 years ago
- [CVPR 2025] GUI-Xplore: Empowering Generalizable GUI Agents with One Exploration ☆20 Mar 21, 2025 Updated 11 months ago
- [CVPR 2025] Code for "Notes-guided MLLM Reasoning: Enhancing MLLM with Knowledge and Visual Notes for Visual Question Answering". ☆20 Jun 16, 2025 Updated 8 months ago
- Towards Large Multimodal Models as Visual Foundation Agents ☆256 Apr 24, 2025 Updated 10 months ago
- [ICLR'25] ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery ☆125 Updated this week
- WebLINX is a benchmark for building web navigation agents with conversational capabilities ☆160 Feb 11, 2025 Updated last year
- Continual Memorization of Factoids in Large Language Models ☆12 Nov 20, 2024 Updated last year
- Code for the paper "Trust the PRoC3S: Solving Long-Horizon Robotics Problems with LLMs and Constraint Satisfaction" presented at CoRL 202… ☆31 Nov 18, 2024 Updated last year
- [ICLR'25] "Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers" ☆40 Mar 31, 2025 Updated 11 months ago
- ☆37 May 28, 2025 Updated 9 months ago
- A Declarative Language for Expressing Partial World Knowledge to Reinforcement Learning Agents ☆16 Jan 19, 2024 Updated 2 years ago
- Schedule of 2018 university summer camps in computer science for recommended (exam-exempt) graduate admissions ☆12 May 14, 2018 Updated 7 years ago
- Nano Banana Studio: AI-Powered Marketing Asset Creator with Real-Time Brand Enhancement ☆39 Sep 10, 2025 Updated 5 months ago
- The model, data and code for the visual GUI Agent SeeClick ☆469 Jul 13, 2025 Updated 7 months ago
- [CVPR 2025] DocLayLLM: An Efficient Multi-modal Extension of Large Language Models for Text-rich Document Understanding ☆27 Dec 18, 2025 Updated 2 months ago
- A repository for a universal I/O spec for TAMP, along with scripts to convert from popular specs to our spec ☆13 Jun 25, 2025 Updated 8 months ago
- Code for the CIKM'23 paper "A Retrieve-and-Read Framework for Knowledge Graph Link Prediction" ☆12 Mar 23, 2025 Updated 11 months ago
- SkillWeaver is a framework to enable web agent self-improvement through environment exploration and skill synthesis. ☆114 Apr 14, 2025 Updated 10 months ago
- EMNLP 2022: Finding Dataset Shortcuts with Grammar Induction https://arxiv.org/abs/2210.11560 ☆58 Feb 28, 2025 Updated last year