OSU-NLP-Group / Online-Mind2Web
An Illusion of Progress? Assessing the Current State of Web Agents
☆143 · Jan 2, 2026 · Updated last month
Alternatives and similar repositories for Online-Mind2Web
Users who are interested in Online-Mind2Web are comparing it to the repositories listed below.
- [NeurIPS'25 D&B] Mind2Web-2 Benchmark: Evaluating Agentic Search with Agent-as-a-Judge · ☆98 · Dec 18, 2025 · Updated last month
- [EMNLP 2024 Tutorial] Language Agents: Foundations, Prospects, and Risks · ☆10 · Nov 27, 2024 · Updated last year
- [ACL'25 (Findings)] Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents · ☆26 · Oct 15, 2025 · Updated 3 months ago
- [NAACL'25] "Revealing the Barriers of Language Agents in Planning" · ☆13 · Jun 22, 2025 · Updated 7 months ago
- Building a comprehensive and handy list of papers for GUI agents · ☆628 · Oct 27, 2025 · Updated 3 months ago
- [ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large mult… · ☆822 · Feb 3, 2025 · Updated last year
- ☆31 · Aug 17, 2025 · Updated 5 months ago
- rmp data ranking · ☆13 · Nov 4, 2025 · Updated 3 months ago
- [ICLR2025 Spotlight] Agent Trajectory Synthesis via Guiding Replay with Web Tutorials · ☆50 · Feb 21, 2025 · Updated 11 months ago
- [ICLR'25 Oral] UGround: Universal GUI Visual Grounding for GUI Agents · ☆299 · Jul 18, 2025 · Updated 6 months ago
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator" · ☆54 · Feb 23, 2024 · Updated last year
- ☆18 · Jan 3, 2025 · Updated last year
- Code repository for the AISTATS 2021 paper "Towards Understanding the Optimal Behaviors of Deep Active Learning Algorithms" · ☆15 · Mar 20, 2021 · Updated 4 years ago
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories · ☆40 · Aug 7, 2025 · Updated 6 months ago
- [NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web" -- the first LLM-based web agent and benchmark for generalist w… · ☆946 · Nov 5, 2025 · Updated 3 months ago
- Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024) · ☆37 · Dec 29, 2024 · Updated last year
- [ACL 2025] GUI-explorer: Autonomous Exploration and Mining of Transition-aware Knowledge for GUI Agent · ☆58 · May 28, 2025 · Updated 8 months ago
- Code for "WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models" · ☆1,020 · Mar 4, 2024 · Updated last year
- Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents" · ☆1,327 · Nov 26, 2025 · Updated 2 months ago
- ☆174 · Oct 31, 2025 · Updated 3 months ago
- AgentLab: An open-source framework for developing, testing, and benchmarking web agents on diverse tasks, designed for scalability and re… · ☆512 · Updated this week
- ☆25 · May 28, 2025 · Updated 8 months ago
- True Few-Shot BioIE: Benchmarking GPT-3 In-Context and Small PLM Fine-Tuning · ☆12 · Jul 6, 2022 · Updated 3 years ago
- Public experimental example code for the ProPublica recidivism data-based experiments for the upcoming Interpretable Active Learning paper · ☆10 · Dec 18, 2017 · Updated 8 years ago
- [CVPR 2025] GUI-Xplore: Empowering Generalizable GUI Agents with One Exploration · ☆20 · Mar 21, 2025 · Updated 10 months ago
- [CVPR 2025] Code for "Notes-guided MLLM Reasoning: Enhancing MLLM with Knowledge and Visual Notes for Visual Question Answering". · ☆20 · Jun 16, 2025 · Updated 7 months ago
- [CIKM'24] Reviving the Context: Camera Trap Species Classification as Link Prediction on Multimodal Knowledge Graphs · ☆12 · Apr 2, 2025 · Updated 10 months ago
- [ICLR'25] ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery · ☆124 · Aug 26, 2025 · Updated 5 months ago
- Towards Large Multimodal Models as Visual Foundation Agents · ☆256 · Apr 24, 2025 · Updated 9 months ago
- WebLINX is a benchmark for building web navigation agents with conversational capabilities · ☆159 · Feb 11, 2025 · Updated last year
- Continual Memorization of Factoids in Large Language Models · ☆12 · Nov 20, 2024 · Updated last year
- [ICLR'25] "Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers" · ☆40 · Mar 31, 2025 · Updated 10 months ago
- Code for the paper "Trust the PRoC3S: Solving Long-Horizon Robotics Problems with LLMs and Constraint Satisfaction" presented at CoRL 202… · ☆31 · Nov 18, 2024 · Updated last year
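- Schedule of 2018 university summer camps in computer science for recommendation-based (exam-exempt) graduate admission · ☆12 · May 14, 2018 · Updated 7 years ago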
- ☆37 · May 28, 2025 · Updated 8 months ago
- The model, data and code for the visual GUI Agent SeeClick · ☆463 · Jul 13, 2025 · Updated 7 months ago
- [EMNLP 2022] Language Model Pre-Training with Sparse Latent Typing · ☆14 · Feb 10, 2023 · Updated 3 years ago
- A repository for a universal I/O spec for TAMP, along with scripts to convert from popular specs to our spec · ☆13 · Jun 25, 2025 · Updated 7 months ago
- Source code for SIGIR 2022 paper. · ☆16 · Apr 25, 2022 · Updated 3 years ago