OSU-NLP-Group / Online-Mind2Web
An Illusion of Progress? Assessing the Current State of Web Agents
☆66Updated last month
Alternatives and similar repositories for Online-Mind2Web
Users who are interested in Online-Mind2Web are comparing it to the repositories listed below.
- "Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents"☆77Updated 2 months ago
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024]☆138Updated 7 months ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference)☆144Updated 7 months ago
- [ICLR2025 Spotlight] Agent Trajectory Synthesis via Guiding Replay with Web Tutorials☆33Updated 4 months ago
- GenRM-CoT: Data release for verification rationales☆61Updated 8 months ago
- ☆68Updated 3 months ago
- GUICourse: From General Vision Language Models to Versatile GUI Agents☆117Updated 11 months ago
- Repo of paper "Free Process Rewards without Process Labels"☆153Updated 3 months ago
- ☆157Updated 3 weeks ago
- ☆169Updated this week
- [AAAI 2025 oral] Evaluating Mathematical Reasoning Beyond Accuracy☆63Updated 6 months ago
- ☆190Updated 2 months ago
- Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay☆78Updated 3 weeks ago
- [ICLR'25 Oral] UGround: Universal GUI Visual Grounding for GUI Agents☆254Updated 3 weeks ago
- Towards Large Multimodal Models as Visual Foundation Agents☆216Updated 2 months ago
- [ICLR 2025] Benchmarking Agentic Workflow Generation☆97Updated 4 months ago
- [ACL2024] Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios☆58Updated last year
- (ICLR 2025) The Official Code Repository for GUI-World.☆59Updated 6 months ago
- Scaling Computer-Use Grounding via UI Decomposition and Synthesis☆79Updated last week
- ☆19Updated last month
- A Framework for LLM-based Multi-Agent Reinforced Training and Inference☆136Updated last week
- B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners☆82Updated last month
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision☆121Updated 9 months ago
- Benchmark and research code for the paper "SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks"☆219Updated last month
- 🌍 Repository for "AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agent", ACL'24 Best Resource Pap…☆215Updated last month
- A version of verl to support tool use☆251Updated last week
- AdaRFT: Efficient Reinforcement Finetuning via Adaptive Curriculum Learning☆37Updated last week
- RL Scaling and Test-Time Scaling (ICML'25)☆106Updated 5 months ago
- General Reasoner: Advancing LLM Reasoning Across All Domains☆141Updated 2 weeks ago
- Interpretable Contrastive Monte Carlo Tree Search Reasoning☆48Updated 7 months ago