[ACL'25 (Findings)] Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents
☆26Feb 17, 2026Updated last week
Alternatives and similar repositories for Explorer
Users who are interested in Explorer are comparing it to the repositories listed below
- OSWorld-Human: Benchmarking the Efficiency of Computer-Use Agents☆21Jan 6, 2026Updated last month
- ☆32Aug 17, 2025Updated 6 months ago
- TopViewRS: Vision-Language Models as Top-View Spatial Reasoners (EMNLP 2024 Oral)☆15Jun 14, 2025Updated 8 months ago
- [ACL 2025 Findings] Text2World: Benchmarking Large Language Models for Symbolic World Model Generation☆28Feb 25, 2025Updated last year
- ☆21May 3, 2025Updated 9 months ago
- ☆19Mar 10, 2025Updated 11 months ago
- Reasoning Agentic Retrieval-Augmented Generation for Industry Challenges☆28May 14, 2025Updated 9 months ago
- [CVPR 2025] Official implementation of the paper "Skip Tuning: Pre-trained Vision-Language Models are Effective and Efficient Adapters The…☆31Feb 27, 2025Updated last year
- Sys2Bench is a benchmarking suite designed to evaluate reasoning and planning capabilities of large language models across algorithmic, l…☆29Mar 5, 2025Updated 11 months ago
- A Text2SQL benchmark for evaluation of Large Language Models☆41Updated this week
- The official repository of "SmartAgent: Chain-of-User-Thought for Embodied Personalized Agent in Cyber World".☆27Aug 20, 2025Updated 6 months ago
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w…☆12Jun 28, 2025Updated 8 months ago
- An Illusion of Progress? Assessing the Current State of Web Agents☆146Jan 2, 2026Updated last month
- [NeurIPS'25 D&B] Mind2Web-2 Benchmark: Evaluating Agentic Search with Agent-as-a-Judge☆100Updated this week
- AgentSynth: Scalable Task Generation for Generalist Computer-Use Agents☆37Oct 7, 2025Updated 4 months ago
- [ICLR'25] "Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers"☆40Mar 31, 2025Updated 11 months ago
- [NeurIPS 2025] A multimodal agent that can interact with its own PC in a multimodal manner.☆34Nov 10, 2025Updated 3 months ago
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories☆40Aug 7, 2025Updated 6 months ago
- Sotopia-RL: Reward Design for Social Intelligence☆46Jan 29, 2026Updated last month
- [NeurIPS ENLSP Workshop'24] CSKV: Training-Efficient Channel Shrinking for KV Cache in Long-Context Scenarios☆16Oct 18, 2024Updated last year
- [ICML 2025] Official resources of "KBQA-o1: Agentic Knowledge Base Question Answering with Monte Carlo Tree Search".☆35Dec 6, 2025Updated 2 months ago
- ☆18Jun 10, 2025Updated 8 months ago
- ☆23Feb 4, 2026Updated 3 weeks ago
- [NeurIPS 2025] VIKI‑R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning☆74Dec 14, 2025Updated 2 months ago
- Symphony — A decentralized multi-agent framework that enables intelligent agents to collaborate seamlessly across heterogeneous edge devi…☆30Oct 30, 2025Updated 4 months ago
- [ICLR 2026] ParallelBench: Understanding the Tradeoffs of Parallel Decoding in Diffusion LLMs☆30Updated this week
- ☆11Jun 22, 2025Updated 8 months ago
- The official implementation of the paper "DaMo: Data Mixing Optimizer in Fine-tuning Multimodal LLMs for Mobile Phone Agents"☆29Oct 23, 2025Updated 4 months ago
- [ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models"☆10Jul 19, 2024Updated last year
- Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024)☆37Dec 29, 2024Updated last year
- A Framework for Evaluating AI Agent Safety in Realistic Environments☆30Oct 2, 2025Updated 4 months ago
- ☆36Dec 20, 2024Updated last year
- [CVPR 2025] DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval☆21Jun 23, 2025Updated 8 months ago
- Our repo contains an efficient RGB-D feature extractor for category-level and instance-level 6D pose estimation.☆14Oct 29, 2025Updated 4 months ago
- Continuous Pipelined Speculative Decoding☆16Jan 4, 2026Updated last month
- Official Implementation of HIMA (COLM'25)☆19Nov 25, 2025Updated 3 months ago
- ☆13Feb 2, 2025Updated last year
- A Practical Zoom-in GUI Grounding and Behavior-Based Evaluation method.☆19Dec 8, 2025Updated 2 months ago
- ☆25Aug 19, 2025Updated 6 months ago