MetaAgentX / OpenCaptchaWorld
[NeurIPS 2025] The first web-based benchmark and platform to evaluate the visual reasoning and interaction capabilities of MLLM-powered agents through diverse and dynamic CAPTCHA puzzles.
☆56Updated last month
Alternatives and similar repositories for OpenCaptchaWorld
Users who are interested in OpenCaptchaWorld are comparing it to the libraries listed below.
- ☆73Updated 8 months ago
- ☆254Updated last week
- [ICLR 2026] Efficient Agent Training for Computer Use☆135Updated 5 months ago
- ☆82Updated 10 months ago
- ZeroGUI: Automating Online GUI Learning at Zero Human Cost☆107Updated 6 months ago
- Test-time preference optimization (ICML 2025).☆178Updated 8 months ago
- [ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis☆177Updated 3 months ago
- [ICML 2025] TokenSwift: Lossless Acceleration of Ultra Long Sequence Generation☆120Updated 8 months ago
- [AAAI 2026] Test-Time Reinforcement Learning for GUI Grounding via Region Consistency https://arxiv.org/abs/2508.05615☆58Updated 2 months ago
- Implementation for OAgents: An Empirical Study of Building Effective Agents☆306Updated 3 months ago
- ✨✨Latest Papers and Datasets on Mobile and PC GUI Agent☆149Updated last year
- [ICLR'25 Oral] UGround: Universal GUI Visual Grounding for GUI Agents☆297Updated 6 months ago
- [FSE'2026] SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks☆144Updated last week
- AgenTracer: A Lightweight Failure Attributor for Agentic Systems☆74Updated 2 months ago
- MiroMind-M1 is a fully open-source series of reasoning language models built on Qwen-2.5, focused on advancing mathematical reasoning.☆253Updated 5 months ago
- [NeurIPS 2025] A multimodal agent that can interact with its own PC in a multimodal manner.☆36Updated 2 months ago
- ☆180Updated 9 months ago
- ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization☆95Updated 8 months ago
- [ICML2025] Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction☆379Updated 11 months ago
- R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning☆71Updated 8 months ago
- This is the official repository of the paper "Atomic-to-Compositional Generalization for Mobile Agents with A New Benchmark and Schedulin…☆13Updated 6 months ago
- Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning☆97Updated 11 months ago
- ☆192Updated 3 months ago
- ☆122Updated 4 months ago
- ☆290Updated 5 months ago
- MemGen: Weaving Generative Latent Memory for Self-Evolving Agents☆298Updated this week
- [ACL 2025] AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant☆44Updated last year
- [NeurIPS'25] GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents☆376Updated 3 months ago
- Official implementation of UI-Ins: Enhancing GUI Grounding with Multi-Perspective Instruction-as-Reasoning☆63Updated last month
- SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution☆104Updated 4 months ago