☆73May 23, 2025Updated 9 months ago
Alternatives and similar repositories for InfiGUIAgent
Users who are interested in InfiGUIAgent are comparing it to the libraries listed below
- Repository for the paper "InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners"☆64Dec 4, 2025Updated 2 months ago
- Official implementation for “HarmonyGuard: Toward Safety and Utility in Web Agents via Adaptive Policy Enhancement and Dual-Objective Opt…☆25Jan 10, 2026Updated last month
- This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use" (ACL 2025 Oral).☆381Aug 16, 2025Updated 6 months ago
- ☆24May 13, 2025Updated 9 months ago
- ☆87Oct 28, 2024Updated last year
- [AAAI 2026 Oral] Official repository for InfiGUI-G1. We introduce Adaptive Exploration Policy Optimization (AEPO) to overcome semantic al…☆136Nov 19, 2025Updated 3 months ago
- [ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis☆180Oct 8, 2025Updated 4 months ago
- OS-ATLAS: A Foundation Action Model For Generalist GUI Agents☆435Apr 20, 2025Updated 10 months ago
- [NAACL'25] "Revealing the Barriers of Language Agents in Planning"☆13Jun 22, 2025Updated 8 months ago
- Open-sourced, Fast and Context-aware Action Grounding from GUI Instructions for GUI/Computer-use Agents☆398Feb 8, 2025Updated last year
- R1-like Computer-use Agent☆89Mar 21, 2025Updated 11 months ago
- The official repository of "SmartAgent: Chain-of-User-Thought for Embodied Personalized Agent in Cyber World".☆27Aug 20, 2025Updated 6 months ago
- Building a comprehensive and handy list of papers for GUI agents☆641Oct 27, 2025Updated 4 months ago
- Aligning Agentic World Models via Knowledgeable Experience Learning☆31Jan 25, 2026Updated last month
- [NeurIPS 2025] A multimodal agent that can interact with its own PC in a multimodal manner.☆34Nov 10, 2025Updated 3 months ago
- More reliable Video Understanding Evaluation☆14Sep 23, 2025Updated 5 months ago
- ☆14Mar 28, 2024Updated last year
- Plancraft is a Minecraft environment and agent suite to test planning capabilities in LLMs☆26Nov 7, 2025Updated 3 months ago
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories☆40Aug 7, 2025Updated 6 months ago
- [TMLR'25] "Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents"☆94Oct 5, 2025Updated 4 months ago
- [ICML2025] Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction☆379Mar 7, 2025Updated 11 months ago
- Code for paper Empowering Large Language Model Agents through Action Learning☆33Aug 8, 2024Updated last year
- PC Agent: While You Sleep, AI Works - A Cognitive Journey into Digital World☆308May 21, 2025Updated 9 months ago
- Agent-RRM: Exploring Reasoning Reward Model for Agents☆44Feb 4, 2026Updated 3 weeks ago
- ☆25Jan 28, 2026Updated last month
- ☆19Feb 24, 2025Updated last year
- ☆18Oct 14, 2024Updated last year
- ChatGPT-like interface for working with AI Agents☆20Sep 18, 2024Updated last year
- [ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts"☆19Mar 10, 2025Updated 11 months ago
- ☆17Jan 9, 2025Updated last year
- AWM: Agent Workflow Memory☆397Dec 22, 2025Updated 2 months ago
- UQ: Assessing Language Models on Unsolved Questions☆30Aug 26, 2025Updated 6 months ago
- Official PyTorch implementation of RACRO (https://www.arxiv.org/abs/2506.04559)☆19Jul 1, 2025Updated 8 months ago
- [ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences☆43Mar 11, 2025Updated 11 months ago
- [ACL 2025] AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant☆44Dec 19, 2024Updated last year
- [ACL 2025] Agentic Knowledgeable Self-awareness☆91Jun 15, 2025Updated 8 months ago
- [ICLR'25] Code for KaSA, an official implementation of "KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models"☆20Jan 16, 2025Updated last year
- Audio Jailbreak: An Open Comprehensive Benchmark for Jailbreaking Large Audio-Language Models☆30Oct 6, 2025Updated 4 months ago
- [ICLR 2026] Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing☆29Feb 6, 2026Updated 3 weeks ago