AriaUI / Aria-UI
Open-sourced, Fast and Context-aware Action Grounding from GUI Instructions for GUI/Computer-use Agents
☆340 Updated last month
Alternatives and similar repositories for Aria-UI:
Users that are interested in Aria-UI are comparing it to the libraries listed below
- "GraphAgent: Agentic Graph Language Assistant" ☆284 Updated last month
- Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction ☆265 Updated 2 weeks ago
- OS-ATLAS: A Foundation Action Model For Generalist GUI Agents ☆305 Updated last month
- [ICLR'25 Oral] UGround: Universal GUI Visual Grounding for GUI Agents ☆193 Updated this week
- "VideoRAG: Retrieval-Augmented Generation with Extreme Long-Context Videos" ☆490 Updated 3 weeks ago
- [ICLR 2025] A trinity of environments, tools, and benchmarks for general virtual agents ☆198 Updated last month
- Building Open LLM Web Agents with Self-Evolving Online Curriculum RL ☆330 Updated last month
- Recipes to train the self-rewarding reasoning LLMs. ☆207 Updated 3 weeks ago
- The model, data and code for the visual GUI Agent SeeClick ☆339 Updated 4 months ago
- This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use". ☆229 Updated last month
- 🦀️ CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents. https://crab.camel-ai.org/ ☆316 Updated 4 months ago
- 🌐 WebWalker: Benchmarking LLMs in Web Traversal ☆376 Updated last week
- [CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use. ☆1,117 Updated last week
- Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis ☆116 Updated 2 weeks ago
- GUI Grounding for Professional High-Resolution Computer Use ☆149 Updated last month
- FlexRAG: A RAG Framework for Information Retrieval and Generation. ☆135 Updated this week
- An open-sourced end-to-end VLM-based GUI Agent ☆837 Updated last month
- ☆197 Updated 4 months ago
- Resources for our paper: "Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training" ☆112 Updated last week
- An open platform for enhancing the capability of LLMs in workflow orchestration. ☆122 Updated 2 weeks ago
- AN O1 REPLICATION FOR CODING ☆329 Updated 3 months ago
- 🤠 Agent-as-a-Judge and DevAI dataset ☆375 Updated 2 months ago
- ☆54 Updated last month
- An LLM-based agent that predicts its tasks proactively. ☆332 Updated this week
- This is a collection of resources for computer-use GUI agents, including videos, blogs, papers, and projects. ☆299 Updated last week
- DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought ☆210 Updated 2 months ago
- "Your Fully-Automated Personal AI Assistant, and Open-Source & Cost-Efficient Alternative to OpenAI's Deep Research" ☆813 Updated last month
- [ICLR 2025] The official implementation of paper "ToolGen: Unified Tool Retrieval and Calling via Generation" ☆133 Updated last month
- Official implementation of AppAgentX: Evolving GUI Agents as Proficient Smartphone Users ☆249 Updated 2 weeks ago
- AndroidWorld is an environment and benchmark for autonomous agents ☆248 Updated this week