AriaUI / Aria-UI
Open-sourced, Fast and Context-aware Action Grounding from GUI Instructions for GUI/Computer-use Agents
☆320 Updated last week
Alternatives and similar repositories for Aria-UI:
Users who are interested in Aria-UI are comparing it to the repositories listed below.
- "GraphAgent: Agentic Graph Language Assistant" ☆258 Updated last week
- Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction ☆221 Updated last month
- OS-ATLAS: A Foundation Action Model For Generalist GUI Agents ☆279 Updated this week
- [ICLR'25 Oral] UGround: Universal GUI Visual Grounding for GUI Agents ☆168 Updated this week
- An open platform for enhancing the capability of LLMs in workflow orchestration. ☆98 Updated 2 months ago
- "MiniRAG: Making RAG Simpler with Small and Free Language Models" ☆714 Updated last week
- Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use. ☆986 Updated last week
- An open-sourced end-to-end VLM-based GUI Agent ☆753 Updated this week
- Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis ☆103 Updated 3 weeks ago
- Build multimodal language agents for fast prototype and production ☆1,777 Updated this week
- The model, data and code for the visual GUI Agent SeeClick ☆312 Updated 2 months ago
- GitHub page for "Large Language Model-Brained GUI Agents: A Survey" ☆112 Updated 2 weeks ago
- GUI Grounding for Professional High-Resolution Computer Use ☆89 Updated last month
- [ICLR 2025] A trinity of environments, tools, and benchmarks for general virtual agents ☆192 Updated 2 weeks ago
- Building Open LLM Web Agents with Self-Evolving Online Curriculum RL ☆302 Updated last week
- The official implementation of Self-Play Preference Optimization (SPPO) ☆481 Updated 3 weeks ago
- AndroidWorld is an environment and benchmark for autonomous agents ☆212 Updated last week
- Medical o1, Towards medical complex reasoning with LLMs ☆868 Updated last month
- This is a collection of resources for computer-use GUI agents, including videos, blogs, papers, and projects. ☆229 Updated this week
- FlexRAG: A RAG Framework for Information Retrieval and Generation. ☆121 Updated this week
- An LLM-based agent that predicts its tasks proactively. ☆299 Updated last month
- [ICLR 2025] The official implementation of paper "ToolGen: Unified Tool Retrieval and Calling via Generation" ☆125 Updated 2 months ago
- This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use". ☆201 Updated this week
- Explore concepts like Self-Correct, Self-Refine, Self-Improve, Self-Contradict, Self-Play, and Self-Knowledge, alongside o1-like reasonin… ☆155 Updated 2 months ago