AgentCPM-GUI: An on-device GUI agent for operating Android apps, enhancing reasoning ability with reinforcement fine-tuning for efficient task execution.
★1,314 · Jan 11, 2026 · Updated last month
Alternatives and similar repositories for AgentCPM-GUI
Users who are interested in AgentCPM-GUI are comparing it to the repositories listed below.
- [NeurIPS'25] GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents (★385 · Feb 11, 2026 · Updated 2 weeks ago)
- Automate your mobile devices with natural language commands - an LLM agnostic mobile Agent (★7,766 · Feb 20, 2026 · Updated last week)
- Mobile-Agent: The Powerful GUI Agent Family (★7,338 · Updated this week)
- Pioneering Automated GUI Interaction with Native Agents (★9,631 · Jan 27, 2026 · Updated last month)
- An open-sourced end-to-end VLM-based GUI Agent (★1,138 · Apr 4, 2025 · Updated 10 months ago)
- Official implementation of AppAgentX: Evolving GUI Agents as Proficient Smartphone Users (★605 · Apr 15, 2025 · Updated 10 months ago)
- AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps. (★6,545 · Mar 19, 2025 · Updated 11 months ago)
- [CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use. (★1,719 · Jan 20, 2026 · Updated last month)
- A simple agent framework that's capable of browser use + mcp + auto instrument + plan + deep research + more (★385 · Jan 5, 2026 · Updated last month)
- A Gemini 2.5 Flash Level MLLM for Vision, Speech, and Full-Duplex Multimodal Live Streaming on Your Phone (★23,942 · Updated this week)
- Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving stat… (★1,548 · Jun 14, 2025 · Updated 8 months ago)
- The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra (★28,128 · Jan 14, 2026 · Updated last month)
- (★39 · Aug 6, 2025 · Updated 6 months ago)
- MAI-UI: Real-World Centric Foundation GUI Agents ranging from 2B to 235B (★1,669 · Feb 10, 2026 · Updated 2 weeks ago)
- An open-source SuperAgent harness that researches, codes, and creates. With the help of sandboxes, memories, tools, skills and subagents,… (★20,843 · Updated this week)
- Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc. (★13,349 · Feb 16, 2026 · Updated last week)
- [ICML2025] Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction (★379 · Mar 7, 2025 · Updated 11 months ago)
- Driving all platforms UI automation with vision-based model (★11,734 · Updated this week)
- A simple screen parsing tool towards pure vision based GUI agent (★24,406 · Sep 12, 2025 · Updated 5 months ago)
- [ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis (★179 · Oct 8, 2025 · Updated 4 months ago)
- (★23 · Jan 28, 2026 · Updated 3 weeks ago)
- Secretary is an AI-powered tool that analyzes social media content from specified accounts and delivers results via WeChat. It supports c… (★360 · Aug 4, 2025 · Updated 6 months ago)
- A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents. (★1,115 · Aug 17, 2025 · Updated 6 months ago)
- MobileUse: A Hierarchical Reflection-Driven GUI Agent for Autonomous Mobile Operation (★134 · Feb 2, 2026 · Updated 3 weeks ago)
- (★301 · Aug 18, 2025 · Updated 6 months ago)
- FlowGram is an extensible workflow development framework with built-in canvas, form, variable, and materials that helps developers build … (★7,702 · Feb 11, 2026 · Updated 2 weeks ago)
- Enable AI to control your PC. This repo includes the WorldGUI Benchmark and GUI-Thinker Agent Framework. (★110 · Jul 27, 2025 · Updated 7 months ago)
- The first open-source agent skills builder. Define skills by vibe workflow, run on Claude Code, Cursor, Codex & more. Build Clawdbot · … (★6,648 · Feb 12, 2026 · Updated 2 weeks ago)
- Kortix - build, manage and train AI Agents. (★19,418 · Updated this week)
- Create and run workflows (RPA 2.0) (★3,889 · Feb 20, 2026 · Updated last week)
- Building a comprehensive and handy list of papers for GUI agents (★636 · Oct 27, 2025 · Updated 4 months ago)
- [NeurIPS 2025] VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction (★2,490 · Mar 28, 2025 · Updated 11 months ago)
- [AAAI-2026] Code for "UI-R1: Enhancing Efficient Action Prediction of GUI Agents by Reinforcement Learning" (★146 · Nov 24, 2025 · Updated 3 months ago)
- Agent S: an open agentic framework that uses computers like a human (★9,853 · Feb 21, 2026 · Updated last week)
- [TMLR] LLM-Powered GUI Agents in Phone Automation: Surveying Progress and Prospects (★158 · Dec 2, 2025 · Updated 2 months ago)
- OS-ATLAS: A Foundation Action Model For Generalist GUI Agents (★435 · Apr 20, 2025 · Updated 10 months ago)
- ZeroSearch: Incentivize the Search Capability of LLMs without Searching (★1,241 · Aug 16, 2025 · Updated 6 months ago)
- Make websites accessible for AI agents. Automate tasks online with ease. (★79,028 · Updated this week)
- A curated collection of resources, tools, and frameworks for developing GUI Agents. (★306 · Feb 11, 2026 · Updated 2 weeks ago)