inclusionAI / UI-Venus
UI-Venus is a native UI agent based on the Qwen2.5-VL multimodal large language model, designed to perform precise GUI element grounding and effective navigation using only screenshots as input.
☆133 Updated last week
Alternatives and similar repositories for UI-Venus
Users who are interested in UI-Venus compare it to the libraries listed below.
- Open-sourced, Fast and Context-aware Action Grounding from GUI Instructions for GUI/Computer-use Agents ☆373 Updated 6 months ago
- When Agent Becomes the Scientist – Building Closed-Loop System from Hypothesis to Verification ☆487 Updated 2 weeks ago
- Mirix is a multi-agent personal assistant designed to track on-screen activities and answer user questions intelligently. By capturing re… ☆1,185 Updated this week
- "Vimo: Chat with Your Videos" ☆1,005 Updated this week
- ☆384 Updated this week
- [EMNLP'25] s3 - ⚡ Efficient & Effective Search Agent Training via RL for RAG (Verifier-Powered RLVR for Search) ☆612 Updated 3 weeks ago
- Efficient Reasoning Vision Language Models ☆356 Updated this week
- Think Beyond Images ☆235 Updated last week
- Skywork-R1V is an advanced multimodal AI model series developed by Skywork AI (Kunlun Inc.), specializing in vision-language reasoning. ☆2,942 Updated 3 weeks ago
- OpenCUA: Open Foundations for Computer-Use Agents ☆274 Updated this week
- 🚀 EvoAgentX: Building a Self-Evolving Ecosystem of AI Agents ☆1,167 Updated this week
- ☆507 Updated this week
- ☆409 Updated last month
- Build multimodal language agents for fast prototype and production ☆2,542 Updated 5 months ago
- ☆396 Updated this week
- QwQ is the reasoning model series developed by Qwen team, Alibaba Cloud. ☆517 Updated 4 months ago
- Babel - Open Multilingual Large Language Models Serving Over 90% of Global Speakers ☆209 Updated 5 months ago
- "DeepCode: Open Agentic Coding (Paper2Code & Text2Web & Text2Backend)" ☆1,077 Updated 2 weeks ago
- GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents ☆321 Updated 2 weeks ago
- DocAgent is a system designed to generate high-quality, context-aware code documentation for Python codebases using a multi-agent approac… ☆310 Updated 4 months ago
- ✨✨R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning ☆250 Updated 3 months ago
- [ICLR Workshop 2025] An official source code for paper "GuardReasoner: Towards Reasoning-based LLM Safeguards". ☆153 Updated 3 months ago
- This repository contains the implementation of AutoSchemaKG, a novel framework for automatic knowledge graph construction that combines s… ☆461 Updated 2 weeks ago
- (ICML'25 Outstanding) CollabLLM: From Passive Responders to Active Collaborators ☆200 Updated 2 weeks ago
- This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use" (ACL 2025 Oral). ☆339 Updated last week
- Train your Agent model via our easy and efficient framework ☆1,356 Updated last week
- Official implementation of "SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience" ☆167 Updated 2 weeks ago
- (ACL-2025 main conference) SurveyForge: On the Outline Heuristics, Memory-Driven Generation, and Multi-dimensional Evaluation for Automat… ☆283 Updated 2 months ago
- ☆440 Updated this week
- ☆297 Updated this week