THUDM / CogAgent

An open-sourced end-to-end VLM-based GUI Agent

☆992

Alternatives and similar repositories for CogAgent

Users who are interested in CogAgent are comparing it to the repositories listed below.

THUDM / AutoWebGLM
An LLM-based Web Navigating Agent (KDD'24)
☆870 Updated 9 months ago
Westlake-AGI-Lab / AppAgentX
Official implementation of AppAgentX: Evolving GUI Agents as Proficient Smartphone Users
☆464 Updated 2 months ago
showlab / ShowUI
[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.
☆1,359 Updated last month
OpenBMB / AgentCPM-GUI
AgentCPM-GUI: An on-device GUI agent for operating Android apps, enhancing reasoning ability with reinforcement fine-tuning for efficient…
☆896 Updated 3 weeks ago
thunlp / ProactiveAgent
An LLM-based agent that predicts its tasks proactively.
☆389 Updated last month
SkyworkAI / DeepResearchAgent
☆1,034 Updated this week
Alibaba-NLP / ZeroSearch
ZeroSearch: Incentivize the Search Capability of LLMs without Searching
☆1,046 Updated last week
niuzaisheng / ScreenAgent
ScreenAgent: A Computer Control Agent Driven by Visual Language Large Model (IJCAI-24)
☆473 Updated 7 months ago
OpenBMB / UltraRAG
Build & Optimize your RAG.
☆718 Updated 2 months ago
OpenBMB / IoA
An open-source framework for collaborative AI agents, enabling diverse, distributed agents to team up and tackle complex tasks through in…
☆731 Updated 8 months ago
xlang-ai / aguvis
[ICML2025] Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction
☆332 Updated 4 months ago
thunlp / LLMxMapReduce
☆762 Updated 3 weeks ago
GAIR-NLP / PC-Agent
PC Agent: While You Sleep, AI Works - A Cognitive Journey into Digital World
☆267 Updated last month
Alibaba-NLP / ViDoRAG
ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents
☆511 Updated last month
BAI-LAB / MemoryOS
MemoryOS is designed to provide a memory operating system for personalized AI agents.
☆423 Updated this week
HumanMLLM / R1-Omni
☆908 Updated 3 months ago
microsoft / GUI-Actor
GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents
☆286 Updated this week
OpenBMB / VisRAG
Parsing-free RAG supported by VLMs
☆752 Updated 4 months ago
QwenLM / Qwen3-Embedding
☆1,009 Updated last week
zjunlp / OmniThink
OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking
☆454 Updated 2 months ago
THUDM / WebRL
Building Open LLM Web Agents with Self-Evolving Online Curriculum RL
☆420 Updated last month
camel-ai / crab
🦀️ CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents. https://crab.camel-ai.org/
☆352 Updated last week
ByteDance-Seed / Seed-Coder
Seed-Coder is a family of lightweight open-source code LLMs comprising base, instruct and reasoning models, developed by ByteDance Seed.
☆527 Updated last month
THUDM / LongCite
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA
☆501 Updated 6 months ago
Alibaba-NLP / OmniSearch
Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent
☆349 Updated 2 months ago
OpenBMB / ProAgent
An LLM-based Agent for the New Automation Paradigm - Agentic Process Automation
☆851 Updated last year
showlab / computer_use_ootb
Out-of-the-box (OOTB) GUI Agent for Windows and macOS
☆1,623 Updated last month
Xiao9905 / AutoGLM
☆66 Updated 8 months ago
AIDC-AI / Marco-o1
An Open Large Reasoning Model for Real-World Solutions
☆1,504 Updated last month
tablegpt / tablegpt-agent
A pre-built agent for TableGPT2.
☆593 Updated last month