niuzaisheng / ScreenAgent
ScreenAgent: A Computer Control Agent Driven by Visual Language Large Model (IJCAI-24)
☆303 · Updated 2 months ago
Related projects
Alternatives and complementary repositories for ScreenAgent
- The model, data and code for the visual GUI Agent SeeClick ☆216 · Updated 2 months ago
- Code and implementations for the paper "AgentGym: Evolving Large Language Model-based Agents across Diverse Environments" by Zhiheng Xi e… ☆346 · Updated last month
- 💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents. ☆178 · Updated 2 weeks ago
- ☆192 · Updated 6 months ago
- Environments, tools, and benchmarks for general computer agents ☆171 · Updated 2 weeks ago
- Official implementation of paper "Cumulative Reasoning With Large Language Models" (https://arxiv.org/abs/2308.04371) ☆286 · Updated last month
- [NeurIPS 2024 Spotlight] Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models ☆532 · Updated last week
- ☆72 · Updated 10 months ago
- Official implementation for "You Only Look at Screens: Multimodal Chain-of-Action Agents" (Findings of ACL 2024) ☆196 · Updated 3 months ago
- An LLM-based Web Navigating Agent (KDD'24) ☆734 · Updated last month
- [ACL 2024] AUTOACT: Automatic Agent Learning from Scratch for QA via Self-Planning ☆177 · Updated last month
- VisualWebArena is a benchmark for multimodal agents. ☆236 · Updated this week
- Building Open LLM Web Agents with Self-Evolving Online Curriculum RL ☆147 · Updated this week
- ☆339 · Updated last month
- ControlLLM: Augment Language Models with Tools by Searching on Graphs ☆186 · Updated 3 months ago
- A web client for ScreenAgent: Let Large Models Control Your Desktop ☆27 · Updated 2 months ago
- This is the official repo for "PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization". PromptAgen… ☆199 · Updated 3 months ago
- Official implementation of paper "On the Diagram of Thought" (https://arxiv.org/abs/2409.10038) ☆169 · Updated last month
- Parsing-free RAG supported by VLMs ☆335 · Updated this week
- CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents. https://crab.camel-ai.org/ ☆187 · Updated this week
- Official implementation of "DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning" in ICML'24 ☆126 · Updated this week
- ☆194 · Updated 11 months ago
- ☆116 · Updated 5 months ago
- ☆117 · Updated last week
- KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents ☆171 · Updated 2 weeks ago
- FireAct: Toward Language Agent Fine-tuning ☆254 · Updated last year
- ☆488 · Updated 3 weeks ago
- [ACL2024] T-Eval: Evaluating Tool Utilization Capability of Large Language Models Step by Step ☆229 · Updated 7 months ago
- AWM: Agent Workflow Memory ☆203 · Updated last month
- Official Repo for UGround ☆93 · Updated this week