☆35Jun 20, 2024Updated last year
Alternatives and similar repositories for CoCo-Agent
Users who are interested in CoCo-Agent are comparing it to the libraries listed below.
- [ACL 2025] Research code for the paper "OS-Kairos: Adaptive Interaction for MLLM-Powered GUI Agents"☆18Jun 19, 2025Updated 8 months ago
- ☆12Aug 8, 2024Updated last year
- Official implementation for "You Only Look at Screens: Multimodal Chain-of-Action Agents" (Findings of ACL 2024)☆255Jul 16, 2024Updated last year
- Official implementation for "Android in the Zoo: Chain-of-Action-Thought for GUI Agents" (Findings of EMNLP 2024)☆99Oct 14, 2024Updated last year
- [ICCV 2025] GUIOdyssey is a comprehensive dataset for training and evaluating cross-app navigation agents. GUIOdyssey consists of 8,834 e…☆147Jan 3, 2026Updated 2 months ago
- Repository of GUI Action Narrator☆13Apr 8, 2025Updated 10 months ago
- This is the official repository of the paper "Atomic-to-Compositional Generalization for Mobile Agents with A New Benchmark and Schedulin…☆13Jul 27, 2025Updated 7 months ago
- [ICLR 2024] Trajectory-as-Exemplar Prompting with Memory for Computer Control☆68Jan 7, 2026Updated last month
- The model, data and code for the visual GUI Agent SeeClick☆469Jul 13, 2025Updated 7 months ago
- ☆31Sep 27, 2024Updated last year
- Official repo for paper DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning.☆389Feb 22, 2025Updated last year
- This repository contains code and datasets for our paper on the effects of document multiplicity while the context size is fixed in Retri…☆18Mar 13, 2025Updated 11 months ago
- Visual and Embodied Concepts evaluation benchmark☆21Oct 10, 2023Updated 2 years ago
- Code for "The Expressive Power of Low-Rank Adaptation".☆20Apr 19, 2024Updated last year
- [AAAI-2026] Code for "UI-R1: Enhancing Efficient Action Prediction of GUI Agents by Reinforcement Learning"☆146Nov 24, 2025Updated 3 months ago
- A Dead Simple and Modularized Multi-Modal Training and Finetune Framework. Compatible with any LLaVA/Flamingo/QwenVL/MiniGemini etc series …☆19Apr 24, 2024Updated last year
- SPA-Bench: A Comprehensive Benchmark for SmartPhone Agent Evaluation☆60Jul 11, 2025Updated 7 months ago
- Code of "Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model"☆23Jun 28, 2024Updated last year
- [NeurIPS 2025] UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents☆53Nov 27, 2025Updated 3 months ago
- This tool allows local LLM usage that can automate tasks without human intervention. The agent can call itself recursively and work on…☆20May 5, 2025Updated 10 months ago
- LlamaTouch: A Faithful and Scalable Testbed for Mobile UI Task Automation☆67Aug 9, 2024Updated last year
- Consists of ~500k human annotations on the RICO dataset identifying various icons based on their shapes and semantics, and associations b…☆34Jun 27, 2024Updated last year
- ☆35Sep 30, 2024Updated last year
- GPT-4V in Wonderland: LMMs as Smartphone Agents☆135Jul 17, 2024Updated last year
- LLaVA combines with Magvit Image tokenizer, training MLLM without a Vision Encoder. Unifying image understanding and generation.☆39Jun 20, 2024Updated last year
- This repo contains the code to reproduce figures in my dissertation "Passive Imaging and Characterization of the Subsurface With Distribu…☆10Jun 14, 2018Updated 7 years ago
- A python library for making API calls to Bonsai BRAIN.☆15Oct 6, 2022Updated 3 years ago
- ☆11Mar 11, 2024Updated last year
- FPGA Low latency 10GBASE-R PCS☆12May 23, 2023Updated 2 years ago
- Official code repo for the paper "LearnAct: Few-Shot Mobile GUI Agent with a Unified Demonstration Benchmark"☆46May 16, 2025Updated 9 months ago
- AndroidWorld is an environment and benchmark for autonomous agents☆640Feb 24, 2026Updated last week
- Python SIR-x model implementation☆10Dec 8, 2022Updated 3 years ago
- [ACM MM 2025] LMM4Edit: Benchmarking and Evaluating Multimodal Image Editing with LMMs☆15Feb 10, 2026Updated 3 weeks ago
- A Simple MLLM Surpassed QwenVL-Max with OpenSource Data Only in 14B LLM.☆38Sep 9, 2024Updated last year
- ☆21Dec 11, 2025Updated 2 months ago
- Tensorflow implementation of the paper "Fast Compressive Sensing Using Generative Model with Structed Latent Variables"☆10Apr 7, 2020Updated 5 years ago
- Tool for image-based control RDP (Remote Desktop Protocol). Manipulations, automations and testing via Python and Apache Guacamole☆14Nov 16, 2022Updated 3 years ago
- The final project of the 42 core curriculum☆12Nov 19, 2025Updated 3 months ago
- ☆15Mar 11, 2025Updated 11 months ago