Jl-wei / guing
A mobile GUI search engine using a vision-language model
☆12 Updated last month
Alternatives and similar repositories for guing:
Users who are interested in guing are comparing it to the libraries listed below:
- ☆13 Updated 11 months ago
- ☆28 Updated 6 months ago
- UICrit is a dataset containing human-generated natural language design critiques, corresponding bounding boxes for each critique, and des… ☆20 Updated 4 months ago
- Mobile App Tasks with Iterative Feedback (MoTIF): Addressing Task Feasibility in Interactive Visual Environments ☆60 Updated 7 months ago
- ☆8 Updated last year
- ☆18 Updated 6 months ago
- The dataset includes UI object type labels (e.g., BUTTON, IMAGE, CHECKBOX) that describe the semantic type of a UI object on Android ap… ☆50 Updated 3 years ago
- SPA-Bench: A Comprehensive Benchmark for SmartPhone Agent Evaluation ☆30 Updated last month
- ☆25 Updated 11 months ago
- Official implementation for "Android in the Zoo: Chain-of-Action-Thought for GUI Agents" (Findings of EMNLP 2024) ☆83 Updated 6 months ago
- This repository contains the open-source version of the datasets that were used for different parts of training and testing of models that grou… ☆32 Updated 4 years ago
- Official repository for ICML 2024 paper "On Prompt-Driven Safeguarding for Large Language Models" ☆90 Updated 7 months ago
- ☆62 Updated 3 months ago
- Consists of ~500k human annotations on the RICO dataset identifying various icons based on their shapes and semantics, and associations b… ☆27 Updated 9 months ago
- The model, data and code for the visual GUI Agent SeeClick ☆360 Updated 4 months ago
- The dataset includes screen summaries that describe the functionalities of Android app screenshots. It is used for training and evaluation of … ☆55 Updated 3 years ago
- R-Judge: Benchmarking Safety Risk Awareness for LLM Agents (EMNLP Findings 2024) ☆74 Updated last week
- FeatureAlignment = Alignment + Mechanistic Interpretability ☆28 Updated last month
- LlamaTouch: A Faithful and Scalable Testbed for Mobile UI Task Automation ☆56 Updated 8 months ago
- ☆93 Updated last year
- GitHub page for "Large Language Model-Brained GUI Agents: A Survey" ☆144 Updated 2 weeks ago
- A Universal Platform for Training and Evaluation of Mobile Interaction ☆44 Updated last month
- GUI Odyssey is a comprehensive dataset for training and evaluating cross-app navigation agents. GUI Odyssey consists of 7,735 episodes fr… ☆104 Updated 5 months ago
- [ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts" ☆67 Updated last year
- [ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following ☆124 Updated 9 months ago
- GUICourse: From General Vision Language Models to Versatile GUI Agents ☆107 Updated 9 months ago
- ☆26 Updated 9 months ago
- [ACL2024] Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios ☆53 Updated last year
- Repo for paper "Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents" ☆49 Updated last year
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference) ☆57 Updated 6 months ago