GUIEvalKit: Open-source Evaluation Toolkit for GUI Agents
☆19 · Feb 26, 2026 · Updated 2 months ago
Alternatives and similar repositories for guievalkit
Users that are interested in guievalkit are comparing it to the libraries listed below.
- [ICCV 2025 Highlight] Less is More: Empowering GUI Agent with Context-Aware Simplification ☆47 · Mar 12, 2026 · Updated last month
- ☆67 · Sep 6, 2025 · Updated 8 months ago
- Latest Papers, Codes and Datasets on VTG-LLMs. ☆89 · Nov 17, 2025 · Updated 5 months ago
- Mobile-Bench: An Evaluation Benchmark for LLM-based Mobile Agents ☆30 · Dec 6, 2024 · Updated last year
- Source code for Interpretable Reward Redistribution in Reinforcement Learning: A Causal Approach (NeurIPS 2023) ☆10 · Dec 12, 2023 · Updated 2 years ago
- Code for the paper "Minimum-Delay Adaptation in Non-Stationary Reinforcement Learning via Online High-Confidence Change-Point Detection" ☆11 · Aug 7, 2023 · Updated 2 years ago
- Official implementation of "TailorKV: A Hybrid Framework for Long-Context Inference via Tailored KV Cache Optimization" (Findings of ACL … ☆21 · Jul 25, 2025 · Updated 9 months ago
- This is the official repository of the paper "Atomic-to-Compositional Generalization for Mobile Agents with A New Benchmark and Schedulin… ☆14 · Jul 27, 2025 · Updated 9 months ago
- [NAACL 2024] CoE-SQL: In-Context Learning for Multi-Turn Text-to-SQL with Chain-of-Editions ☆13 · May 7, 2024 · Updated 2 years ago
- [CVPR 2026] HiconAgent: History Context-aware Policy Optimization for GUI Agents ☆28 · Mar 9, 2026 · Updated 2 months ago
- ☆17 · Oct 30, 2023 · Updated 2 years ago
- A simple visual test-time scaling method for GUI agent grounding ☆24 · Dec 7, 2025 · Updated 5 months ago
- Nex General Agentic Data Pipeline, an end-to-end pipeline for generating high-quality agentic training data. ☆33 · Nov 19, 2025 · Updated 5 months ago
- Open source code for paper "Learning World Models with Identifiable Factorization" ☆13 · Mar 4, 2024 · Updated 2 years ago
- Self-Supervised Dataset Distillation for Transfer Learning ☆18 · Apr 10, 2024 · Updated 2 years ago
- Notes from everyday reading ☆17 · Mar 29, 2021 · Updated 5 years ago
- The code for HerO: a fact-checking pipeline based on open LLMs (the runner-up in AVeriTeC) ☆15 · Mar 18, 2025 · Updated last year
- A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases ☆14 · Mar 22, 2022 · Updated 4 years ago
- LaTeX Proposal Template for the University of Chinese Academy of Sciences ☆18 · Oct 14, 2023 · Updated 2 years ago
- Official PyTorch implementation of "Multisize Dataset Condensation" (ICLR'24 Oral) ☆16 · Apr 18, 2024 · Updated 2 years ago
- [AAAI 2026] Test-Time Reinforcement Learning for GUI Grounding via Region Consistency https://arxiv.org/abs/2508.05615 ☆64 · Nov 8, 2025 · Updated 6 months ago
- AI Video Translator: uses AI to transcribe, translate, and then revoice a video into English in the original speaker's voice ☆19 · Jun 21, 2023 · Updated 2 years ago
- Tensorflow implementation of the `intelligent synapse' model from [Zenke et al., (2017)] and application to the Permuted MNIST benchmark. ☆22 · Aug 2, 2017 · Updated 8 years ago
- [AAAI2025 Oral] BiDeV: Bilateral Defusing Verification for Complex Claim Fact-Checking ☆15 · Apr 22, 2025 · Updated last year
- ☆75 · Nov 27, 2024 · Updated last year
- ☆17 · Oct 31, 2023 · Updated 2 years ago
- [ICCV 2025] Implementation of the paper "Q-Frame: Query-aware Frame Selection and Multi-Resolution Adaptation for Video-LLMs" ☆74 · Oct 25, 2025 · Updated 6 months ago
- ☆24 · Mar 26, 2025 · Updated last year
- ☆11 · Dec 4, 2023 · Updated 2 years ago
- collection for the common dataset in my research ☆32 · Feb 23, 2020 · Updated 6 years ago
- About Data and Codes for EMNLP 2023 System Demo Paper "QACHECK: A Demonstration System for Question-Guided Multi-Hop Fact-Checking" ☆19 · Dec 19, 2023 · Updated 2 years ago
- ☆24 · Jul 16, 2024 · Updated last year
- Use this template to create the repo for your osmos::feed reader. ☆28 · Feb 14, 2022 · Updated 4 years ago
- 🎭 Explore the magic of face swapping with "Painted Skin"! 🎭 Upload a source picture or video, select a target picture, and let the fun … ☆19 · Nov 9, 2023 · Updated 2 years ago
- Large Language Models (LLMs) of Code ☆20 · Apr 23, 2023 · Updated 3 years ago
- The source code of Paper "PathQG: Neural Question Generation from Facts". ☆23 · Jan 4, 2021 · Updated 5 years ago
- COOM: Benchmarking Continual Reinforcement Learning on Doom ☆24 · Mar 5, 2026 · Updated 2 months ago
- ☆16 · Mar 22, 2024 · Updated 2 years ago
- [WWW2024 Oral] Harnessing Multi-Role Capabilities of Large Language Models for Open-Domain Question Answering ☆15 · Apr 22, 2025 · Updated last year