ZJU-REAL / gui-rcpo
Code for Paper: Test-Time Reinforcement Learning for GUI Grounding via Region Consistency
☆42Updated last month
Alternatives and similar repositories for gui-rcpo
Users who are interested in gui-rcpo are comparing it to the libraries listed below.
- ☆123Updated last month
- Code for Let LLMs Break Free from Overthinking via Self-Braking Tuning. https://arxiv.org/abs/2505.14604☆48Updated last month
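- The simplest reproduction of R1 results on small models, illustrating the most important essence of O1-like models and DeepSeek R1. Think is all your need. Uses experiments to support the claim that, for strong reasoning ability, the procedural content of the thinking process is the core of AGI/ASI.☆44Updated 7 months ago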
- ☆30Updated 2 months ago
- ☆27Updated 3 weeks ago
- ☆35Updated this week
- The open-source code of MetaStone-S1.☆108Updated last month
- XVERSE-MoE-A36B: A multilingual large language model developed by XVERSE Technology Inc.☆38Updated last year
- Official Code for PosterGen☆131Updated last week
- Collection of model-centric MCP servers☆23Updated 3 months ago
- ☆14Updated last year
- ☆18Updated 5 months ago
- [ICCV2025] WikiAutoGen official page☆17Updated 2 months ago
- ☆34Updated 7 months ago
- Code for Paper: Harnessing Webpage UIs for Text-Rich Visual Understanding☆53Updated 9 months ago
- Official implementation for "ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization"☆83Updated 3 months ago
- Efficient Agent Training for Computer Use☆131Updated 2 weeks ago
- [ICML 2025] TokenSwift: Lossless Acceleration of Ultra Long Sequence Generation☆113Updated 4 months ago
- ☆129Updated 2 weeks ago
- Code for "From Ideal to Real: Unified and Data-Efficient Dense Prediction for Real-World Scenarios"☆26Updated 2 months ago
- ☆67Updated 5 months ago
- This is the code repo for our paper "Benchmarking Retrieval-Augmented Generation in Multi-Modal Contexts".☆39Updated last month
- Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs☆90Updated 10 months ago
- Code for Paper InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models☆35Updated 2 months ago
- ☆37Updated 9 months ago
- ☆81Updated 5 months ago
- ☆23Updated 3 months ago
- [ACL2025 Findings] Benchmarking Multihop Multimodal Internet Agents☆46Updated 6 months ago
- Official implementation of "PyVision: Agentic Vision with Dynamic Tooling."☆125Updated last month
- [NeurIPS 2024] Official Implementation for Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks☆84Updated 3 months ago