YuxiangChai / A3
☆32 Updated 3 months ago
Alternatives and similar repositories for A3
Users that are interested in A3 are comparing it to the libraries listed below
- ☆69 Updated 4 months ago
- (ICLR 2025) The Official Code Repository for GUI-World. ☆66 Updated 10 months ago
- ZeroGUI: Automating Online GUI Learning at Zero Human Cost ☆92 Updated 3 months ago
- [NeurIPS 2025 Spotlight] Official repository for "Web-Shepherd: Advancing PRMs for Reinforcing Web Agents" ☆47 Updated 4 months ago
- HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models ☆52 Updated 10 months ago
- SIFT: Grounding LLM Reasoning in Contexts via Stickers ☆58 Updated 7 months ago
- [ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis ☆163 Updated last week
- Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay ☆130 Updated 4 months ago
- [NeurIPS 2025 Spotlight] Scaling Computer-Use Grounding via UI Decomposition and Synthesis ☆115 Updated 4 months ago
- [ACL2025 Findings] Benchmarking Multihop Multimodal Internet Agents ☆46 Updated 7 months ago
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?" ☆59 Updated last year
- ☆62 Updated 4 months ago
- ☆31 Updated last year
- Codes for our paper "AgentMonitor: A Plug-and-Play Framework for Predictive and Secure Multi-Agent Systems" ☆12 Updated 10 months ago
- SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution ☆89 Updated 3 weeks ago
- [ACL 2025] Agentic Knowledgeable Self-awareness ☆85 Updated 4 months ago
- [ACL 2025] Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆106 Updated 4 months ago
- ☆99 Updated last week
- JudgeLRM: Large Reasoning Models as a Judge ☆39 Updated last month
- Towards Large Multimodal Models as Visual Foundation Agents ☆239 Updated 5 months ago
- [ACL'25] We propose a novel fine-tuning method, Separate Memory and Reasoning, which combines prompt tuning with LoRA. ☆76 Updated last month
- [NeurIPS 2024 D&B Track] GTA: A Benchmark for General Tool Agents ☆126 Updated 6 months ago
- [ACL 2025] An inference-time decoding strategy with adaptive foresight sampling ☆104 Updated 5 months ago
- [NeurIPS 2025 Spotlight] ReasonFlux-Coder: Open-Source LLM Coders with Co-Evolving Reinforcement Learning ☆125 Updated last month
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms" ☆131 Updated last year
- R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning ☆65 Updated 4 months ago
- ☆43 Updated last week
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories ☆37 Updated 2 months ago
- official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning" ☆36 Updated 8 months ago
- The official repo of SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond ☆170 Updated 3 months ago