heaplax / ARMAP
☆25 Updated 7 months ago
Alternatives and similar repositories for ARMAP
Users that are interested in ARMAP are comparing it to the libraries listed below
- Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay ☆142 Updated 7 months ago
- [ICLR 2024] Trajectory-as-Exemplar Prompting with Memory for Computer Control ☆65 Updated 2 weeks ago
- ☆66 Updated 10 months ago
- Natural Language Reinforcement Learning ☆101 Updated 5 months ago
- B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners ☆85 Updated 8 months ago
- Training VLM agents with multi-turn reinforcement learning ☆378 Updated this week
- Code for "Reasoning to Learn from Latent Thoughts" ☆124 Updated 9 months ago
- ☆118 Updated 9 months ago
- An Illusion of Progress? Assessing the Current State of Web Agents ☆139 Updated 2 weeks ago
- ☆117 Updated last year
- ☆50 Updated 11 months ago
- Verlog: A Multi-turn RL framework for LLM agents ☆67 Updated this week
- Official implementation of the paper "Process Reward Model with Q-value Rankings" ☆65 Updated 11 months ago
- This repository is maintained to release the dataset and models for multimodal puzzle reasoning. ☆113 Updated 10 months ago
- ☆130 Updated last month
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization ☆51 Updated 6 months ago
- ☆64 Updated 2 months ago
- AdaRFT: Efficient Reinforcement Finetuning via Adaptive Curriculum Learning ☆51 Updated 7 months ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference) ☆160 Updated last year
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024] ☆147 Updated last year
- ☆128 Updated last week
- Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks ☆254 Updated 8 months ago
- [ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning ☆70 Updated 6 months ago
- ☆215 Updated 7 months ago
- [NeurIPS 2024] Official Implementation for Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks ☆92 Updated 7 months ago
- ☆348 Updated 5 months ago
- ☆51 Updated 8 months ago
- ☆122 Updated 3 months ago
- Discriminative Constrained Optimization for Reinforcing Large Reasoning Models ☆50 Updated 2 months ago
- ☆70 Updated 7 months ago