hengzzzhou / ReSo
☆14 Updated 3 months ago
Alternatives and similar repositories for ReSo
Users who are interested in ReSo are comparing it to the libraries listed below
- PreAct: Prediction Enhances Agent's Planning Ability (Coling2025) ☆28 Updated 7 months ago
- A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models ☆24 Updated 7 months ago
- Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025. ☆22 Updated 5 months ago
- ☆47 Updated 4 months ago
- ☆46 Updated 2 months ago
- Natural Language Reinforcement Learning ☆90 Updated 6 months ago
- B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners ☆82 Updated last month
- Reinforced Multi-LLM Agents training ☆30 Updated last month
- ☆47 Updated 5 months ago
- ☆21 Updated 2 months ago
- ☆48 Updated last month
- This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning" ☆67 Updated 2 months ago
- ☆48 Updated last month
- ☆41 Updated 8 months ago
- Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning". ☆79 Updated last month
- WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning ☆27 Updated last month
- Official Repository of LatentSeek ☆51 Updated last month
- The official repo for "AceCoder: Acing Coder RL via Automated Test-Case Synthesis" [ACL25] ☆88 Updated 3 months ago
- SIFT: Grounding LLM Reasoning in Contexts via Stickers ☆57 Updated 4 months ago
- Official implementation of the paper "Process Reward Model with Q-value Rankings" ☆60 Updated 5 months ago
- [ACL 2024] PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain ☆105 Updated last year
- Unsupervised GRPO ☆38 Updated last month
- Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation". ☆18 Updated last month
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework ☆64 Updated last month
- [NAACL 2025] Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs ☆24 Updated 9 months ago
- The official repository of "SmartAgent: Chain-of-User-Thought for Embodied Personalized Agent in Cyber World". ☆28 Updated 3 months ago
- MARFT stands for Multi-Agent Reinforcement Fine-Tuning. This repository implements an LLM-based multi-agent reinforcement fine-tuning fra… ☆49 Updated last month
- This is the repo for our paper "Mr-Ben: A Comprehensive Meta-Reasoning Benchmark for Large Language Models" ☆50 Updated 8 months ago
- Official repository for "RLVR-World: Training World Models with Reinforcement Learning", https://arxiv.org/abs/2505.13934 ☆59 Updated last month
- ☆46 Updated 8 months ago