☆55Feb 19, 2025Updated last year
Alternatives and similar repositories for agent_prm
Users that are interested in agent_prm are comparing it to the libraries listed below.
- Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks☆268May 5, 2025Updated last year
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference)☆163Oct 30, 2024Updated last year
- The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning (NeurIPS 2022)☆16Feb 11, 2023Updated 3 years ago
- Using Vrep to simulate a six-legged robot to do motion planning & path planning☆10Jan 10, 2019Updated 7 years ago
- [NAACL 2025] The official implementation of paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language M…☆28Mar 14, 2024Updated 2 years ago
- [TMLR] Process Reward Models That Think☆89Nov 29, 2025Updated 6 months ago
- Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation".☆30Aug 9, 2025Updated 9 months ago
- Agentic Virtual Lab☆19Nov 30, 2025Updated 5 months ago
- Smart home Agent with Grounded Execution☆31Jul 22, 2024Updated last year
- ☆43May 10, 2026Updated 2 weeks ago
- Code repo for "Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning"☆33Jul 25, 2025Updated 10 months ago
- a benchmark to evaluate the situated inductive reasoning☆15Jan 7, 2025Updated last year
- Resources and paper list for 'Scaling Environments for Agents'. This repository accompanies our survey on how environments contribute to …☆68Jan 28, 2026Updated 4 months ago
- official implementation of paper "Process Reward Model with Q-value Rankings"☆69Feb 5, 2025Updated last year
- ☆25Jun 10, 2025Updated 11 months ago
- Demonstrating the BadAss issue.☆17May 19, 2025Updated last year
- code for our paper "Understanding by Understanding Not: Modeling Negation in Language Models"☆16Aug 15, 2022Updated 3 years ago
- [ICLR'26] Stronger-MAS: A RL Framework for multi LLM agent system; [arxiv] MetaAgent-X: End-to-End Reinforcement Learning Automatic Mult…☆173May 15, 2026Updated 2 weeks ago
- ☆35May 24, 2025Updated last year
- This repository contains Reinforcement Learning (RL) environments for the Upkie robot.☆30Updated this week
- ☆20Feb 25, 2026Updated 3 months ago
- ☆11Oct 3, 2022Updated 3 years ago
- VS-Bench: Evaluating VLMs for Strategic Reasoning and Decision-Making in Multi-Agent Environments☆23Sep 30, 2025Updated 7 months ago
- This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning"☆73Apr 22, 2025Updated last year
- ☆23Sep 19, 2024Updated last year
- ☆50Oct 28, 2024Updated last year
- Differentiable non-uniform interpolation: https://arxiv.org/abs/2012.13257☆11Oct 3, 2021Updated 4 years ago
- ☆10Feb 18, 2020Updated 6 years ago
- Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path Supervision☆19Apr 1, 2025Updated last year
- [ACL 2025] RuleArena: A Benchmark for Rule-Guided Reasoning with LLMs in Real-World Scenarios☆26Jul 2, 2025Updated 10 months ago
- Training VLM agents with multi-turn reinforcement learning☆461May 11, 2026Updated 2 weeks ago
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL"☆205Apr 17, 2025Updated last year
- ☆28Jul 18, 2025Updated 10 months ago
- IAN: An Intelligent System for Omics Data Analysis and Discovery☆10Feb 23, 2026Updated 3 months ago
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference)☆66Oct 18, 2024Updated last year
- Resources for our paper: "Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training"☆170Oct 20, 2025Updated 7 months ago
- ☆22Jul 9, 2025Updated 10 months ago
- Regularly Truncated M-estimators for Learning with Noisy Labels☆12Apr 24, 2024Updated 2 years ago
- CVPR 2026 - MSGNav: Unleashing the Power of Multi-modal 3D Scene Graph for Zero-Shot Embodied Navigation☆53Mar 23, 2026Updated 2 months ago