☆54Feb 19, 2025Updated last year
Alternatives and similar repositories for agent_prm
Users that are interested in agent_prm are comparing it to the libraries listed below.
- Benchmark and research code for the paper SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks☆266May 5, 2025Updated 11 months ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference)☆161Oct 30, 2024Updated last year
- The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning (NeurIPS 2022)☆16Feb 11, 2023Updated 3 years ago
- ☆21Aug 30, 2025Updated 7 months ago
- Using Vrep to simulate a six-legged robot to do motion planning & path planning☆10Jan 10, 2019Updated 7 years ago
- [NAACL 2025] The official implementation of paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language M…☆28Mar 14, 2024Updated 2 years ago
- [TMLR] Process Reward Models That Think☆84Nov 29, 2025Updated 4 months ago
- ☆40Mar 22, 2026Updated 3 weeks ago
- Smart home Agent with Grounded Execution☆28Jul 22, 2024Updated last year
- Code repo for "Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning"☆33Jul 25, 2025Updated 8 months ago
- A benchmark for evaluating situated inductive reasoning☆15Jan 7, 2025Updated last year
- Resources and paper list for 'Scaling Environments for Agents'. This repository accompanies our survey on how environments contribute to …☆65Jan 28, 2026Updated 2 months ago
- official implementation of paper "Process Reward Model with Q-value Rankings"☆68Feb 5, 2025Updated last year
- [NAACL'25] Evaluating LLMs for Causal Queries☆13Feb 18, 2025Updated last year
- ☆25Jun 10, 2025Updated 10 months ago
- This repository contains Reinforcement Learning (RL) environments for the Upkie robot.☆28Mar 11, 2026Updated last month
- code for our paper "Understanding by Understanding Not: Modeling Negation in Language Models"☆16Aug 15, 2022Updated 3 years ago
- ☆35May 29, 2025Updated 10 months ago
- ☆35May 24, 2025Updated 10 months ago
- ☆85Apr 9, 2026Updated last week
- ☆11Oct 3, 2022Updated 3 years ago
- A modified LLaVA framework for MOSS2 that makes MOSS2 a multimodal model.☆13Sep 19, 2024Updated last year
- ☆23Sep 19, 2024Updated last year
- Differentiable non-uniform interpolation: https://arxiv.org/abs/2012.13257☆11Oct 3, 2021Updated 4 years ago
- ☆51Oct 28, 2024Updated last year
- Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path Supervision☆19Apr 1, 2025Updated last year
- Training VLM agents with multi-turn reinforcement learning☆444Updated this week
- ICCV'2023: Combating Noisy Labels with Sample Selection by Mining High-Discrepancy Examples☆12Oct 16, 2023Updated 2 years ago
- [ACL 2025] RuleArena: A Benchmark for Rule-Guided Reasoning with LLMs in Real-World Scenarios☆26Jul 2, 2025Updated 9 months ago
- Implementation of KDR-Agent, the AAAI 2025 accepted paper, focusing on knowledge-driven reasoning for autonomous agents.☆18Nov 24, 2025Updated 4 months ago
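- Seat-grabbing program for the Wuhan University Information Science Library (信图); supports continuous background monitoring, grabbing window seats and seats with computers, and automatic shutdown once a seat is secured☆15Dec 8, 2022Updated 3 years ago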
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL"☆202Apr 17, 2025Updated 11 months ago
- VehicleWorld is the first comprehensive multi-device environment for intelligent vehicle interaction that accurately models the complex, …☆21Sep 16, 2025Updated 7 months ago
- ⚓️ Repository for the "Thought Anchors: Which LLM Reasoning Steps Matter?" paper.☆118Oct 27, 2025Updated 5 months ago
- ☆27Jul 18, 2025Updated 8 months ago
- IAN: An Intelligent System for Omics Data Analysis and Discovery☆10Feb 23, 2026Updated last month
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference)☆66Oct 18, 2024Updated last year
- TransVCL: Attention-enhanced Video Copy Localization Network with Flexible Supervision [AAAI2023 Oral]☆58Feb 25, 2023Updated 3 years ago
- Resources for our paper: "Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training"☆171Oct 20, 2025Updated 5 months ago