DistRL-lab / distrl-open
DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents
☆25 Updated last week
Alternatives and similar repositories for distrl-open
Users who are interested in distrl-open are comparing it to the repositories listed below.
- ☆18 Updated 2 months ago
- SPA-Bench: A Comprehensive Benchmark for SmartPhone Agent Evaluation ☆43 Updated last month
- Improving Math reasoning through Direct Preference Optimization with Verifiable Pairs ☆15 Updated 4 months ago
- Official implementation of the NeurIPS 2024 paper CORY ☆19 Updated 5 months ago
- ☆25 Updated 2 months ago
- [ICML 2025] "From Debate to Equilibrium: Belief-Driven Multi-Agent LLM Reasoning via Bayesian Nash Equilibrium" ☆17 Updated last month
- [NeurIPS 2023] Large Language Models Are Semi-Parametric Reinforcement Learning Agents ☆34 Updated last year
- Code release for "Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search" published at NeurIPS '24. ☆11 Updated 5 months ago
- Code for NeurIPS 2024 paper "Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs" ☆38 Updated 5 months ago
- Benchmarking LLMs' Gaming Ability in Multi-Agent Environments ☆85 Updated 3 months ago
- The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning. ☆287 Updated last month
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL" ☆185 Updated 3 months ago
- Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning" ☆133 Updated 3 weeks ago
- ☆21 Updated 2 weeks ago
- ☆14 Updated 10 months ago
- [ACL'24] Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization ☆85 Updated 11 months ago
- Rewarded soups official implementation ☆58 Updated last year
- Code for "UI-R1: Enhancing Efficient Action Prediction of GUI Agents by Reinforcement Learning" ☆123 Updated 2 months ago
- Direct preference optimization with f-divergences. ☆14 Updated 9 months ago
- Preprint: Asymmetry in Low-Rank Adapters of Foundation Models ☆35 Updated last year
- Implementation of the MATRIX framework (ICML 2024) ☆58 Updated last year
- An index of algorithms for reinforcement learning from human feedback (RLHF) ☆93 Updated last year
- Code for the ICML 2024 paper "Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment" ☆72 Updated 2 months ago
- ☆199 Updated last week
- Official Repository of "Learning to Reason under Off-Policy Guidance" ☆271 Updated 3 weeks ago
- This is the official implementation of paper "Leveraging Dual Process Theory in Language Agent Framework for Simultaneous Human-AI Collab… ☆39 Updated 2 months ago
- ☆21 Updated 3 weeks ago
- Evaluating Safety of Autonomous Agents in Mobile Device Control ☆25 Updated 7 months ago
- LLM-Based Human-Agent Collaboration and Interaction Systems: A Survey | Awesome Human-Agent Collaboration | Human-AI Collaboration ☆107 Updated 2 weeks ago
- A curated list of reinforcement learning with verifiable rewards (continually updated) ☆20 Updated 2 weeks ago