ShwStone / TRex-PPO
Run TRex with PPO
☆26Updated this week
Alternatives and similar repositories for TRex-PPO
Users who are interested in TRex-PPO are comparing it to the libraries listed below.
- [ICLR 2025 Oral] PyTorch code for the paper "Open-World Reinforcement Learning over Long Short-Term Imagination"☆112Updated last week
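- An efficient, fast arXiv paper crawler that saves information on papers within a specified time range, on specified topics, and containing specified keywords to local storage, and translates their titles and abstracts into Chinese.☆102Updated 8 months ago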
- ☆136Updated this week
- Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks☆104Updated last week
- A Telegram bot to recommend arXiv papers☆270Updated last month
- ICLR 2025 Agent-Related Papers☆70Updated 6 months ago
- ☆184Updated last month
- A vue-based project page template for academic papers. (in development) https://junyaohu.github.io/academic-project-page-template-vue☆258Updated 3 weeks ago
- Official Code for "Coser: Coordinating LLM-Based Persona Simulation of Established Roles"☆77Updated last month
- Sharing my research toolchain☆83Updated last year
- MLLM @ Game☆14Updated last week
- This is a repo for showcasing using MCTS with LLMs to solve gsm8k problems☆77Updated last month
- [NeurIPS 2024] Official Implementation for Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks☆73Updated last month
- Open Platform for Embodied Agents☆317Updated 4 months ago
- Official Repository of "Learning to Reason under Off-Policy Guidance"☆173Updated last week
- An open-source lightweight game generation paradigm. It includes everything from data processing to model architecture design and playabi…☆86Updated 4 months ago
- Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning☆176Updated last month
- ☆37Updated this week
- ☆102Updated last month
- [NeurIPSw'24] This repo is the official implementation of "MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simu…☆89Updated 3 months ago
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme☆122Updated last month
- This tool will daily crawl https://arxiv.org and use LLMs to summarize them.☆86Updated this week
- A research repo for experiments about Reinforcement Finetuning☆46Updated last month
- [CVPR2024] This is the official implementation of MP5☆101Updated 10 months ago
- Awesome RL Reasoning Recipes ("Triple R")☆544Updated last week
- Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".☆72Updated 3 weeks ago
- A comprehensive collection of process reward models.☆76Updated last week
- An Easy-to-use, Scalable and High-performance RLHF Framework designed for Multimodal Models.☆121Updated last month
- ☆76Updated last month
- An ML research template with good documentation by Boyuan Chen, an MIT PhD student☆71Updated 2 months ago