ShwStone / TRex-PPO
Run TRex with PPO
☆39Updated 3 months ago
Alternatives and similar repositories for TRex-PPO
Users who are interested in TRex-PPO are comparing it to the libraries listed below.
- [ICLR 2025 Oral] PyTorch code for the paper "Open-World Reinforcement Learning over Long Short-Term Imagination"☆154Updated 2 months ago
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme☆138Updated 4 months ago
- llm & rl☆198Updated last week
- Chinese version of Awesome_CV; clone this project to Overleaf to easily write your own CV☆12Updated last year
- A curated list of visual reinforcement learning resources☆369Updated 2 months ago
- ☆366Updated 6 months ago
- A visualization tool for deeper understanding and easier debugging of RLHF training.☆246Updated 6 months ago
- ☆208Updated last week
- MLLM @ Game☆14Updated 3 months ago
- A Telegram bot to recommend arXiv papers☆280Updated 4 months ago
- ☆198Updated 4 months ago
- An Easy-to-use, Scalable and High-performance RLHF Framework designed for Multimodal Models.☆141Updated 4 months ago
- 📖 This is a repository for organizing papers, codes and other resources related to Visual Reinforcement Learning.☆213Updated this week
- Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks☆163Updated 3 months ago
- MiroRL is an MCP-first reinforcement learning framework for deep research agents.☆141Updated this week
- ☆361Updated 2 weeks ago
- Open Platform for Embodied Agents☆326Updated 7 months ago
- ☆40Updated 3 months ago
- This is a repo for showcasing using MCTS with LLMs to solve gsm8k problems☆87Updated 5 months ago
- minimal-cost for training 0.5B R1-Zero☆765Updated 3 months ago
- modern AI for beginners☆153Updated last week
- DeepSpeed tutorials & annotated examples & study notes (efficient training of large models)☆176Updated last year
- ZO2 (Zeroth-Order Offloading): Full Parameter Fine-Tuning 175B LLMs with 18GB GPU Memory☆173Updated last month
- Collected the world's best computer vision labs and lecture materials.☆14Updated 6 months ago
- This repository provides a comprehensive library for parallel training and LoRA algorithm implementations, supporting multiple parallel s…☆48Updated this week
- MiroMind-M1 is a fully open-source series of reasoning language models built on Qwen-2.5, focused on advancing mathematical reasoning.☆223Updated 2 weeks ago
- A reproduction of open-r1: GRPO training on 0.5B, 1.5B, 3B, and 7B Qwen models, with some interesting phenomena observed.☆42Updated 4 months ago
- A reconstruction of RL Introduction and its course materials for a more efficient entry☆14Updated 3 months ago
- ☆102Updated 11 months ago
- ☆110Updated 4 months ago