ShwStone / TRex-PPO
Run TRex with PPO
☆39Updated 7 months ago
Alternatives and similar repositories for TRex-PPO
Users that are interested in TRex-PPO are comparing it to the libraries listed below
- [ICLR 2025 Oral] PyTorch code for the paper "Open-World Reinforcement Learning over Long Short-Term Imagination"☆188Updated 2 months ago
- ZO2 (Zeroth-Order Offloading): Full Parameter Fine-Tuning 175B LLMs with 18GB GPU Memory [COLM2025]☆198Updated 5 months ago
- llm & rl☆266Updated 2 months ago
- Training VLM agents with multi-turn reinforcement learning☆365Updated last week
- Chinese version of Awesome_CV; clone this project to Overleaf to write your own CV with ease☆14Updated last year
- siiRL: Shanghai Innovation Institute RL Framework for Advanced LLMs and Multi-Agent Systems☆327Updated this week
- A visualization tool for deeper understanding and easier debugging of RLHF training.☆278Updated 10 months ago
- ☆409Updated 11 months ago
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme☆146Updated 9 months ago
- ☆133Updated last year
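- An efficient, fast arXiv paper crawler that downloads information on papers within a specified time range, on specified topics, and containing specified keywords to local storage, and translates their titles and abstracts into Chinese.☆169Updated last year
- A reproduction of open-r1 that trains 0.5B, 1.5B, 3B, and 7B qwen models with GRPO and observes some interesting phenomena.☆54Updated 8 months ago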
- An Easy-to-use, Scalable and High-performance RLHF Framework designed for Multimodal Models.☆154Updated 3 months ago
- ☆104Updated last month
- ☆118Updated 9 months ago
- This repository provides a comprehensive library for parallel training and LoRA algorithm implementations, supporting multiple parallel s…☆52Updated last month
- ☆480Updated 3 months ago
- Official Repo for Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning☆404Updated last year
- The development and future prospects of large multimodal reasoning models.☆568Updated 5 months ago
- Open Platform for Embodied Agents☆336Updated 11 months ago
- Official Repository of "Learning to Reason under Off-Policy Guidance"☆395Updated 3 months ago
- Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks☆185Updated 3 months ago
- The official implementation of the paper "Mem-α: Learning Memory Construction via Reinforcement Learning"☆141Updated 2 weeks ago
- Curation of resources for LLM research, screened by @tongyx361 to ensure high quality and accompanied with elaborately-written concise de…☆63Updated last year
- 青稞Talk☆181Updated last week
- MiroMind-M1 is a fully open-source series of reasoning language models built on Qwen-2.5, focused on advancing mathematical reasoning.☆247Updated 4 months ago
- A repo showcasing the use of MCTS with LLMs to solve gsm8k problems☆94Updated last month
- ☆185Updated last week
- Qwen2.5 0.5B GRPO☆75Updated 10 months ago
- Use clash on Linux without sudo privileges☆165Updated last year