changyeyu / LLM-RL-Visualized
LLM, RL, DPO, SFT, Distillation, Alignment. Initiated by the author of the book 📘 "Large Model Algorithms".
☆44Updated 2 weeks ago
Alternatives and similar repositories for LLM-RL-Visualized
Users who are interested in LLM-RL-Visualized are comparing it to the repositories listed below
- ☆15Updated 7 months ago
- ☆78Updated 8 months ago
- llm & rl☆139Updated this week
- ⭐️ Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning.☆157Updated 2 weeks ago
- Notes on learning reinforcement learning through animations☆53Updated 3 months ago
- ☆83Updated last month
- A curated list of visual reinforcement learning resources☆282Updated 2 weeks ago
- A comprehensive collection of process reward models.☆88Updated 2 weeks ago
- Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning"☆123Updated this week
- Train your grpo with zero dataset and low resources, 8bit/4bit/lora/qlora supported, multi-gpu supported ...☆72Updated last month
- Train a LLaVA model with better Chinese support, with open-sourced training code and data.☆60Updated 9 months ago
- This is a repo for showcasing using MCTS with LLMs to solve gsm8k problems☆82Updated 2 months ago
- Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning☆180Updated 2 months ago
- [CVPR 2025] RoboBrain: A Unified Brain Model for Robotic Manipulation from Abstract to Concrete. Official Repository.☆220Updated this week
- ☆76Updated 9 months ago
- Run TRex with PPO☆38Updated 3 weeks ago
- Latest Advances on Embodied Multimodal LLMs (or Vision-Language-Action Models).☆114Updated 11 months ago
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme☆129Updated last month
- ☆329Updated 3 months ago
- WWW2025 Multimodal Intent Recognition for Dialogue Systems Challenge☆123Updated 6 months ago
- Notes mainly covering multimodal knowledge for large language model (LLM) algorithm (application) engineers☆198Updated last year
- [ICLR 2025 Oral] PyTorch code for the paper "Open-World Reinforcement Learning over Long Short-Term Imagination"☆121Updated this week
- An Easy-to-use, Scalable and High-performance RLHF Framework designed for Multimodal Models.☆127Updated 2 months ago
- ICLR 2025 Agent-Related Papers☆71Updated 6 months ago
- Official Repo for Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning☆363Updated 5 months ago
- DeepSpeed tutorial & annotated examples & study notes (efficient training of large models)☆164Updated last year
- ☆217Updated 2 weeks ago
- Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks☆122Updated last week
- ☆38Updated 2 months ago
- OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning☆141Updated 5 months ago