chunhuizhang / llm_rlLinks

llm & rl

☆256

Alternatives and similar repositories for llm_rl

Users that are interested in llm_rl are comparing it to the libraries listed below

Sorting:

yuanzhoulvpi2017 / nano_rl
在verl上做reward的定制开发
☆130Updated 6 months ago
LightChen233 / Awesome-Long-Chain-of-Thought-Reasoning
Latest Advances on Long Chain-of-Thought Reasoning
☆564Updated 4 months ago
0russwest0 / Awesome-Agent-RL
☆438Updated last month
qiancheng0 / ToolRL
☆390Updated last month
RethinkFun / trian_ppo
☆126Updated last year
thinkwee / AgentsMeetRL
Awesome List for Agentic RL
☆571Updated last week
CJReinforce / PURE
Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning"
☆142Updated last month
GAIR-NLP / cognition-engineering
Generative AI Act II: Test Time Scaling Drives Cognition Engineering
☆209Updated 7 months ago
0russwest0 / Agent-R1
Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning
☆1,000Updated last week
modelscope / Trinity-RFT
Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (…
☆422Updated this week
RUC-NLPIR / Tool-Star
🔧Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning
☆291Updated last month
ElliottYan / LUFFY
Official Repository of "Learning to Reason under Off-Policy Guidance"
☆380Updated 2 months ago
lqtrung1998 / mwp_ReFT
☆551Updated 11 months ago
yuanzhoulvpi2017 / vscode_debug_transformers
☆399Updated 9 months ago
pengr / LLM-Synthetic-Data
A live reading list for LLM data synthesis (Updated to July, 2025).
☆418Updated 3 months ago
RUC-NLPIR / ARPO
The official code of ARPO & AEPO
☆806Updated 2 weeks ago
liujunwen23 / MIRE
WWW2025 Multimodal Intent Recognition for Dialogue Systems Challenge
☆127Updated last year
GAIR-NLP / ToRL
☆316Updated 6 months ago
hscspring / rl-llm-nlp
Reinforcement Learning in LLM and NLP.
☆61Updated 2 months ago
Eclipsess / Awesome-Efficient-Reasoning-LLMs
[TMLR 2025] Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models
☆701Updated last month
RUCAIBox / Slow_Thinking_with_LLMs
A series of technical report on Slow Thinking with LLM
☆748Updated 3 months ago
lansinuote / Simple_RLHF
☆104Updated 5 months ago
RethinkFun / LLM
☆118Updated last year
wjn1996 / Awesome-LLM-Reasoning-Openai-o1-Survey
The related works and background techniques about Openai o1
☆221Updated 10 months ago
XiaoYee / Awesome_Efficient_LRM_Reasoning
😎 A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, Agent, and Beyond
☆318Updated last month
owenliang / qwen2.5-0.5b-grpo
Qwen2.5 0.5B GRPO
☆71Updated 9 months ago
RyanLiu112 / Awesome-Process-Reward-Models
A comprehensive collection of process reward models.
☆122Updated 2 months ago
HarderThenHarder / RLLoggingBoard
A visuailzation tool to make deep understaning and easier debugging for RLHF training.
☆271Updated 9 months ago
Ablustrund / LoRAMoE
LoRAMoE: Revolutionizing Mixture of Experts for Maintaining World Knowledge in Language Model Alignment
☆386Updated last year
chunhuizhang / personal_chatgpt
personal chatgpt
☆391Updated 11 months ago