chunhuizhang / llm_rl
llm & rl
☆112 Updated last week
Alternatives and similar repositories for llm_rl:
Users who are interested in llm_rl are comparing it to the repositories listed below
- Official code for the paper "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning" ☆112 Updated 2 weeks ago
- Latest Advances on Long Chain-of-Thought Reasoning ☆273 Updated 3 weeks ago
- Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains… ☆203 Updated last week
- OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning