Mryangkaitong / deepseek-r1-gsm8k
☆47 · Updated 5 months ago
Alternatives and similar repositories for deepseek-r1-gsm8k
Users who are interested in deepseek-r1-gsm8k are comparing it to the repositories listed below.
- The related works and background techniques about OpenAI o1 · ☆224 · Updated 6 months ago
- ☆83 · Updated last year
- [ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning". · ☆125 · Updated 9 months ago
- ☆102 · Updated 2 months ago
- Fantastic Data Engineering for Large Language Models · ☆89 · Updated 7 months ago
- ☆252 · Updated 3 weeks ago
- The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning. · ☆275 · Updated 3 weeks ago
- OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning · ☆146 · Updated 7 months ago
- A comprehensive collection of process reward models. · ☆96 · Updated 2 weeks ago
- This is an implementation for the paper "Improve Mathematical Reasoning in Language Models by Automated Process Supervision" from Google De… · ☆36 · Updated 3 weeks ago
- Official code for the paper "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning" · ☆132 · Updated 2 weeks ago
- Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs" · ☆376 · Updated 6 months ago