kyegomez / OpenR1
An open source implementation of R1
☆16 · Updated 2 weeks ago
Alternatives and similar repositories for OpenR1:
Users who are interested in OpenR1 are comparing it to the libraries listed below.
- Official repository for the paper "BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Compet… ☆18 · Updated 7 months ago
- Minimal implementation of the paper "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models" (arXiv:2401.01335) ☆29 · Updated last year
- Code implementation, evaluations, documentation, links, and resources for the Min P paper ☆28 · Updated 2 weeks ago
- ☆94 · Updated 3 months ago
- Official implementation for "ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization" ☆59 · Updated last month
- ☆29 · Updated 4 months ago
- ☆44 · Updated 3 months ago
- A paper list of multilingual pre-trained models (continually updated). ☆20 · Updated 9 months ago
- ☆33 · Updated 3 weeks ago
- ☆56 · Updated 6 months ago
- ☆54 · Updated 5 months ago
- ☆20 · Updated 4 months ago
- Fast LLM training codebase with dynamic strategy selection [DeepSpeed + Megatron + FlashAttention + CUDA fusion kernel + compiler] ☆36 · Updated last year
- o1 Chain of Thought Examples ☆33 · Updated 5 months ago
- A repository for research on medium sized language models. ☆76 · Updated 10 months ago
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization ☆31 · Updated last month
- Official Repository of "Are Your LLMs Capable of Stable Reasoning?" ☆23 · Updated 2 weeks ago
- Implementation of the paper: "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO… ☆54 · Updated last week
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data" ☆46 · Updated last year
- ☆16 · Updated 5 months ago
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper ☆30 · Updated 10 months ago
- Code and Data for Our NeurIPS 2024 paper "AMOR: A Recipe for Building Adaptable Modular Knowledge Agents Through Process Feedback" ☆30 · Updated 4 months ago
- Learning to Retrieve by Trying - Source code for Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval ☆32 · Updated 5 months ago
- ☆36 · Updated 6 months ago
- ☆68 · Updated this week
- [preprint] We propose a novel fine-tuning method, Separate Memory and Reasoning, which combines prompt tuning with LoRA. ☆43 · Updated 3 months ago
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms" ☆86 · Updated 5 months ago
- ☆81 · Updated 11 months ago
- Source code for GreaTer - Gradient Over Reasoning makes Smaller Language Models Strong Prompt Optimizers ☆17 · Updated last month
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer ☆41 · Updated last year