deepseek-ai / DeepSeek-Prover-V2
☆963 · Updated last week
Alternatives and similar repositories for DeepSeek-Prover-V2
Users who are interested in DeepSeek-Prover-V2 are comparing it to the repositories listed below:
- Releases from OpenAI Preparedness · ☆729 · Updated last month
- ☆516 · Updated 8 months ago
- Muon is Scalable for LLM Training · ☆1,043 · Updated last month
- Democratizing Reinforcement Learning for LLMs · ☆3,210 · Updated last month
- MoBA: Mixture of Block Attention for Long-Context LLMs · ☆1,771 · Updated last month
- Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL · ☆2,196 · Updated this week
- Dream 7B, a large diffusion language model · ☆622 · Updated last week
- Technical report of Kimina-Prover Preview. · ☆278 · Updated this week
- Pretraining code for a large-scale depth-recurrent language model · ☆756 · Updated 3 weeks ago
- Understanding R1-Zero-Like Training: A Critical Perspective · ☆915 · Updated 3 weeks ago
- Textbook on reinforcement learning from human feedback · ☆883 · Updated this week
- RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments. · ☆1,772 · Updated this week
- Training Large Language Model to Reason in a Continuous Latent Space · ☆1,104 · Updated 3 months ago
- Unleashing the Power of Reinforcement Learning for Math and Code Reasoners · ☆547 · Updated 2 weeks ago
- Atom of Thoughts for Markov LLM Test-Time Scaling · ☆562 · Updated this week
- ☆524 · Updated 3 weeks ago
- Scalable RL solution for advanced reasoning of language models · ☆1,537 · Updated last month
- Implementing DeepSeek R1's GRPO algorithm from scratch · ☆1,300 · Updated 3 weeks ago
- ☆3,332 · Updated 2 months ago
- LIMO: Less is More for Reasoning · ☆933 · Updated last month
- Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs" · ☆445 · Updated last month
- The official repo of MiniMax-Text-01 and MiniMax-VL-01, large-language-model & vision-language-model based on Linear Attention · ☆2,586 · Updated last month
- A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training. · ☆2,758 · Updated 2 months ago
- Simple RL training for reasoning · ☆3,540 · Updated last month
- ☆739 · Updated 3 weeks ago
- ☆430 · Updated 9 months ago
- DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models · ☆2,689 · Updated last year
- Official Repo for Open-Reasoner-Zero · ☆1,912 · Updated last month
- Official PyTorch implementation for "Large Language Diffusion Models" · ☆1,576 · Updated last week
- Analyze computation-communication overlap in V3/R1. · ☆1,018 · Updated last month