Zeyi-Lin / easy-r1

Train a DeepSeek R1-like reasoning LLM with ease

☆17

Alternatives and similar repositories for easy-r1

Users who are interested in easy-r1 are comparing it to the libraries listed below.

inclusionAI / AWorld-RL
Agentic Learning Powered by AWorld
☆42 Updated this week
liangyuwang / zo2
ZO2 (Zeroth-Order Offloading): Full Parameter Fine-Tuning 175B LLMs with 18GB GPU Memory [COLM2025]
☆192 Updated 3 months ago
percent4 / llm_math_solver
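This project covers dataset synthesis, model training, and evaluation for the math problem-solving ability of large models, along with notes on related articles.
☆97 Updated last year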
StarRing2022 / R1-Nature
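The simplest reproduction of R1 results on small models, explaining the most important essence of O1-like models and DeepSeek R1. Think is all you need. Backed by experiments: for strong reasoning ability, the process content of "think" is the core of AGI/ASI.
☆44 Updated 9 months ago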
Alannikos / FunGPT
In this fast-paced world, we all need a little something to spice up life. Whether you need a glass of sweet talk to lift your spirits or…
☆60 Updated 5 months ago
Wangmerlyn / MCTS-GSM8k-Demo
This is a repo for showcasing using MCTS with LLMs to solve gsm8k problems
☆91 Updated 7 months ago
yongzhuo / qwen2-sft
Qwen1.5-SFT (Alibaba): fine-tuning (transformers), LoRA (peft), and inference for Qwen_Qwen1.5-2B-Chat/Qwen_Qwen1.5-7B-Chat
☆68 Updated last year
KMnO4-zx / tiny-llm
☆27 Updated 4 months ago
seanzhang-zhichen / Qwen-WisdomVast
Qwen-WisdomVast is a large model trained on 1 million high-quality Chinese multi-turn SFT data, 200,000 English multi-turn SFT data, and …
☆18 Updated last year
liangwq / deepSeekRecurrence
An implementation of the tree-of-thought pattern with DeepSeek
☆21 Updated 3 months ago
GuoYiFantastic / IMelodist
Music large model based on InternLM2-chat.
☆22 Updated 10 months ago
Bui1dMySea / MemLong
☆95 Updated 11 months ago
M1n9X / GraphRAG_Lite
☆16 Updated last year
zhaochenyang20 / Prompt2Model-Self-Guide
SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper
☆33 Updated last year
owenliang / qwen-dpo
DPO training for Tongyi Qianwen (Qwen)
☆56 Updated last year
MonolithFoundation / Bumblebee
A simple MLLM that surpasses QwenVL-Max using only open-source data, with a 14B LLM.
☆38 Updated last year
LC1332 / awesome-colab-project
Awesome Colab Projects Collection
☆29 Updated last year
dhcode-cpp / grpo-loss
☆33 Updated 8 months ago
jiahe7ay / infini-mini-transformer
This is a personal reimplementation of Google's Infini-transformer, utilizing a small 2b model. The project includes both model and train…
☆58 Updated last year
limafang / tiny-graphrag
☆43 Updated 6 months ago
modelscope / easydistill
a toolkit on knowledge distillation for large language models
☆191 Updated 3 weeks ago
OpenRL-Lab / Ray_Tutorial
Tutorial for Ray
☆35 Updated last year
xverse-ai / XVERSE-MoE-A4.2B
XVERSE-MoE-A4.2B: A multilingual large language model developed by XVERSE Technology Inc.
☆39 Updated last year
FreedomIntelligence / FastLLM
Fast LLM Training CodeBase With dynamic strategy choosing [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler];
☆41 Updated last year
cooper12121 / llama3-8x8b-MoE
Copy the MLP of llama3 8 times as 8 experts, create a router with random initialization, add load balancing loss to construct an 8x8b Mo…
☆27 Updated last year
scchy / XtunerGUI
Xtuner Factory
☆34 Updated last year
Chen-GX / C-3PO
[ICML2025] The official implementation of "C-3PO: Compact Plug-and-Play Proxy Optimization to Achieve Human-like Retrieval-Augmented Gene…
☆40 Updated 6 months ago
826568389 / GRPO-R1
☆13 Updated 7 months ago
shuyhere / all-about-llm
A survey of large language model training and serving
☆36 Updated 2 years ago
ignorejjj / LongRefiner
The code for paper: Hierarchical Document Refinement for Long-context Retrieval-augmented Generation [ACL2025 Oral]
☆36 Updated 2 months ago