RUCAIBox / awesome-llm-pretraining
Awesome LLM pre-training resources, including data, frameworks, and methods.
☆36 · Updated this week
Alternatives and similar repositories for awesome-llm-pretraining:
Users interested in awesome-llm-pretraining are comparing it to the libraries listed below.
- A Comprehensive Survey on Long Context Language Modeling ☆131 · Updated last month
- The official repository of the Omni-MATH benchmark. ☆80 · Updated 4 months ago
- ☆146 · Updated last month
- ☆187 · Updated 2 months ago
- Reformatted Alignment ☆115 · Updated 7 months ago
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* ☆104 · Updated 4 months ago
- Homepage for ProLong (Princeton long-context language models) and the paper "How to Train Long-Context Language Models (Effectively)" ☆175 · Updated last month
- A repo showcasing the use of MCTS with LLMs to solve GSM8K problems ☆74 · Updated last month
- ☆125 · Updated 3 weeks ago
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection ☆42 · Updated 5 months ago
- On Memorization of Large Language Models in Logical Reasoning ☆65 · Updated 3 weeks ago
- ☆63 · Updated 5 months ago
- Trinity-RFT is a general-purpose, flexible, and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (… ☆50 · Updated this week
- Code for the paper "Teaching Language Models to Critique via Reinforcement Learning" ☆94 · Updated last week
- Reproducing R1 for Code with Reliable Rewards ☆179 · Updated this week
- Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning" ☆64 · Updated last week
- TokenSkip: Controllable Chain-of-Thought Compression in LLMs ☆133 · Updated last month
- ☆179 · Updated 2 weeks ago
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization ☆34 · Updated 2 months ago
- A highly capable 2.4B lightweight LLM trained on only 1T tokens of pre-training data, with all details released. ☆174 · Updated 2 weeks ago
- ☆37 · Updated 2 weeks ago
- Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning ☆173 · Updated last month
- Async pipelined version of Verl ☆60 · Updated 2 weeks ago
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models ☆132 · Updated 10 months ago
- ☆76 · Updated last week
- An Open Math Pre-training Dataset with 370B Tokens. ☆72 · Updated 3 weeks ago
- Research code for the preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning" ☆95 · Updated last month
- LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models ☆76 · Updated 6 months ago
- A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior ☆229 · Updated last week
- Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling ☆101 · Updated 3 months ago