crazycth / WizardLearnerLinks

Pretrain、decay、SFT a CodeLLM from scratch 🧙‍♂️

☆39

Alternatives and similar repositories for WizardLearner

Users that are interested in WizardLearner are comparing it to the libraries listed below

Sorting:

SkyworkAI / skywork-o1-prm-inference
☆65Updated 10 months ago
TemporaryLoRA / Temp-LoRA
☆116Updated last year
RUC-GSAI / YuLan-Mini
A highly capable 2.4B lightweight LLM using only 1T pre-training data with all details.
☆219Updated 2 months ago
Wangmerlyn / MCTS-GSM8k-Demo
This is a repo for showcasing using MCTS with LLMs to solve gsm8k problems
☆92Updated 6 months ago
jackfsuia / nanoRLHF
RLHF experiments on a single A100 40G GPU. Support PPO, GRPO, REINFORCE, RAFT, RLOO, ReMax, DeepSeek R1-Zero reproducing.
☆73Updated 7 months ago
yyht / openrlhf_async_pipline
☆81Updated last month
HarderThenHarder / RLLoggingBoard
A visuailzation tool to make deep understaning and easier debugging for RLHF training.
☆256Updated 7 months ago
modelscope / Trinity-RFT
Trinity-RFT is a general-purpose, flexible and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (…
☆366Updated this week
IAAR-Shanghai / xVerify
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations
☆133Updated 5 months ago
cavalierlulu / rag_survey
☆125Updated last year
dhcode-cpp / grpo-loss
☆33Updated 7 months ago
OpenBMB / UltraEval
[ACL 2024 Demo] Official GitHub repo for UltraEval: An open source framework for evaluating foundation models.
☆251Updated 11 months ago
wjn1996 / Awesome-LLM-Reasoning-Openai-o1-Survey
The related works and background techniques about Openai o1
☆222Updated 9 months ago
nishiwen1214 / Benchmark-leakage-detection
Official completion of “Training on the Benchmark Is Not All You Need”.
☆36Updated 9 months ago
step-law / steplaw
☆203Updated 6 months ago
yiyepiaoling0715 / codellm-data-preprocess-pipeline
代码大模型预训练&微调&DPO 数据处理业界处理pipeline sota
☆44Updated last year
FlagOpen / Infinity-Instruct
☆49Updated last year
hengjiUSTC / learn-llm
☆115Updated 11 months ago
chunhuizhang / prompts_for_academic
☆48Updated 3 weeks ago
InternLM / OREAL
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning
☆190Updated 6 months ago
LCLM-Horizon / A-Comprehensive-Survey-For-Long-Context-Language-Modeling
A Comprehensive Survey on Long Context Language Modeling
☆192Updated 3 months ago
ADaM-BJTU / OpenRFT
OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning
☆151Updated 9 months ago
a-m-team / a-m-models
a-m-team's exploration in large language modeling
☆188Updated 4 months ago
SkyworkAI / Skywork-MoE
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models
☆137Updated last year
SuperGPQA / SuperGPQA
☆169Updated 5 months ago
sail-sg / sdft
[ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning".
☆131Updated 11 months ago
MiroMindAI / MiroRL
MiroRL is an MCP-first reinforcement learning framework for deep research agent.
☆164Updated last month
RUC-GSAI / Llama-3-SynE
Llama-3-SynE: A Significantly Enhanced Version of Llama-3 with Advanced Scientific Reasoning and Chinese Language Capabilities | 继续预训练提升 …
☆34Updated 4 months ago
jiahe7ay / infini-mini-transformer
This is a personal reimplementation of Google's Infini-transformer, utilizing a small 2b model. The project includes both model and train…
☆58Updated last year
DIRECT-BIT / SRA-MCTS
☆33Updated 4 months ago