ScienceOne-AI/DeepSeek-671B-SFT-Guide

An open-source solution for full-parameter fine-tuning of DeepSeek-V3/R1 671B, including complete code and scripts from training to inference, along with practical experience and conclusions.

☆811

Alternatives and similar repositories for DeepSeek-671B-SFT-Guide

Users who are interested in DeepSeek-671B-SFT-Guide are comparing it to the repositories listed below.

ScienceOne-AI / DeepLiterature
DeepLiterature: A fully open-source intelligent research assistant that integrates search, code execution, link resolution, and informati…
☆105 · Mar 19, 2025 · Updated last year
wenge-research / CRE-SFT
A supervised fine-tuning method for controllable reasoning length in large language models
☆11 · May 8, 2025 · Updated last year
ScienceOne-AI / S1-Agent
☆22 · Aug 1, 2025 · Updated 11 months ago
wenge-research / TableEval
This repository contains code and data for the paper "TableEval: A Real-World Benchmark for Complex, Multilingual, and Multi-Structured T…
☆30 · Jun 12, 2025 · Updated last year
Qihoo360 / 360-LLaMA-Factory
Adds sequence parallelism to LLaMA-Factory
☆607 · Feb 5, 2026 · Updated 5 months ago
Qihoo360 / Light-R1
☆765 · Dec 23, 2025 · Updated 7 months ago
wenge-research / YAYI-UIE
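YAYI information extraction LLM: instruction-tuned on millions of manually constructed, high-quality information extraction examples; developed by the Wenge (中科闻歌) algorithm team. (Repo for YAYI Unified Information Extraction Model)
☆320 · Aug 8, 2024 · Updated last year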
FareedKhan-dev / train-deepseek-r1
Building DeepSeek R1 from Scratch
☆782 · Mar 21, 2025 · Updated last year
InternLM / xtuner
A Next-Generation Training Engine Built for Ultra-Large MoE Models
☆5,163 · Updated this week
OpenRLHF / OpenRLHF
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Asy…
☆9,848 · Jul 14, 2026 · Updated last week
wangwei1237 / LLM_in_Action
Large Language Model in Action
☆342 · Jan 28, 2025 · Updated last year
alibaba / ROLL
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
☆3,325 · Updated this week
alibaba / ChatLearn
A flexible and efficient training framework for large-scale alignment tasks
☆452 · Oct 23, 2025 · Updated 9 months ago
ZongqianLi / ReasonGraph
[ACL 2025 Demo] Repository for the demo and paper: ReasonGraph: Visualisation of Reasoning Paths
☆513 · Mar 9, 2026 · Updated 4 months ago
datawhalechina / unlock-deepseek
Interpretation, extension, and reproduction of the DeepSeek series of work.
☆733 · Mar 9, 2026 · Updated 4 months ago
hiyouga / LlamaFactory
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
☆73,508 · Updated this week
modelscope / ms-swift
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.6, DeepSeek-V4, GLM-5.1, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL…
☆14,946 · Updated this week
areal-project / AReaL
The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.
☆5,599 · Updated this week
modelscope / evalscope
A streamlined and customizable framework for efficient large model (LLM, VLM, AIGC) evaluation and performance benchmarking.
☆3,139 · Updated this week
alibaba / Pai-Megatron-Patch
The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.
☆1,585 · Dec 15, 2025 · Updated 7 months ago
rllm-org / rllm
Democratizing Reinforcement Learning for LLMs
☆5,731 · Updated this week
Open-Reasoner-Zero / Open-Reasoner-Zero
Official Repo for Open-Reasoner-Zero
☆2,096 · Jun 2, 2025 · Updated last year
huggingface / open-r1
Fully open reproduction of DeepSeek-R1
☆26,414 · Apr 2, 2026 · Updated 3 months ago
yangjianxin1 / Firefly
Firefly: an LLM training tool that supports training Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, …
☆6,647 · Oct 24, 2024 · Updated last year
dhcode-cpp / X-R1
Minimal-cost training of a 0.5B R1-Zero model
☆816 · May 14, 2025 · Updated last year
GAIR-NLP / O1-Journey
O1 Replication Journey
☆2,001 · Jan 14, 2025 · Updated last year
hkust-nlp / simpleRL-reason
Simple RL training for reasoning
☆3,870 · Dec 23, 2025 · Updated 7 months ago
ItusiAI / Open-PDF2MD
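PDF2MD is an efficient PDF-to-Markdown conversion tool designed to help users easily convert PDF documents to Markdown for editing, sharing, and publishing. With a simple, easy-to-use interface and powerful conversion features, PDF2MD is a capable assistant for content creators, researchers, and developers.
☆184 · Oct 11, 2025 · Updated 9 months ago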
multimodal-art-projection / MAP-NEO
☆985 · Feb 7, 2025 · Updated last year
verl-project / verl
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
☆22,654 · Updated this week
openreasoner / openr
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models
☆1,848 · Jan 17, 2025 · Updated last year
mbzuai-oryx / Awesome-LLM-Post-training
Awesome Reasoning LLM Tutorial/Survey/Guide
☆2,485 · Apr 6, 2026 · Updated 3 months ago
ADaM-BJTU / OpenRFT
OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning
☆157 · Dec 24, 2024 · Updated last year
ScienceOne-AI / S1-DeepResearch
☆26 · Jul 2, 2026 · Updated 3 weeks ago
MozerWang / AMPO
[ICLR 2026] Adaptive Social Learning via Mode Policy Optimization for Language Agents
☆51 · Feb 2, 2026 · Updated 5 months ago
OpenRaiser / GGBench
📐 [CVPR 2026] GGBench: A Geometric Generative Reasoning Benchmark for Unified Multimodal Models
☆18 · Apr 1, 2026 · Updated 3 months ago
LianjiaTech / BELLE
BELLE: Be Everyone's Large Language model Engine (an open-source Chinese conversational LLM)
☆8,277 · Oct 16, 2024 · Updated last year
hpcaitech / ColossalAI
Making large AI models cheaper, faster and more accessible
☆41,427 · Jul 13, 2026 · Updated last week
WangRongsheng / awesome-LLM-resources
🧑‍🚀 Summary of the world's best LLM resources (multimodal generation, agents, coding assistance, AI paper review, data processing, model training, model inference, o1 models, MCP, small language models, vision-language models).
☆8,743 · Updated this week