Paul33333 / SFT-and-DPO

A detailed code demo showing how to run full-parameter Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO).

☆18

Alternatives and similar repositories for SFT-and-DPO

Users who are interested in SFT-and-DPO are comparing it to the repositories listed below.

yuanzhoulvpi2017 / SentenceEmbedding
☆119 Updated last year
Sshuoshuo / easy-rag
Quick start with RAG and private deployment
☆212 Updated last year
owenliang / bpe-tokenizer
LLM Tokenizer with BPE algorithm
☆46 Updated last year
yang19527 / AwesomeInterview
Interview questions and interview experiences from major tech companies, for programmers
☆204 Updated 7 months ago
liujunwen23 / MIRE
WWW2025 Multimodal Intent Recognition for Dialogue Systems Challenge
☆130 Updated last year
Mxoder / LLM-from-scratch
Notes on reproducing LLM-related work from scratch
☆241 Updated 8 months ago
jiahe7ay / MINI_LLM
A personal repository for experimenting with and reproducing the LLM pre-training process.
☆486 Updated 8 months ago
taishan1994 / Llama3.1-Finetuning
Full-parameter fine-tuning, LoRA fine-tuning, and QLoRA fine-tuning of Llama 3.
☆212 Updated last year
KMnO4-zx / TinyRAG
TinyRAG
☆398 Updated 6 months ago
mianshi7 / LLM
This repository mainly records interview questions for large language model (LLM) algorithm engineers, along with my own answers
☆27 Updated 2 years ago
Glanvery / LLM-Travel
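Welcome to the "LLM-travel" repository! Explore the mysteries of large language models (LLMs) 🚀. Dedicated to deeply understanding, discussing, and implementing the techniques, principles, and applications related to large models.
☆359 Updated last year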
LDLINGLINGLING / adan_application
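An ecosystem of large language model and multimodal model projects, mainly covering cross-modal search, speculative decoding, QAT quantization, multimodal quantization, ChatBot, and OCR
☆194 Updated 4 months ago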
Zeyi-Lin / Qwen3-Medical-SFT
Qwen3 Fine-tuning: Medical R1 Style Chat
☆261 Updated 7 months ago
OvJat / DeepSpeedTutorial
DeepSpeed Tutorial
☆104 Updated last year
dawoshi / Tianchi-LLM-QA
Alibaba Tianchi: 2023 Global Intelligent Vehicle AI Challenge, Track 1: AI large-model retrieval question answering, baseline 80+
☆119 Updated 2 years ago
AI-Study-Han / Zero-Qwen-VL
Train a LLaVA model with better Chinese support, and open-source the training code and data.
☆77 Updated last year
km1994 / AwesomeNLP
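This project completes all tasks from NLP-Beginner: introductory natural language processing exercises (text classification, information extraction, knowledge graphs, machine translation, question answering, text generation, Text-to-SQL, text correction, text mining, knowledge distillation, model acceleration, OCR, TTS, Prompt, embedding, etc.); all code has been tested…
☆215 Updated 2 years ago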
AI-Study-Han / Zero-Chatgpt
Run through the ChatGPT technical pipeline from scratch.
☆270 Updated last year
shyoulala / Kaggle_Eedi_2024_sayoulala
Kaggle 2024 Eedi 10th-place gold medal solution
☆44 Updated last year
taishan1994 / sentencepiece_chinese_bpe
Train a Chinese vocabulary with BPE in sentencepiece and use it in transformers.
☆120 Updated 2 years ago
chunhuizhang / bert_t5_gpt
☆82 Updated last month
taishan1994 / pytorch-distributed-NLP
PyTorch distributed training
☆73 Updated 2 years ago
chunhuizhang / personal_chatgpt
personal chatgpt
☆403 Updated last year
km1994 / llms_paper
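This repository mainly records reading notes on top-conference papers relevant to LLM algorithm engineers (multimodal, PEFT, few-shot QA, RAG, LMM interpretability, Agents, CoT)
☆369 Updated last year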
muyaostudio / qwen2_seq_cls
A simple implementation of a text classification task using Qwen2ForSequenceClassification.
☆89 Updated last year
sunkx109 / llama
Inference code for LLaMA models
☆128 Updated 2 years ago
owenliang / qwen-dpo
DPO training for Tongyi Qianwen (Qwen)
☆61 Updated last year
yongzhuo / qwen2-sft
Qwen1.5-SFT (Alibaba), Qwen_Qwen1.5-2B-Chat/Qwen_Qwen1.5-7B-Chat fine-tuning (transformers) / LoRA (peft) / inference
☆69 Updated last year
wdndev / tiny-rag
A very, very small RAG system
☆331 Updated 8 months ago
lansinuote / Simple_LLM_DPO
☆77 Updated 2 years ago