OctopusMind / DPO

An implementation of the DPO algorithm

☆47

Alternatives and similar repositories for DPO

Users who are interested in DPO are comparing it to the libraries listed below

suu990901 / LLaMA-MiLe-Loss
Code for a New Loss for Mitigating the Bias of Learning Difficulties in Generative Language Models
☆65Updated 8 months ago
yanqiangmiffy / how-to-train-tokenizer
How to train an LLM tokenizer
☆153Updated 2 years ago
sugarandgugu / Simple-Trl-Training
Fine-tuning large language models with the DPO algorithm; simple and easy to get started.
☆45Updated last year
ZhuiyiTechnology / GAU-alpha
A Transformer model based on the Gated Attention Unit (early-preview version)
☆98Updated 2 years ago
hengjiUSTC / learn-llm
☆115Updated 11 months ago
Glanvery / LLM-Travel
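Welcome to the "LLM-travel" repository! Explore the mysteries of large language models (LLMs) 🚀. Dedicated to deeply understanding, discussing, and implementing the technologies, principles, and applications related to large models.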
☆344Updated last year
firechecking / CleanTransformer
An implementation of Transformer, BERT, GPT, and diffusion models for learning purposes
☆159Updated last year
beichao1314 / Open-Llama
The complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF.
☆67Updated 2 years ago
taishan1994 / sentencepiece_chinese_bpe
Train a Chinese vocabulary with BPE in sentencepiece and use it in transformers.
☆119Updated 2 years ago
akaihaoshuai / baby-llama2-chinese_cybertron
Train an LLM from scratch on a single 24 GB GPU
☆56Updated 3 months ago
DRSY / EMO
[ICLR 2024] EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling (https://arxiv.org/abs/2310.04691)
☆126Updated last year
Mryangkaitong / deepseek-r1-gsm8k
☆47Updated 8 months ago
bojone / bytepiece
A purer tokenizer with a higher compression rate
☆485Updated 10 months ago
jiahe7ay / infini-mini-transformer
This is a personal reimplementation of Google's Infini-transformer, utilizing a small 2b model. The project includes both model and train…
☆58Updated last year
yuanzhoulvpi2017 / SentenceEmbedding
☆119Updated last year
xv44586 / Chinese-instruction-datasets
Chinese instruction-tuning datasets
☆137Updated last year
yongzhuo / qwen2-sft
Qwen1.5-SFT (Alibaba): fine-tuning (transformers), LoRA (peft), and inference for Qwen_Qwen1.5-2B-Chat/Qwen_Qwen1.5-7B-Chat
☆68Updated last year
genggui001 / Megatron-DeepSpeed-Llama
☆84Updated 2 years ago
NEU-DataMining / PICA
PICA, a multi-turn empathetic dialogue model
☆97Updated 2 years ago
zhoucz97 / awesome-ChatGPT
A collection of ChatGPT-related resources
☆56Updated 2 years ago
MikeGu721 / EasyLLM
Make LLMs easier to use
☆59Updated 2 years ago
zejunwang1 / LLMTuner
An instruction-tuning tool for large language models (supports FlashAttention)
☆178Updated last year
OpenBMB / ModelCenter
Efficient, Low-Resource, Distributed transformer implementation based on BMTrain
☆263Updated last year
stanleylsx / llms_tool
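A training and testing tool for large language models, built on HuggingFace. Supports web UI and terminal inference for each model, parameter-efficient and full-parameter training (pre-training, SFT, RM, PPO, DPO), as well as model merging and quantization.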
☆220Updated last year
keezen / ntk_alibi
NTK scaled version of ALiBi position encoding in Transformer.
☆69Updated 2 years ago
CLUEbenchmark / SuperCLUE-Math6
SuperCLUE-Math6: exploring a new-generation, native-Chinese, multi-turn, multi-step mathematical reasoning dataset
☆60Updated last year
ArtificialZeng / Baichuan2-Explained
A line-by-line walkthrough of the Baichuan2 code, suitable for beginners
☆214Updated 2 years ago
muyaostudio / qwen2_seq_cls
A simple implementation of a text classification task using Qwen2ForSequenceClassification.
☆82Updated last year
ssbuild / llm_finetuning
Large language model fine-tuning for bloom, opt, gpt, gpt2, llama, llama-2, cpmant, and so on
☆98Updated last year
1140310118 / tdlm
Implements several positional encoding schemes used in the Transformer
☆44Updated 4 years ago