sunzeyeah/RLHF

Implementation of Chinese ChatGPT

☆287

Alternatives and similar repositories for RLHF

Users who are interested in RLHF are comparing it to the repositories listed below.

Miraclemarvel55 / ChatGLM-RLHF
Apply RLHF directly to ChatGLM to raise or lower the probability of target outputs | Modify ChatGLM output with only RLHF
☆196 · May 23, 2023 · Updated 3 years ago
GanjinZero / RRHF
[NIPS2023] RRHF & Wombat
☆806 · Sep 22, 2023 · Updated 2 years ago
HarderThenHarder / transformers_tasks
⭐️ NLP Algorithms with transformers lib. Supporting Text-Classification, Text-Generation, Information-Extraction, Text-Matching, RLHF, SF…
☆2,420 · Sep 29, 2023 · Updated 2 years ago
carbonz0 / alpaca-chinese-dataset
Alpaca Chinese instruction fine-tuning dataset
☆395 · Mar 26, 2023 · Updated 3 years ago
yanqiangmiffy / InstructGLM
ChatGLM-6B instruction learning | instruction data | Instruct
☆651 · Apr 10, 2023 · Updated 3 years ago
LianjiaTech / BELLE
BELLE: Be Everyone's Large Language model Engine (an open-source Chinese dialogue large language model)
☆8,273 · Oct 16, 2024 · Updated last year
CVI-SZU / Linly
Chinese-LLaMA 1&2 and Chinese-Falcon base models; ChatFlow Chinese dialogue model; Chinese OpenLLaMA model; NLP pretraining and instruction fine-tuning datasets
☆3,045 · Apr 14, 2024 · Updated 2 years ago
PKU-Alignment / safe-rlhf
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
☆1,611 · Nov 24, 2025 · Updated 7 months ago
yangjianxin1 / LLMPruner
☆309 · Apr 6, 2023 · Updated 3 years ago
hikariming / chat-dataset-baseline
A manually curated Chinese dialogue dataset and a piece of ChatGLM fine-tuning code
☆1,190 · May 3, 2025 · Updated last year
OpenLMLab / MOSS-RLHF
Secrets of RLHF in Large Language Models Part I: PPO
☆1,427 · Mar 3, 2024 · Updated 2 years ago
xubuvd / LLMs
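Focused on Chinese-language large language models, applied to a specific industry or domain to build an industry LLM, or a company-level or industry-level domain LLM.
☆125 · May 19, 2026 · Updated 2 months ago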
yongzhuo / chatglm-maths
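chatglm-6b fine-tuning/LoRA/PPO/inference; samples are automatically generated integer and decimal arithmetic (addition, subtraction, multiplication, division); runs on GPU or CPU
☆165 · Aug 24, 2023 · Updated 2 years ago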
Facico / Chinese-Vicuna
Chinese-Vicuna: A Chinese Instruction-following LLaMA-based Model: a low-resource Chinese llama+lora approach, with a structure modeled on alpaca
☆4,119 · Apr 18, 2025 · Updated last year
mymusise / ChatGLM-Tuning
A fine-tuning approach based on ChatGLM-6B + LoRA
☆3,744 · Nov 25, 2023 · Updated 2 years ago
yangzhipeng1108 / DeepSpeed-Chat-ChatGLM
☆43 · Dec 15, 2023 · Updated 2 years ago
l294265421 / alpaca-rlhf
Finetuning LLaMA with RLHF (Reinforcement Learning with Human Feedback) based on DeepSpeed Chat
☆118 · Jun 5, 2023 · Updated 3 years ago
dandelionsllm / pandallm
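The Panda project is an open-source overseas Chinese large language model project launched in May 2023. It is dedicated to exploring the entire technology stack in the era of large models and aims to promote innovation and collaboration in Chinese natural language processing.
☆1,032 · Oct 19, 2023 · Updated 2 years ago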
deepspeedai / DeepSpeedExamples
Example models using DeepSpeed
☆6,832 · Updated this week
StarRing2022 / ChatGPTX-Uni
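Implements a cross-model technique that combines integrated switching of multiple LoRA weights with Zero-Finetune (zero fine-tuning) enhancement: LLM-Base + LLM-X + Alpaca. Initially, LLM-Base is the Chatglm6B base model and LLM-X is a LLaMA enhancement model. The approach is simple and efficient, and aims to let such language models be deployed widely at low energy cost, and …
☆114 · Jul 19, 2023 · Updated 3 years ago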
27182812 / ChatGLM-LLaMA-chinese-insturct
Exploring the fine-tuning performance of Chinese instruct data on ChatGLM and LLaMA
☆387 · Apr 4, 2023 · Updated 3 years ago
Instruction-Tuning-with-GPT-4 / GPT-4-LLM
Instruction Tuning with GPT-4
☆4,332 · Jun 11, 2023 · Updated 3 years ago
PhoebusSi / Alpaca-CoT
We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs and parameter-efficient methods (e.g., lora, p-tunin…
☆2,791 · Dec 12, 2023 · Updated 2 years ago
CarperAI / trlx
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
☆4,753 · Jan 8, 2024 · Updated 2 years ago
flippe3 / chat-ltu
Open source implementation of InstructGPT (not finished)
☆31 · Apr 13, 2023 · Updated 3 years ago
LC1332 / Luotuo-Chinese-LLM
Luotuo (骆驼): Open Sourced Chinese Language Models. Developed by 陈启源 @ Central China Normal University & 李鲁鲁 @ SenseTime & 冷子昂 @ SenseTime
☆3,590 · Sep 3, 2023 · Updated 2 years ago
jianzhnie / Open-R1
An open-source implementation (reproduction) of DeepSeek-R1.
☆275 · Mar 10, 2025 · Updated last year
hiyouga / ChatGLM-Efficient-Tuning
Fine-tuning ChatGLM-6B with PEFT | Efficient ChatGLM fine-tuning based on PEFT
☆3,719 · Oct 12, 2023 · Updated 2 years ago
lonePatient / awesome-pretrained-chinese-nlp-models
Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models
☆5,571 · Jun 19, 2026 · Updated last month
genggui001 / Megatron-DeepSpeed-Llama
☆84 · Sep 9, 2023 · Updated 2 years ago
yangjianxin1 / Firefly
Firefly: a large-model training tool that supports training Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, …
☆6,646 · Oct 24, 2024 · Updated last year
thomfoster / minRLHF
A (somewhat) minimal library for finetuning language models with PPO on human feedback.
☆91 · Nov 23, 2022 · Updated 3 years ago
clue-ai / ChatYuan
ChatYuan: Large Language Model for Dialogue in Chinese and English
☆1,868 · Jun 16, 2023 · Updated 3 years ago
vxfla / kanchil
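The kanchil (mouse-deer) is the world's smallest even-toed ungulate; this open-source project explores whether small models (under 6B) can also be aligned with human preferences.
☆112 · Apr 1, 2023 · Updated 3 years ago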
THUDM / GLM
GLM (General Language Model)
☆3,611 · Nov 3, 2023 · Updated 2 years ago
X-PLUG / ChatPLUG
A Chinese Open-Domain Dialogue System
☆324 · Aug 16, 2023 · Updated 2 years ago
bigscience-workshop / Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
☆1,448 · Mar 20, 2024 · Updated 2 years ago
OpenLLMAI / OpenLLMWiki
OpenLLMWiki: Docs of OpenLLMAI. Survey, reproduction and domain/task adaptation of open source chatgpt alternatives/implementations. PiXi…
☆269 · Dec 10, 2024 · Updated last year
baichuan-inc / Baichuan-7B
A large-scale 7B pretraining language model developed by BaiChuan-Inc.
☆5,652 · Jul 18, 2024 · Updated 2 years ago