jackaduma/ChatGLM-LoRA-RLHF-PyTorch

A full pipeline to finetune the ChatGLM LLM with LoRA and RLHF on consumer hardware. An implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the ChatGLM architecture. Basically ChatGPT, but with ChatGLM.

☆138

Alternatives and similar repositories for ChatGLM-LoRA-RLHF-PyTorch

Users that are interested in ChatGLM-LoRA-RLHF-PyTorch are comparing it to the libraries listed below.

ssbuild / chatglm_rlhf
chatglm_rlhf_finetuning
☆30 · Oct 10, 2023 · Updated 2 years ago
Miraclemarvel55 / ChatGLM-RLHF
Use RLHF directly on ChatGLM to raise or lower the probability of target outputs | Modify ChatGLM output with only RLHF
☆196 · May 23, 2023 · Updated 3 years ago
jackaduma / Alpaca-LoRA-RLHF-PyTorch
A full pipeline to finetune Alpaca LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human…
☆60 · Apr 28, 2023 · Updated 3 years ago
crossmodaldebate / NCA-SEM
NCA-SEM module for Jamovi. Necessary Condition Analysis via Structural Equation Modeling (NCA-SEM) is a data analysis method that is used…
☆15 · Jun 24, 2026 · Updated last month
jackaduma / Vicuna-LoRA-RLHF-PyTorch
A full pipeline to finetune Vicuna LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Human…
☆220 · May 20, 2024 · Updated 2 years ago
yangzhipeng1108 / DeepSpeed-Chat-ChatGLM
☆43 · Dec 15, 2023 · Updated 2 years ago
thinksoso / ChatGLM-Instruct-Tuning
Fine-tuning ChatGLM
☆128 · May 5, 2023 · Updated 3 years ago
ssbuild / moss_finetuning
moss chat finetuning
☆51 · Apr 23, 2024 · Updated 2 years ago
jasonvanf / llama-trl
LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA
☆240 · Aug 17, 2025 · Updated 11 months ago
vicgalle / zero-shot-reward-models
ZYN: Zero-Shot Reward Models with Yes-No Questions
☆34 · Aug 15, 2023 · Updated 2 years ago
hiyouga / ChatGLM-Efficient-Tuning
Fine-tuning ChatGLM-6B with PEFT | Efficient ChatGLM fine-tuning based on PEFT
☆3,719 · Oct 12, 2023 · Updated 2 years ago
liangwq / Chatglm_lora_multi-gpu
chatglm multi-GPU using deepspeed and…
☆409 · Jul 8, 2024 · Updated 2 years ago
Pillars-Creation / ChatGLM-RLHF-LoRA-RM-PPO
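Adds an RLHF implementation to ChatGLM-6B, with line-by-line explanations of some of the core code. The examples cover short news headline generation and an RLHF implementation for recommendation given a specified context.
☆88 · Aug 16, 2023 · Updated 2 years ago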
l294265421 / alpaca-rlhf
Finetuning LLaMA with RLHF (Reinforcement Learning with Human Feedback) based on DeepSpeed Chat
☆118 · Jun 5, 2023 · Updated 3 years ago
hscspring / hcgf
Humanable Chat Generative-model Fine-tuning | LLM fine-tuning
☆205 · Sep 22, 2023 · Updated 2 years ago
SpongebBob / Finetune-ChatGLM2-6B
Full-parameter fine-tuning of ChatGLM2-6B, with efficient fine-tuning that supports multi-turn dialogue.
☆400 · Aug 17, 2023 · Updated 2 years ago
OpenLMLab / MOSS-RLHF
Secrets of RLHF in Large Language Models Part I: PPO
☆1,426 · Mar 3, 2024 · Updated 2 years ago
HarderThenHarder / transformers_tasks
⭐️ NLP Algorithms with transformers lib. Supporting Text-Classification, Text-Generation, Information-Extraction, Text-Matching, RLHF, SF…
☆2,420 · Sep 29, 2023 · Updated 2 years ago
jiamingkong / rwkv_reward
Training a reward model for RLHF using RWKV.
☆15 · Jun 5, 2023 · Updated 3 years ago
WHU-ZQH / PANDA
PANDA: Prompt Transfer Meets Knowledge Distillation for Efficient Model Adaptation
☆16 · Mar 28, 2023 · Updated 3 years ago
aaron-wheeler / MarketGPT
MarketGPT: Developing a Pre-trained transformer (GPT) for Modeling Financial Time Series
☆19 · Sep 5, 2025 · Updated 10 months ago
ChiYeungLaw / LLaMa-EasyFT
A Toolkit for Fine-Tuning Large Language Models with LoRA and DeepSpeed
☆11 · Apr 14, 2023 · Updated 3 years ago
imClumsyPanda / ChatGLM-6B-API
self-host ChatGLM-6B API made with fastapi
☆78 · Mar 24, 2023 · Updated 3 years ago
Miraclemarvel55 / LLaMA-MOSS-RLHF-LoRA
Train LLaMA and MOSS with RLHF, optionally with LoRA | Training LLaMA or MOSS with RLHF [LoRA]
☆21 · May 16, 2023 · Updated 3 years ago
ssbuild / chatglm_finetuning
chatglm 6b finetuning and alpaca finetuning
☆1,528 · Mar 9, 2025 · Updated last year
ProSeCo-Planning / ros_proseco_planning
The ROS interface as well as the Python packages for ProSeCo Planning
☆10 · Jun 17, 2024 · Updated 2 years ago
alexa / places
This is the code for our paper: PLACES: Prompting Language Models for Social Conversation Synthesis
☆11 · Feb 17, 2023 · Updated 3 years ago
yongzhuo / ChatGLM2-SFT
ChatGLM2-6B fine-tuning, SFT/LoRA, instruction finetune
☆108 · Jul 19, 2023 · Updated 3 years ago
Instruction-Tuning-with-GPT-4 / GPT-4-LLM
Instruction Tuning with GPT-4
☆4,332 · Jun 11, 2023 · Updated 3 years ago
hiyouga / ChatNVL-Towards-Visual-Novel-ChatBot
☆23 · Jun 23, 2023 · Updated 3 years ago
liucongg / ChatGLM-Finetuning
Fine-tuning for specific downstream tasks based on the ChatGLM-6B, ChatGLM2-6B, and ChatGLM3-6B models, covering Freeze, LoRA, P-tuning, full-parameter fine-tuning, and more
☆2,774 · Dec 12, 2023 · Updated 2 years ago
HumanSignal / RLHF
Collection of links, tutorials and best practices of how to collect the data and build end-to-end RLHF system to finetune Generative AI m…
☆226 · Jul 24, 2023 · Updated 3 years ago
sunzeyeah / RLHF
Implementation of Chinese ChatGPT
☆287 · Nov 20, 2023 · Updated 2 years ago
StarRing2022 / MiniRWKV-4
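Implements a multimodal image-text dialogue large model with Blip2RWKV+QFormer. Using the Two-Step Cognitive Psychology Prompt method, a model with only 3B parameters can exhibit human-like causal chains of thought. Benchmarked against image-text dialogue LLMs such as MiniGPT-4 and ImageBind, it aims to use less compute and fewer resources to…
☆42 · Jul 17, 2023 · Updated 3 years ago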
CarperAI / trlx
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
☆4,753 · Jan 8, 2024 · Updated 2 years ago
lxe / llama-tune
LLaMa Tuning with Stanford Alpaca Dataset using Deepspeed and Transformers
☆50 · Mar 15, 2023 · Updated 3 years ago
opendilab / awesome-RLHF
A curated list of reinforcement learning with human feedback resources (continually updated)
☆4,416 · May 20, 2026 · Updated 2 months ago
smaameri / private-llm
Python scripts for setting up private LLMs locally and in the cloud with LangChain, GPT4All and Cerebrium
☆11 · May 29, 2023 · Updated 3 years ago
dipjyoti92 / TTS-Style-Transfer
Official PyTorch implementation of TTS Style Transfer
☆25 · Jun 22, 2022 · Updated 4 years ago