jackaduma / ChatGLM-LoRA-RLHF-PyTorch
A full pipeline for fine-tuning the ChatGLM LLM with LoRA and RLHF on consumer hardware. It implements RLHF (Reinforcement Learning from Human Feedback) on top of the ChatGLM architecture. Basically ChatGPT, but with ChatGLM.
☆135 Updated 2 years ago
Alternatives and similar repositories for ChatGLM-LoRA-RLHF-PyTorch
Users who are interested in ChatGLM-LoRA-RLHF-PyTorch are comparing it to the libraries listed below.
- Apply RLHF directly to ChatGLM to raise or lower the probability of target outputs | Modify ChatGLM output with only RLHF ☆194 Updated 2 years ago
- Finetuning LLaMA with RLHF (Reinforcement Learning from Human Feedback) based on DeepSpeed Chat ☆115 Updated 2 years ago