jackaduma / ChatGLM-LoRA-RLHF-PyTorch
A full pipeline to finetune ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the ChatGLM architecture. Basically ChatGPT but with ChatGLM.
☆140Updated 2 years ago
Alternatives and similar repositories for ChatGLM-LoRA-RLHF-PyTorch
Users who are interested in ChatGLM-LoRA-RLHF-PyTorch are comparing it to the libraries listed below.
- deep learning☆149Updated 9 months ago
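- Apply RLHF directly to ChatGLM to raise or lower the probability of target outputs | Modify ChatGLM output with only RLHF☆198Updated 2 years ago
- Fine-tune Chinese large language models with QLoRA, including ChatGLM, Chinese-LLaMA-Alpaca, and BELLE☆89Updated 2 years ago
- Alpaca Chinese instruction fine-tuning dataset☆397Updated 2 years ago
- chatglm-6b fine-tuning/LoRA/PPO/inference; samples are automatically generated integer/decimal addition, subtraction, multiplication, and division problems; runs on GPU or CPU☆165Updated 2 years ago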
- Implementation of Chinese ChatGPT☆288Updated 2 years ago
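- Large language model fine-tuning for bloom, opt, gpt, gpt2, llama, llama-2, cpmant, and so on☆99Updated last year
- ChatGLM2-6B fine-tuning, SFT/LoRA, instruction finetune☆110Updated 2 years ago
- A large language model training and testing tool built on HuggingFace. Supports web UI and terminal inference for each model, parameter-efficient and full-parameter training (pre-training, SFT, RM, PPO, DPO), plus merging and quantization.☆223Updated 2 years ago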
- ☆313Updated 2 years ago
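- Easy and Efficient Finetuning LLMs. (Supported LLama, LLama2, LLama3, Qwen, Baichuan, GLM, Falcon) Efficient quantized training and deployment of large models.☆619Updated last year
- A tool for manual annotation and ranking of response data in the RLHF stage of large models.☆257Updated 2 years ago
- A line-by-line walkthrough of the Baichuan2 code, suitable for beginners☆213Updated 2 years ago
- Fine-tuning ChatGLM☆128Updated 2 years ago
- Firefly Chinese LLaMA-2 large model; supports incremental pre-training of Baichuan2, Llama2, Llama, Falcon, Qwen, Baichuan, InternLM, Bloom, and other large models☆416Updated 2 years ago
- Instruction-tuning tool for large language models (supports FlashAttention)☆177Updated 2 years ago
- Focused on Chinese-language large language models applied to a specific industry or domain, yielding an industry model or a company-level or industry-level domain model.☆126Updated 11 months ago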
- llama2 finetuning with deepspeed and lora☆176Updated 2 years ago
- ☆43Updated 2 years ago
- An open-source implementation (reproduction) of DeepSeek-R1☆275Updated 11 months ago
- Finetuning LLaMA with RLHF (Reinforcement Learning from Human Feedback) based on DeepSpeed Chat☆117Updated 2 years ago
- Analysis of the Chinese cognitive abilities of language models☆235Updated 2 years ago
- ☆164Updated 2 years ago
- ☆282Updated last year
- Simple, efficient multi-GPU fine-tuning of large models with deepspeed + trainer☆133Updated 2 years ago
- Chinese large language model base generated through incremental pre-training on Chinese datasets☆239Updated 2 years ago
- moss chat finetuning☆51Updated last year
- Parameter-efficient fine-tuning of ChatGLM-6B based on LoRA and P-Tuning v2☆55Updated 2 years ago
- chatglm multi-GPU using deepspeed and☆409Updated last year
- How to train an LLM tokenizer☆153Updated 2 years ago