jackaduma / ChatGLM-LoRA-RLHF-PyTorch
A full pipeline to fine-tune the ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the ChatGLM architecture. Basically ChatGPT, but with ChatGLM.
☆134 Updated last year
Alternatives and similar repositories for ChatGLM-LoRA-RLHF-PyTorch:
Users who are interested in ChatGLM-LoRA-RLHF-PyTorch are comparing it to the libraries listed below:
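- Apply RLHF directly to ChatGLM to raise or lower the probability of target outputs | Modify ChatGLM output with only RLHF ☆192 Updated last year
- Fine-tune Chinese large language models with QLoRA, including ChatGLM, Chinese-LLaMA-Alpaca, and BELLE ☆85 Updated last year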
- deep learning ☆150 Updated 7 months ago
- Fine-tuning LLaMA with RLHF (Reinforcement Learning from Human Feedback) based on DeepSpeed Chat ☆112 Updated last year
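- A tool for manual annotation and ranking of response data in the RLHF stage of large-model training ☆247 Updated last year
- ChatGLM2-6B fine-tuning: SFT/LoRA, instruction fine-tuning ☆105 Updated last year
- chatglm-6b fine-tuning/LoRA/PPO/inference; samples are automatically generated integer and decimal arithmetic problems (addition, subtraction, multiplication, division); runs on GPU or CPU ☆164 Updated last year
- Adds an RLHF implementation to ChatGLM-6B, with line-by-line explanations of some of the core code; the examples cover generating short news headlines and an RLHF implementation of recommendation for a specified context ☆82 Updated last year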
- ☆304 Updated last year
- Implementation of Chinese ChatGPT ☆287 Updated last year
- ☆159 Updated last year
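- Alpaca Chinese instruction fine-tuning dataset ☆392 Updated last year
- A HuggingFace-based tool for training and testing large language models. Supports a web UI and terminal inference for each model, parameter-efficient and full-parameter training (pretraining, SFT, RM, PPO, DPO), and model merging and quantization ☆209 Updated last year
- Fine-tuning ChatGLM ☆124 Updated last year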
- chatglm2 6b finetuning and alpaca finetuning ☆145 Updated 10 months ago
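- A Chinese self-instruct dataset built with ChatGPT ☆113 Updated last year
- Focused on Chinese-language large language models applied to a specific industry or domain, aiming to build domain models at the company or industry level ☆115 Updated 5 months ago
- pCLUE: 1,000,000+ multi-task prompt learning dataset ☆478 Updated 2 years ago
- Exploring how fine-tuning on Chinese instruct data performs on ChatGLM and LLaMA ☆390 Updated last year
- Firefly Chinese LLaMA-2 model; supports incremental pretraining of Baichuan2, Llama2, Llama, Falcon, Qwen, Baichuan, InternLM, Bloom, and other large models ☆406 Updated last year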
- ☆278 Updated 9 months ago
- llama2 finetuning with deepspeed and lora ☆172 Updated last year
- Large language model fine-tuning: bloom, opt, gpt, gpt2, llama, llama-2, cpmant, and so on ☆96 Updated 9 months ago
- Instruction-tuning tool for large language models (supports FlashAttention) ☆169 Updated last year
- Simple, efficient multi-GPU fine-tuning of large models with deepspeed + trainer ☆122 Updated last year
- Parameter-efficient fine-tuning of ChatGLM-6B based on LoRA and P-Tuning v2 ☆54 Updated last year
- How to train an LLM tokenizer ☆140 Updated last year
- Baichuan LLM supervised fine-tuning with LoRA ☆62 Updated last year
- ChatGLM-6B fine-tuning. ☆135 Updated last year
- ChatGLM on multiple GPUs using deepspeed and ☆405 Updated 7 months ago