Pillars-Creation/ChatGLM-RLHF-LoRA-RM-PPO

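Adds an RLHF implementation to ChatGLM-6B, with line-by-line explanations of some of the core code. The examples cover short news headline generation and an RLHF implementation for recommendation given a specified context.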

☆88

Alternatives and similar repositories for ChatGLM-RLHF-LoRA-RM-PPO

Users who are interested in ChatGLM-RLHF-LoRA-RM-PPO are comparing it to the repositories listed below.

Miraclemarvel55 / ChatGLM-RLHF
Modify ChatGLM output with only RLHF, directly raising or lowering the probability of target outputs
☆196 · May 23, 2023 · Updated 3 years ago
WalkerMitty / PDFparser
A demo PDF parser (including OCR and object detection tools)
☆36 · Oct 14, 2024 · Updated last year
jackaduma / ChatGLM-LoRA-RLHF-PyTorch
A full pipeline to fine-tune the ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Huma…
☆138 · Apr 28, 2023 · Updated 3 years ago
ArtificialZeng / baichuan-speedup
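A pure C++ cross-platform LLM acceleration library, callable from Python. Supports baichuan, glm, llama, and moss base models; runs smoothly on mobile devices, and chatglm-6B-class models can reach 10000+ tokens/s on a single GPU
☆42 · Aug 16, 2023 · Updated 2 years ago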
zstar1003 / PaddleOCR-Torch-Infer
The text detection and recognition component extracted from MinerU; implements PaddleOCR text detection and recognition in PyTorch
☆20 · Jun 2, 2025 · Updated last year
wellinxu / LLM_Custome
Customized fine-tuning on top of open-source Chinese large language models, so you can have your own dedicated language model
☆53 · May 20, 2023 · Updated 3 years ago
bayjarvis / llm
Fine-tuning, DPO, RLHF, RLAIF on LLMs - Qwen3, Zephyr 7B GPTQ with 4-Bit Quantization, Mistral-7B-GPTQ
☆15 · Jul 5, 2025 · Updated last year
ArtificialZeng / ChatGLM-Efficient-Tuning-Explained
☆23 · Jul 17, 2023 · Updated 3 years ago
WenSongWang / Bert_TextCNN_Chinese_classification_Pytorch
Chinese binary text classification with BERT + TextCNN, in two implementations
☆28 · Dec 21, 2022 · Updated 3 years ago
zhangxiann / BertPractice
Text classification with BERT
☆20 · Dec 7, 2021 · Updated 4 years ago
yale-nlp / refdpo
☆16 · Jul 23, 2024 · Updated last year
RuYunW / ADG-Seq2Seq
Implementation of Embedding API Dependency Graph for Neural Code Generation
☆12 · Jun 6, 2021 · Updated 5 years ago
ssm123ssm / docGPT-pharm
☆10 · Aug 3, 2023 · Updated 2 years ago
chunhuizhang / personal_chatgpt
personal chatgpt
☆415 · Jan 11, 2026 · Updated 6 months ago
MashiMaroLjc / TimeExtractor
Extracts and normalizes time expressions from spoken language
☆13 · Mar 2, 2020 · Updated 6 years ago
yongzhuo / chatglm-maths
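chatglm-6b fine-tuning/LoRA/PPO/inference; the samples are automatically generated integer and decimal addition, subtraction, multiplication, and division problems; runs on GPU or CPU
☆165 · Aug 24, 2023 · Updated 2 years ago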
pamaforce / OneKE-RAG
Knowledge graph construction and a RAG question-answering system built on OneKE
☆27 · Jun 29, 2024 · Updated 2 years ago
uglyghost / ChatGLM-Peft-Tuning
ChatGLM-Peft-Tuning
☆13 · Mar 19, 2023 · Updated 3 years ago
Run542968 / GAP
☆11 · Oct 13, 2024 · Updated last year
rainstorm12 / KG-RAG
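A simple implementation of retrieval-augmented generation (RAG) for large language models that combines a knowledge graph with text documents. The data used are text documents and an expert knowledge graph from the utility tunnel maintenance domain
☆24 · Jun 6, 2024 · Updated 2 years ago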
yzhangcs / master-thesis
High-order syntactic parsing based on tree-structured conditional random fields
☆16 · Apr 28, 2022 · Updated 4 years ago
wenhycs / EMNLP2021-Utilizing-Relative-Event-Time-to-Enhance-Event-Event-Temporal-Relation-Extraction
☆12 · Oct 4, 2021 · Updated 4 years ago
go-nlp / bm25
bm25 is a scoring function that helps with information retrieval
☆14 · Sep 17, 2020 · Updated 5 years ago
ZBayes / basic_rag
Basic framework for RAG (retrieval-augmented generation)
☆89 · Dec 24, 2023 · Updated 2 years ago
stanleylsx / llms_tool
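A large language model training and testing tool built on HuggingFace. Supports web UI and terminal prediction for each model, parameter-efficient and full-parameter training (pre-training, SFT, RM, PPO, DPO), plus model merging and quantization
☆226 · Dec 8, 2023 · Updated 2 years ago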
thu-coai / CritiqueLLM
☆147 · Jul 1, 2024 · Updated 2 years ago
rosehe1029 / questionLLM
A curated collection of large language model interview questions
☆12 · Aug 29, 2024 · Updated last year
codedogfish / angular-ui-tree
☆11 · Jan 26, 2016 · Updated 10 years ago
mala-lab / OpenCIL
Official code for paper "OpenCIL: Benchmarking Out-of-Distribution Detection in Class-Incremental Learning"
☆13 · Jun 19, 2024 · Updated 2 years ago
BAAI-WuDao / P-tuning
Finetune CPM-1
☆24 · Jun 20, 2021 · Updated 5 years ago
MaseratiD / Sentiment_Analysis
BERT-based text sentiment analysis
☆12 · Nov 4, 2022 · Updated 3 years ago
Capino512 / pinyin2hanzi_python
Pinyin-to-Chinese-character conversion for words and sentences, pinyin segmentation, pinyin completion, and Chinese input in pygame
☆15 · Mar 21, 2020 · Updated 6 years ago
SupritYoung / RLHF-Label-Tool
A tool for manually annotating and ranking response data in the RLHF stage of large language model training
☆254 · Aug 1, 2023 · Updated 2 years ago
zejunwang1 / chatglm_tuning
Parameter-efficient fine-tuning of ChatGLM-6B with LoRA and P-Tuning v2
☆55 · May 17, 2023 · Updated 3 years ago
thu-spmi / RAG-CoT
Code for "An Empirical Study of Retrieval Augmented Generation with Chain-of-Thought"
☆18 · Jul 27, 2024 · Updated last year
Jometeorie / MultiHopShortcuts
Reproduction Code for Paper "Investigating Multi-Hop Factual Shortcuts in Knowledge Editing of Large Language Models"
☆14 · Jun 1, 2024 · Updated 2 years ago
AmbientTalk / wePoker
wePoker is a multi-player poker game for Android
☆12 · Mar 20, 2013 · Updated 13 years ago
graphprojects / HyGCL-AdT
The official source code for HyGCL-AdT, published at WWW 24.
☆12 · Mar 12, 2024 · Updated 2 years ago
longlongint / Fin-PTPCG
Code for the paper “A Fin-BERT-based Event Extraction Method for Chinese Financial Domain”
☆12 · May 22, 2024 · Updated 2 years ago