Pillars-Creation / ChatGLM-RLHF-LoRA-RM-PPO
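Adds an RLHF implementation to ChatGLM-6B, with line-by-line explanations of some of the core code. The examples cover short news headline generation and an RLHF implementation for recommendation conditioned on a specified context.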
☆88Aug 16, 2023Updated 2 years ago
Alternatives and similar repositories for ChatGLM-RLHF-LoRA-RM-PPO
Users that are interested in ChatGLM-RLHF-LoRA-RM-PPO are comparing it to the libraries listed below
- Here is a demo for PDF parser (Including OCR, object detection tools)☆36Oct 14, 2024Updated last year
- Code for "An Empirical Study of Retrieval Augmented Generation with Chain-of-Thought"☆17Jul 27, 2024Updated last year
- This repository open-sources our GEC system submitted by THU KELab (sz) in the CCL2023-CLTC Track 1: Multidimensional Chinese Learner Tex…☆15Nov 25, 2023Updated 2 years ago
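- Customized fine-tuning on top of open-source Chinese large language models, so that you can have your own dedicated language model.☆51May 20, 2023Updated 2 years ago
- Apply RLHF directly to ChatGLM to raise or lower the probability of target outputs | Modify ChatGLM output with only RLHF☆198May 23, 2023Updated 2 years ago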
- A full pipeline to finetune ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning with Huma…☆140Apr 28, 2023Updated 2 years ago
- Finetune CPM-1☆24Jun 20, 2021Updated 4 years ago
- Chinese binary text classification with two implementations, BERT and TextCNN☆26Dec 21, 2022Updated 3 years ago
- Automation framework using LLM-as-a-judge to evaluate Agentic AI, RAG, and Text2SQL at scale, as a good proxy for human judgement.☆34Oct 9, 2025Updated 4 months ago
- ☆11Jan 26, 2016Updated 10 years ago
- A tool for manually annotating and ranking response data in the RLHF stage of large-model training.☆256Aug 1, 2023Updated 2 years ago
- kenlm language model, served through a Python REST service☆30Aug 1, 2018Updated 7 years ago
- ☆59Jul 21, 2025Updated 6 months ago
- Oak National Academy's AI Auto Eval tools provide LLM as a judge evaluation on lesson plans and resources☆17Nov 4, 2025Updated 3 months ago
- Speech to Text with self-supervised learning based on wav2vec 2.0 framework using Hugging Face's Transformer☆29Jun 1, 2021Updated 4 years ago
- ☆10Mar 18, 2024Updated last year
- Vstream - Video Analytics pipeline with Hardware based accelerations (dev - stage)☆10Feb 2, 2024Updated 2 years ago
- [ICML 2025] Official resources of "KBQA-o1: Agentic Knowledge Base Question Answering with Monte Carlo Tree Search".☆34Dec 6, 2025Updated 2 months ago
- Code for paper: "PromptCARE: Prompt Copyright Protection by Watermark Injection and Verification", IEEE S&P 2024.☆34Aug 10, 2024Updated last year
- Code for "Balanced Knowledge Distillation for Long-tailed Learning"☆29Oct 19, 2023Updated 2 years ago
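- Demo of knowledge-graph entity and relation extraction, knowledge-graph construction, and retrieval-augmented generation for building-code text data☆35Aug 7, 2024Updated last year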
- RLHF experiments on a single A100 40G GPU. Support PPO, GRPO, REINFORCE, RAFT, RLOO, ReMax, DeepSeek R1-Zero reproducing.☆79Feb 19, 2025Updated 11 months ago
- Dataset for "Video Crowd Localization with Multi-focus Gaussian Neighborhood Attention and a Large-Scale Benchmark"☆35Dec 9, 2025Updated 2 months ago
- This project showcases engaging interactions between two AI chatbots.☆10Jan 10, 2024Updated 2 years ago
- ☆39Feb 9, 2026Updated last week
- Finetuning & extending DiffusionDet to video & pedestrian multi-object-tracking☆13Apr 12, 2023Updated 2 years ago
- New word discovery / new word mining / degree of freedom / cohesion / python3☆10May 28, 2019Updated 6 years ago
- The official implementation of ACL'24 paper: Synergistic Interplay between Search and Large Language Models for Information Retrieval.☆36Jun 6, 2024Updated last year
- ☆147Jul 1, 2024Updated last year
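- A large language model training and testing tool built on HuggingFace. Supports web UI and terminal inference for each model, parameter-efficient and full-parameter training (pretraining, SFT, RM, PPO, DPO), and model merging and quantization.☆223Dec 8, 2023Updated 2 years ago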
- ☆13Jan 16, 2025Updated last year
- Deploy Yolo series algorithms on Hisilicon platform hi3516, including yolov3, yolov5, yolox, etc☆11Mar 25, 2022Updated 3 years ago
- The code and data for "Summary-Oriented Vision Modeling for Multimodal Abstractive Summarization"☆11May 16, 2023Updated 2 years ago
- Deploy a stripped-down yolov5 on HiSilicon devices☆13Nov 22, 2021Updated 4 years ago
- I have created a dataset of Image-Text-Pairs by using the cosine similarity of the CLIP embeddings of the image & its caption derived f…☆15Apr 22, 2021Updated 4 years ago
- Debug DeepSpeed-Chat step by step in an IDE☆10Apr 17, 2023Updated 2 years ago
- An implementation of MSSRM method☆11Mar 23, 2023Updated 2 years ago
- Visualization of Douban movie reviews☆10May 19, 2016Updated 9 years ago
- Detects scene change or cuts in a video file☆11Oct 23, 2017Updated 8 years ago