ssbuild/llm_rlhf

ssbuild / llm_rlhf

Reinforcement learning training for LLMs such as gpt2, llama, and bloom

☆27

Alternatives and similar repositories for llm_rlhf

Users that are interested in llm_rlhf are comparing it to the libraries listed below.

ssbuild / llm_finetuning
Large language model fine-tuning for bloom, opt, gpt, gpt2, llama, llama-2, cpmant, and so on
☆99 • Apr 24, 2024 • Updated 2 years ago
ysunbp / RECA-paper
Code and data for the VLDB 2023 paper: RECA: Related Tables Enhanced Column Semantic Type Annotation Framework
☆12 • May 7, 2025 • Updated last year
yangyaofei / docker-openclash
Docker image for OpenClash
☆12 • Dec 20, 2022 • Updated 3 years ago
megagonlabs / sudowoodo
The source code of the Sudowoodo paper in ICDE 2023
☆19 • May 24, 2023 • Updated 3 years ago
LoveCatc / supervised-llm-uncertainty-estimation
This repo contains code for the paper: "Uncertainty Estimation and Quantification for LLMs: A Simple Supervised Approach".
☆26 • Oct 21, 2024 • Updated last year
OpenBuddy / GrandSage
☆16 • May 31, 2024 • Updated 2 years ago
owenonline / Knowledge-Graph-Reasoning-with-Self-supervised-Reinforcement-Learning
Reinforcement learning (RL) is an effective method to find reasoning pathways in incomplete knowledge graphs (KGs). To overcome the chall…
☆26 • May 30, 2026 • Updated last month
masonsxu / TextLabel
A data annotation tool (modeled on Baidu's online annotation platform)
☆13 • Jul 5, 2021 • Updated 5 years ago
megagonlabs / starmie
Resources for PVLDB 2023 submission
☆29 • Aug 28, 2024 • Updated last year
LHRLAB / HAHE
[ACL 2023] Official resources of "HAHE: Hierarchical Attention for Hyper-Relational Knowledge Graphs in Global and Local Level".
☆28 • Aug 18, 2025 • Updated 11 months ago
mfarisadip / T5-rlhf-pytorch
Implementation of RLHF (Reinforcement Learning from Human Feedback) and GAN (Generative Adversarial Network) on top of the T5 architectur…
☆17 • Jan 2, 2023 • Updated 3 years ago
ruc-datalab / Unicorn
☆32 • Apr 15, 2023 • Updated 3 years ago
wangf3014 / VTok
Official implementation of VTok: A Unified Video Tokenizer with Decoupled Spatial-Temporal Latents
☆15 • Feb 5, 2026 • Updated 5 months ago
Miraclemarvel55 / ChatGLM-RLHF
Apply RLHF directly to ChatGLM to raise or lower the probability of target outputs | Modify ChatGLM output with only RLHF
☆196 • May 23, 2023 • Updated 3 years ago
Zeng-WH / Prompt-Tuning
Implementation of "The Power of Scale for Parameter-Efficient Prompt Tuning"
☆59 • Jun 27, 2022 • Updated 4 years ago
lilongxian / BaiYang-chatGLM2-6B
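(1) Rotary position embedding encoder with elastic-interval normalization plus peft LoRA quantized training, improving support for contexts on the order of ten thousand tokens. (2) Evidence-theory-based explanation learning, improving the model's complex logical reasoning. (3) Compatible with the alpaca data format.
☆43 • Jul 19, 2023 • Updated 3 years ago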
arnab-api / Logit-Lens-Interpreting-GPT-2
☆16 • Jan 31, 2023 • Updated 3 years ago
jakelever / knowledgediscovery
Analysis code for knowledge discovery project
☆12 • Sep 25, 2018 • Updated 7 years ago
CLUEbenchmark / Math24o
Math24o: High School Olympiad Mathematics Chinese Benchmark
☆14 • Mar 27, 2025 • Updated last year
NMS05 / DinoV2-BERT-CLIP
A simple PyTorch implementation of CLIP model using DinoV2 and BERT
☆16 • Sep 26, 2023 • Updated 2 years ago
easonnie / ResEncoder
This repo is for residual-connected sentence encoder for NLI.
☆11 • Jan 21, 2018 • Updated 8 years ago
lijiaqi0612 / UIE-ACL-310
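A general entity, relation, and event extraction task requires the UIE framework and must be deployed on an Ascend 310 server. Because the UIE model is built on ernie3.0, and paddle does not yet officially support deploying ernie3.0 models on the Ascend 310, the following procedure was devised. The main process is to first train the model with paddle…
☆21 • Aug 1, 2022 • Updated 3 years ago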
cambridgeltl / SIPHS
☆15 • Sep 20, 2018 • Updated 7 years ago
oobabooga / AI-Notebooks
☆13 • Oct 22, 2023 • Updated 2 years ago
stogiannidis / srbench
Source code for the Paper "Mind the Gap: Benchmarking Spatial Reasoning in Vision-Language Models"
☆19 • Feb 1, 2026 • Updated 5 months ago
RHKeng / ShenCeCup
A DataCastle competition on text keyword extraction. Ranked 6 / 622.
☆16 • Jan 27, 2019 • Updated 7 years ago
greenelab / knowledge-graph-review
A literature review for constructing and using knowledge graphs in a biomedical setting.
☆11 • May 22, 2020 • Updated 6 years ago
oobabooga / EasyLM
Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl…
☆11 • Apr 26, 2023 • Updated 3 years ago
HazyResearch / ddbiolib
DeepDive Biomedical Tools
☆15 • Apr 3, 2017 • Updated 9 years ago
sauc-abadal / ALT
Official repository for ALT (ALignment with Textual feedback).
☆10 • Jul 25, 2024 • Updated 2 years ago
pome223 / ModalMixLab
☆14 • Feb 7, 2025 • Updated last year
zhu-minjun / SafetyLock
Your finetuned model's back to its original safety standards faster than you can say "SafetyLock"!
☆11 • Oct 16, 2024 • Updated last year
pacman100 / peft-codegen-25
☆23 • Jul 10, 2023 • Updated 3 years ago
Leesine / code2graph
☆10 • Dec 9, 2024 • Updated last year
floriscornel / ChatRPC
ChatRPC is a framework that allows large language models to interact with external services.
☆10 • Dec 25, 2023 • Updated 2 years ago
Li-TianCheng / MemPool
High-performance C++ memory pool
☆11 • May 10, 2021 • Updated 5 years ago
THUDM / FasterTransformer
Transformer related optimization, including BERT, GPT
☆39 • Feb 10, 2023 • Updated 3 years ago
dengweihuan / t-SNE
t-Distributed Stochastic Neighbor Embedding applied to the hyperspectral dataset and the generated feature maps.
☆11 • Jun 2, 2022 • Updated 4 years ago
BitGeek29 / awesome-abandoned-research
☆15 • Jun 19, 2025 • Updated last year