OctopusMind/RLHF_PPO

PPO algorithm implementation

☆41

Alternatives and similar repositories for RLHF_PPO

Users that are interested in RLHF_PPO are comparing it to the libraries listed below.

abnerjacobsen / fastapi-mvc-loguru-demo
Demo app with Loguru logging, async middleware to generate X-request-Id. Works with Gunicorn or Uvicorn, and is safe to use with async/th…
☆10 · Feb 2, 2022 · Updated 4 years ago
GHamrouni / OptimalCrop
Crop the image by locating the interesting parts.
☆15 · Apr 12, 2017 · Updated 9 years ago
m-shilpa / Transformer_Memory_As_A_Differentiable_Search_Index
Implementation of the paper by Google, Transformer Memory As A Differentiable Search Index
☆16 · May 27, 2022 · Updated 4 years ago
lanpay-lulu / deeplearning
☆10 · Jan 4, 2017 · Updated 9 years ago
Attila94 / CEConv
Official repository for Color Equivariant Convolutional Networks.
☆10 · Nov 16, 2023 · Updated 2 years ago
Lingyun0419 / CVPT
Cross Visual Prompt Tuning [ICCV 2025]
☆13 · Aug 3, 2025 · Updated 11 months ago
SongDark / Mecos-tf
Meta-learning-based Cold-Start Sequential Recommendation
☆16 · May 25, 2021 · Updated 5 years ago
shanksXU / BaiduBodyAnalyzeDemo
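Baidu body analysis demo: body keypoints, body attributes, gesture recognition, portrait segmentation, crowd counting, driving behavior analysis (invite-only beta), and dynamic crowd counting (invite-only beta)
☆15 · Nov 29, 2018 · Updated 7 years ago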
apachecn / awesome-gb-dev-zh
GB development resource list
☆18 · Dec 14, 2022 · Updated 3 years ago
asaddi / f5-tts-serve
A simple wrapper around "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" that provides an OpenAI-compatibl…
☆14 · Feb 7, 2025 · Updated last year
RUCAIBox / MML
☆16 · Mar 30, 2023 · Updated 3 years ago
RQuispeC / saliency-semantic-parsing-reid
Official implementation of "Improved Person Re-Identification Based on Saliency and Semantic Parsing with Deep Neural Network Models" IMA…
☆14 · Dec 23, 2019 · Updated 6 years ago
KiraYeetar / OneTrans_Tensorlfow
OneTrans from ByteDance (TensorFlow).
☆18 · Jan 20, 2026 · Updated 6 months ago
lucheng2 / uestc-cv-yangyang-homework
Homework code for the Advanced Computer Vision course at the University of Electronic Science and Technology of China (UESTC)
☆13 · Sep 5, 2020 · Updated 5 years ago
yeonjun-in / torch-SP-AGCL
The official source code for Similarity Preserving Adversarial Graph Contrastive Learning (SP-AGCL) at KDD 2023.
☆23 · Jan 18, 2024 · Updated 2 years ago
CroitoruAlin / dlcr
☆17 · Feb 24, 2025 · Updated last year
h9-tec / Qwen3_chat_local
☆10 · Apr 30, 2025 · Updated last year
lingxiao-he / Deep-Spatial-Feature-Reconstruction-for-Partial-Person-Re-identification
DSR
☆15 · Apr 25, 2018 · Updated 8 years ago
wouterkool / ancestral-gumbel-top-k-sampling
Ancestral Gumbel-Top-k Sampling
☆24 · Apr 11, 2020 · Updated 6 years ago
TungChintao / SkiLa
Official codes of "Sketch-in-Latents: Eliciting Unified Reasoning in MLLMs"
☆17 · Feb 15, 2026 · Updated 5 months ago
ENCCS / gnn_transformers_notebooks
Notebooks for the ENCCS Graph Neural Networks and Transformers workshop
☆16 · Apr 19, 2023 · Updated 3 years ago
eda-ricercatore / Modica-SRAM
Design of a 32-kbit synchronous SRAM with 32-bit words, using 180 nm process technology. Developed MATLAB scripts to evaluate architectu…
☆16 · Apr 28, 2021 · Updated 5 years ago
Complicateddd / PaddlePL
2nd Shandong Province Data Application Innovation and Entrepreneurship Competition, main track: laboratory test report recognition baseline
☆13 · Jan 15, 2021 · Updated 5 years ago
limenlp / safer-instruct
This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"
☆17 · Feb 22, 2024 · Updated 2 years ago
alycialee / beyond-scale-language-data-diversity
☆13 · Updated this week
nzjin / awesome_moe
A collection of MoE (Mixture of Experts) papers, code, and tools.
☆12 · Mar 15, 2024 · Updated 2 years ago
duanyiqun / Auto-ReID-Fast
A PyTorch implementation that uses DARTS to search for a better network structure for Re-ID
☆53 · Apr 20, 2021 · Updated 5 years ago
alon-albalak / online-data-mixing
An implementation of online data mixing for the Pile dataset, based on the GPT-NeoX library.
☆14 · Jan 9, 2024 · Updated 2 years ago
Furyton / GR-as-MVDR
[SIGIR'24] Generative Retrieval as Multi-Vector Dense Retrieval
☆36 · Oct 18, 2024 · Updated last year
Accio-Lab / SwimBird
☆18 · Apr 9, 2026 · Updated 3 months ago
HaohanWang / PAR
Implementation of Patch-wise Adversarial Regularization from "Learning Robust Global Representations by Penalizing Local Predictive Power…
☆18 · Oct 27, 2019 · Updated 6 years ago
tianyi-lab / MiP-Overthinking
[COLM'25] Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?
☆39 · Jun 5, 2025 · Updated last year
Hlufies / OneTinyRAG
One Tiny RAG-Powered LLM Framework: Knowledge-Enhanced Generative AI Demo
☆35 · Jul 7, 2025 · Updated last year
Chocolate-Black / Langchain-MO-AI-Chat
An AI dialogue system based on Langchain-Chatchat and BERT-VITS2
☆21 · Mar 20, 2024 · Updated 2 years ago
paulpjoby / DynGEM
Btech S8 Main Project
☆24 · Jun 16, 2019 · Updated 7 years ago
CodeDuoGun / deepseek_lora
LoRA SFT on medical-domain data, based on the DeepSeek and Qwen3 large models
☆15 · Apr 10, 2026 · Updated 3 months ago
Paul33333 / Agentic_RAG
Local DeepSearch (Advantage: Low Threshold): an implementation of Agentic RAG based on DeepSeek-R1 API and Tavily API
☆17 · Jun 21, 2025 · Updated last year
fannie1208 / CaNet
[WWW2024 Oral] "Graph Out-of-Distribution Generalization via Causal Intervention".
☆26 · Oct 23, 2024 · Updated last year
feiyang-k / AutoScale
Official Code Repository for [AutoScale📈: Scale-Aware Data Mixing for Pre-Training LLMs] Published as a conference paper at **COLM 2025*…
☆14 · Aug 8, 2025 · Updated 11 months ago