RethinkFun/trian_ppo

☆147

Alternatives and similar repositories for trian_ppo

Users that are interested in trian_ppo are comparing it to the libraries listed below.

RethinkFun / sft
☆69 · Aug 23, 2024 · Updated last year
RethinkFun / LLM
☆141 · Aug 8, 2024 · Updated last year
owenliang / qwen-dpo
DPO training for Tongyi Qianwen (Qwen)
☆66 · Sep 21, 2024 · Updated last year
KMnO4-zx / tiny-llm
☆34 · Jul 8, 2025 · Updated last year
injadlu / DAMA
[ICML 2025] Official code of "DAMA: Data- and Model-aware Alignment of Multi-modal LLMs"
☆16 · May 24, 2025 · Updated last year
Darren-greenhand / LLaVA_OpenVLA
Converted the training data of OpenVLA into a general form of multimodal training instructions, then used it with LLaVA-OneVision
☆24 · Jan 12, 2025 · Updated last year
emirhanbayar / Fast-StrongSORT
StrongSORT with Selective Feature Extraction Mechanism
☆16 · Sep 25, 2024 · Updated last year
wyf3 / llm_related
Reproductions of LLM-related algorithms and some study notes
☆3,452 · Jul 2, 2026 · Updated last week
chunhuizhang / llm_rl
llm & rl
☆289 · Oct 24, 2025 · Updated 8 months ago
songyiren725 / EasyText
Code Implementation of the Paper: EasyText: Controllable Diffusion Transformer for Multilingual Text Rendering
☆56 · Jun 16, 2025 · Updated last year
Zeyi-Lin / easy-r1
Train deepseek r1-like reasoning LLM with ease
☆20 · Feb 15, 2025 · Updated last year
hyqshr / Pybullet-Gym-Drones
Drone reinforcement learning with multiple tasks in pybullet and OpenAI Gym environment
☆15 · Sep 17, 2023 · Updated 2 years ago
Embodied-VideoAgent / embodied-videoagent
☆49 · Aug 26, 2025 · Updated 10 months ago
CRIPAC-DIG / LogicCheckGPT
[ACL 2024] Logical Closed Loop: Uncovering Object Hallucinations in Large Vision-Language Models. Detect and mitigate object hallucinatio…
☆25 · Jan 31, 2025 · Updated last year
StarWalkin / UI-NEXUS
This is the official repository of the paper "Atomic-to-Compositional Generalization for Mobile Agents with A New Benchmark and Schedulin…
☆14 · Jul 27, 2025 · Updated 11 months ago
flashinfer-ai / debug-print
Debug print operator for cudagraph debugging
☆18 · Aug 2, 2024 · Updated last year
cyugai / G1_AMO_control
Unitree G1 VR teleoperation
☆27 · Oct 10, 2025 · Updated 8 months ago
TUDelft-DataDrivenControl / COFLEX
COntrol scheme for large and FLEXible wind turbines
☆13 · Nov 26, 2024 · Updated last year
wilcoschoneveld / opticflow
Optical flow with convolutional neural networks for vision-based guidance of UAS
☆11 · Aug 23, 2017 · Updated 8 years ago
kyegomez / FastFF
Zeta implementation of a reusable, plug-and-play feedforward from the paper "Exponentially Faster Language Modeling"
☆16 · Nov 11, 2024 · Updated last year
ChrisIsKing / zero-shot-text-classification
Repository for the Findings of ACL'23 paper Label Agnostic Pre-training for Zero-shot Text Classification
☆13 · Aug 10, 2023 · Updated 2 years ago
rllab-snu / Spectral-Risk-Constrained-RL
Official Github Repository for "Spectral-Risk Safe Reinforcement Learning with Convergence Guarantees". (NeurIPS 2024)
☆11 · Nov 30, 2025 · Updated 7 months ago
MIT-REALM / efppo
☆11 · Mar 5, 2024 · Updated 2 years ago
wlll123456 / study_rlhf
☆108 · Jul 24, 2025 · Updated 11 months ago
liangyuwang / Tiny-DeepSpeed
Tiny-DeepSpeed, a minimalistic re-implementation of the DeepSpeed library
☆52 · Aug 20, 2025 · Updated 10 months ago
cooelf / dive-into-llms
Dive-into-LLMs Tutorial for Beginners
☆26 · May 14, 2024 · Updated 2 years ago
thingsboard / rule-node-examples-ui-ngx
☆13 · Jan 19, 2026 · Updated 5 months ago
sdiehl / prm
Library for training process reward models
☆29 · Jun 3, 2025 · Updated last year
agil27 / Quentain
An implementation of the poker game Guandan, popular in Jiangsu and Anhui, China
☆11 · Jan 14, 2023 · Updated 3 years ago
ManuelFay / Tutorials
Quick Notebook Tutorials
☆36 · Jul 17, 2025 · Updated 11 months ago
mocibb / cs336
☆96 · Jul 20, 2025 · Updated 11 months ago
zhangkai0425 / SGEMM-HPC
Implementation and optimization of matrix multiplication on single CPU (HPC-THU-2023-Autumn)
☆18 · Feb 27, 2024 · Updated 2 years ago
SkalskiP / segment-anything-2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained mode…
☆13 · Jul 30, 2024 · Updated last year
mahaitongdae / Safety_Index_Synthesis
Code for L4DC 2022 paper: Joint Synthesis of Safety Certificate and Safe Control Policy Using Constrained Reinforcement Learning.
☆15 · Jul 31, 2023 · Updated 2 years ago
WaelYasmina / blendertothree
☆11 · May 21, 2023 · Updated 3 years ago
yuanzhoulvpi2017 / SentenceEmbedding
☆121 · Jun 30, 2024 · Updated 2 years ago
caowm / thingsboard-widget
thingsboard widget
☆12 · Aug 8, 2022 · Updated 3 years ago
henryhcliu / Compliant-Grabbing-of-Dexterous-Hand-in-Complex-Scenarios-for-a-Visual-Guided-Collaborative-Robot
This repository contains the necessary codes driving the InspireHand Dexterous Hand (right one) to open and close based on the feedback o…
☆16 · Jan 9, 2023 · Updated 3 years ago
anthony-wss / glm-4-voice-finetune
☆14 · Apr 4, 2025 · Updated last year