xiwenc1 / DRA-GRPO

Official code for the paper: DRA-GRPO: Exploring Diversity-Aware Reward Adjustment for R1-Zero-Like Training of Large Language Models
15 · Updated last week

Alternatives and similar repositories for DRA-GRPO

Users who are interested in DRA-GRPO are comparing it to the libraries listed below.
