WooooDyy / BAPO
Code for the paper "BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping" by Zhiheng Xi et al.
91 · Jan 29, 2026 · Updated last month

Alternatives and similar repositories for BAPO

Users that are interested in BAPO are comparing it to the libraries listed below
