Trae1ounG / BuPO
[arxiv: 2512.19673] Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies
☆59 Feb 6, 2026 Updated last week
Alternatives and similar repositories for BuPO
Users that are interested in BuPO are comparing it to the libraries listed below
- Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning". ☆29 Sep 19, 2025 Updated 4 months ago
- ☆20 Dec 14, 2024 Updated last year
- [AAAI 2025] Neural-Symbolic Collaborative Distillation: Advancing Small Language Models for Complex Reasoning Tasks