Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs"
☆606 · Oct 7, 2025 · Updated 6 months ago
Alternatives and similar repositories for nano-aha-moment
Users that are interested in nano-aha-moment are comparing it to the libraries listed below.
- SafeArena is a benchmark for assessing the harmful capabilities of web agents ☆21 · Apr 23, 2025 · Updated 11 months ago
- Implementing DeepSeek R1's GRPO algorithm from scratch ☆1,834 · Apr 18, 2025 · Updated last year
- Simple repository for training small reasoning models