firechecking / CleanRL

Reinforcement learning algorithms and use cases, including DQN, PG, A3C, and PPO, as well as RLHF and AlphaZero implementations. Designed for clarity, ease of use, and educational purposes.
26 · Updated 5 months ago

Related projects

Alternatives and complementary repositories for CleanRL