huggingface / trl

Train transformer language models with reinforcement learning.
☆9,288 · Updated this week

Related projects: