RLHFlow / RAFT

This is an official implementation of the Reward rAnked Fine-Tuning (RAFT) algorithm, also known as iterative best-of-n fine-tuning or rejection-sampling fine-tuning.
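
To make the "iterative best-of-n" idea concrete, here is a minimal sketch of the loop that name suggests. It is not the repository's code. The names `best_of_n`, `collect_best_of_n`, `raft`, and the `sample`, `reward`, and `fine_tune` callables are hypothetical placeholders. A real setup would plug in a language model, a learned reward model, and a supervised fine-tuning step.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical interfaces, assumed for illustration only:
#   Sampler:   (prompt, rng) -> one candidate response
#   Reward:    (prompt, response) -> scalar score, higher is better
#   FineTuner: (sampler, selected pairs) -> updated sampler
Sampler = Callable[[str, random.Random], str]
Reward = Callable[[str, str], float]
FineTuner = Callable[[Sampler, List[Tuple[str, str]]], Sampler]


def best_of_n(prompt: str, sample: Sampler, reward: Reward,
              n: int, rng: random.Random) -> str:
    """Draw n candidates for one prompt and keep the highest-reward one."""
    candidates = [sample(prompt, rng) for _ in range(n)]
    return max(candidates, key=lambda response: reward(prompt, response))


def collect_best_of_n(prompts: List[str], sample: Sampler, reward: Reward,
                      n: int, rng: random.Random) -> List[Tuple[str, str]]:
    """Build a fine-tuning set of (prompt, best response) pairs."""
    return [(p, best_of_n(p, sample, reward, n, rng)) for p in prompts]


def raft(prompts: List[str], sample: Sampler, reward: Reward,
         fine_tune: FineTuner, n: int, iterations: int,
         seed: int = 0) -> Sampler:
    """Alternate between best-of-n data collection and fine-tuning."""
    rng = random.Random(seed)
    for _ in range(iterations):
        selected = collect_best_of_n(prompts, sample, reward, n, rng)
        sample = fine_tune(sample, selected)  # placeholder for an SFT step
    return sample


# Toy stand-ins so the sketch runs without any model.
def toy_sample(prompt: str, rng: random.Random) -> str:
    return rng.choice(["ok", "fine", "a longer answer"])


def toy_reward(prompt: str, response: str) -> float:
    return float(len(response))  # pretend longer responses are better


def toy_fine_tune(sample: Sampler, data: List[Tuple[str, str]]) -> Sampler:
    return sample  # no-op: a real step would train on `data`


if __name__ == "__main__":
    raft(["q1", "q2"], toy_sample, toy_reward, toy_fine_tune,
         n=4, iterations=2)
```

The selection step keeps exactly one response per prompt here. Keeping the top few, or filtering by a reward threshold, would be variations on the same sketch.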

Related projects

Alternatives and complementary repositories for RAFT