facebookresearch / RLCD

Reproduction of "RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment"
63 · Updated last year

Related projects: