l294265421 / alpaca-rlhf

Fine-tuning LLaMA with RLHF (Reinforcement Learning from Human Feedback) based on DeepSpeed Chat
114 · Updated 2 years ago

Alternatives and similar repositories for alpaca-rlhf

Users who are interested in alpaca-rlhf are comparing it to the libraries listed below.
