yuniaXian / ppo_llm_DeepSpeed
Customized LLM PPO (reinforcement learning) pipeline with DeepSpeed. For Amex external usage. Trains a reward model and actor-critic models against a reference supervised fine-tuned model.
☆1 Updated last year
Alternatives and similar repositories for ppo_llm_DeepSpeed
Users who are interested in ppo_llm_DeepSpeed are comparing it to the libraries listed below.
- Implementation of a knowledge-graph-to-text model, integrated with Fairseq (Meta FAIR research library) ☆2 Updated last year
- ☆1 Updated last year
- Find channel admin count ☆21 Updated 7 years ago
- TypeScript command handler ☆23 Updated last year
- Extension of SMx crypto support for the Go standard library ☆2 Updated 2 years ago
- Full Stack Lottery Web Application ☆2 Updated 6 months ago
- A fantastically simple tagging component for your React projects ☆3 Updated 6 years ago