sanowl / Self-Correcting-LLM--Reinforcement-Learning-

This is my attempt to create a self-correcting LLM based on the paper "Training Language Models to Self-Correct via Reinforcement Learning" by Google.
☆16 · Updated last week

Related projects

Alternatives and complementary repositories for Self-Correcting-LLM--Reinforcement-Learning-