hkust-nlp / simpleRL-reason

This is a replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data.
3,223 · Updated this week

Alternatives and similar repositories for simpleRL-reason:

Users who are interested in simpleRL-reason are comparing it to the libraries listed below.