WeiXiongUST / Building-Math-Agents-with-Multi-Turn-Iterative-Preference-Learning

This is an official implementation of the paper "Building Math Agents with Multi-Turn Iterative Preference Learning" with multi-turn DPO and KTO.
15 · Updated last week

Related projects

Alternatives and complementary repositories for Building-Math-Agents-with-Multi-Turn-Iterative-Preference-Learning