SparkJiao / dpo-trajectory-reasoning

[EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing".
43 · Updated 2 months ago

Related projects

Alternatives and complementary repositories for dpo-trajectory-reasoning