yangzhch6 / ReSocratic
OptiBench and ReSocratic Synthesis Method
☆15 · Updated 4 months ago
Alternatives and similar repositories for ReSocratic:
Users who are interested in ReSocratic are comparing it to the repositories listed below.
- The training and inference code and data for LLMOPT ☆18 · Updated 2 weeks ago
- Code for paper: End-to-end Stochastic Optimization with Energy-based Model ☆16 · Updated 2 years ago
- Official implementation of the paper "Chain-of-Experts: When LLMs Meet Complex Operation Research Problems" ☆74 · Updated this week
- GenRM-CoT: Data release for verification rationales ☆47 · Updated 4 months ago
- Model Selection with Large Language Models for Reasoning (EMNLP2023 Findings) ☆29 · Updated last year
- Code and data used in the paper: "Training on Incorrect Synthetic Data via RL Scales LLM Math Reasoning Eight-Fold" ☆29 · Updated 8 months ago
- Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering ☆55 · Updated 2 months ago
- Code & data for ICLR 2024 spotlight paper: 🍯MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data ☆39 · Updated 8 months ago
- Code for the paper <SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning> ☆48 · Updated last year
- Code for the paper LEGO-Prover: Neural Theorem Proving with Growing Libraries ☆58 · Updated 11 months ago
- Official code and data repository of MathChat: MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Inte… ☆16 · Updated 8 months ago
- This is an official implementation of the Reward rAnked Fine-Tuning Algorithm (RAFT), also known as iterative best-of-n fine-tuning or re… ☆23 · Updated 5 months ago
- ORLM: Training Large Language Models for Optimization Modeling ☆88 · Updated 3 months ago
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference) ☆52 · Updated 4 months ago
- Code for paper "Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System" ☆52 · Updated 3 months ago
- Natural Language Reinforcement Learning ☆72 · Updated 2 months ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆115 · Updated 5 months ago
- Source codes for "Preference-grounded Token-level Guidance for Language Model Fine-tuning" (NeurIPS 2023). ☆15 · Updated last month
- Direct preference optimization with f-divergences. ☆13 · Updated 3 months ago
- e ☆22 · Updated last week