thu-coai / CodePlan
☆15 · Updated 7 months ago
Alternatives and similar repositories for CodePlan
Users who are interested in CodePlan are comparing it to the libraries listed below.
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling ☆50 · Updated 5 months ago
- ☆47 · Updated 5 months ago
- ☆36 · Updated last month
- ☆56 · Updated 7 months ago
- ☆36 · Updated 8 months ago
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 accepted paper. ☆32 · Updated last year
- ☆82 · Updated last year
- ☆20 · Updated 7 months ago
- Source code of "Reasons to Reject? Aligning Language Models with Judgments" ☆58 · Updated last year
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning? ☆25 · Updated 2 months ago
- Official implementation of the paper "From Complex to Simple: Enhancing Multi-Constraint Complex Instruction Following Ability of Large L… ☆48 · Updated 11 months ago
- PreAct: Prediction Enhances Agent's Planning Ability (COLING 2025) ☆28 · Updated 5 months ago
- Conic10K: A large-scale dataset for closed-vocabulary math problem understanding. Accepted to EMNLP 2023 Findings. ☆26 · Updated last year
- Research without Re-search: Maximal Update Parametrization Yields Accurate Loss Prediction across Scales ☆32 · Updated last year
- ☆49 · Updated last year
- The paper list of multilingual pre-trained models (continually updated). ☆22 · Updated 11 months ago
- Code and models for the EMNLP 2024 paper "WPO: Enhancing RLHF with Weighted Preference Optimization" ☆40 · Updated 8 months ago
- Automatic prompt optimization framework for multi-step agent tasks. ☆32 · Updated 6 months ago
- The implementation of the paper "LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Fee… ☆39 · Updated 10 months ago
- ☆42 · Updated 2 months ago
- Fast LLM training codebase with dynamic strategy selection [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler] ☆37 · Updated last year
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data" ☆47 · Updated last year
- This repository collects research papers on learning from rewards in the context of post-training and test-time scaling of large language… ☆37 · Updated 3 weeks ago
- Official implementation of “Training on the Benchmark Is Not All You Need”. ☆32 · Updated 5 months ago
- [ACL-25] We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs. ☆63 · Updated 7 months ago
- ☆16 · Updated 10 months ago
- ☆32 · Updated 2 weeks ago
- Benchmarking Benchmark Leakage in Large Language Models ☆51 · Updated last year
- Reformatted Alignment ☆114 · Updated 8 months ago
- Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling ☆102 · Updated 4 months ago