shunzh / Code-AI-Tree-Search
☆105 Updated 3 months ago
Related projects
Alternatives and complementary repositories for Code-AI-Tree-Search
- ☆75 Updated last year
- Code for paper "LEVER: Learning to Verify Language-to-Code Generation with Execution" (ICML'23) ☆79 Updated last year
- Code for the TMLR 2023 paper "PPOCoder: Execution-based Code Generation using Deep Reinforcement Learning" ☆96 Updated 10 months ago
- ☆85 Updated 11 months ago
- StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback ☆52 Updated 2 months ago
- [EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation ☆43 Updated 10 months ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆95 Updated 2 months ago
- Code for the paper <SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning> ☆44 Updated last year
- InstructCoder: Instruction Tuning Large Language Models for Code Editing | Oral ACL-2024 SRW ☆52 Updated last month
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference) ☆96 Updated last week
- Official Repo for ICLR 2024 paper MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback by Xingyao Wang*, Ziha… ☆104 Updated 5 months ago
- Data and code for the ICLR 2023 paper "Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning". ☆143 Updated 10 months ago
- A dataset of LLM-generated chain-of-thought steps annotated with mistake location. ☆72 Updated 2 months ago
- ☆23 Updated 4 months ago
- CodeUltraFeedback: aligning large language models to coding preferences ☆65 Updated 4 months ago
- Code and data accompanying our paper on arXiv "Faithful Chain-of-Thought Reasoning". ☆156 Updated 6 months ago
- ☆39 Updated 5 months ago
- Repository for NPHardEval, a quantified-dynamic benchmark of LLMs ☆48 Updated 7 months ago
- Official GitHub repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024] ☆127 Updated last month
- Can Language Models Solve Olympiad Programming? ☆100 Updated 3 months ago
- A repository for transformer critique learning and generation ☆85 Updated 11 months ago
- Reasoning with Language Model is Planning with World Model ☆144 Updated last year
- Chain-of-Hindsight, A Scalable RLHF Method ☆218 Updated last year
- Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging ☆96 Updated last year
- A framework for few-shot evaluation of autoregressive language models. ☆23 Updated 10 months ago
- ☆98 Updated 5 months ago
- Accepted by Transactions on Machine Learning Research (TMLR) ☆118 Updated last month
- [ICML 2023] "Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation", Wenqing Zheng, S P Sharan, Ajay Kumar Jaiswal, … ☆36 Updated last year
- ☆44 Updated last year
- CRUXEval: Code Reasoning, Understanding, and Execution Evaluation ☆111 Updated 3 weeks ago