openai / code-align-evals-data
☆28 Updated 3 years ago
Related projects
Alternatives and complementary repositories for code-align-evals-data
- ☆50 Updated 5 months ago
- Code for the paper "Efficient Training of Language Models to Fill in the Middle" ☆168 Updated last year
- [EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation ☆44 Updated 11 months ago
- ☆147 Updated 3 years ago
- ☆101 Updated 4 months ago
- A set of utilities for running few-shot prompting experiments on large language models ☆113 Updated last year
- CRUXEval: Code Reasoning, Understanding, and Execution Evaluation ☆115 Updated last month
- ☆75 Updated last year
- A hard gym for programming ☆140 Updated 4 months ago
- Code for the TMLR 2023 paper "PPOCoder: Execution-based Code Generation using Deep Reinforcement Learning" ☆97 Updated 10 months ago
- Script for downloading GitHub. ☆88 Updated 4 months ago
- Scratchpad/Chain-of-Thought Prompts ☆12 Updated 2 years ago
- ☆23 Updated 5 months ago
- ☆48 Updated last year
- Language Models of Code are Few-Shot Commonsense Learners (EMNLP 2022) ☆86 Updated last year
- Releasing code for "ReCode: Robustness Evaluation of Code Generation Models" ☆49 Updated 8 months ago
- Distill ChatGPT coding ability into a small model (1B) ☆24 Updated last year
- Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models ☆87 Updated last year
- Code for paper "LEVER: Learning to Verify Language-to-Code Generation with Execution" (ICML'23) ☆79 Updated last year
- A unified benchmark for math reasoning ☆87 Updated last year
- Data and code for the ICLR 2023 paper "Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning". ☆145 Updated 10 months ago
- CodeUltraFeedback: aligning large language models to coding preferences ☆65 Updated 4 months ago
- ☆81 Updated 4 months ago
- Code and data accompanying our paper on arXiv "Faithful Chain-of-Thought Reasoning". ☆155 Updated 6 months ago
- Code for the arXiv paper: "LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond" ☆59 Updated 7 months ago
- Accepted by Transactions on Machine Learning Research (TMLR) ☆120 Updated last month
- ☆52 Updated last year
- InstructCoder: Instruction Tuning Large Language Models for Code Editing | Oral ACL-2024 srw ☆52 Updated last month
- ☆175 Updated last year
- [EACL 2024] ICE-Score: Instructing Large Language Models to Evaluate Code ☆69 Updated 5 months ago