[EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation
☆49 Dec 22, 2023 Updated 2 years ago
Alternatives and similar repositories for odex
Users who are interested in odex are comparing it to the libraries listed below
- [NeurIPS 2024] Self-Optimization Improves the Efficiency of Code Generation ☆14 May 10, 2025 Updated 9 months ago
- RACE is a multi-dimensional benchmark for code generation that focuses on Readability, mAintainability, Correctness, and Efficiency. ☆12 Oct 12, 2024 Updated last year
- code for "Natural Language to Code Translation with Execution" ☆41 Nov 2, 2022 Updated 3 years ago
- ☆33 Updated this week
- Code generation from natural language with less prior and more monolingual data ☆13 Aug 24, 2021 Updated 4 years ago
- StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback ☆74 Aug 31, 2024 Updated last year
- ☆16 Apr 9, 2021 Updated 4 years ago
- ☆18 Apr 15, 2024 Updated last year
- [ICML 2023] Data and code release for the paper "DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation". ☆267 Oct 30, 2024 Updated last year
- The dataset and source code for our paper: "Did You Ask a Good Question? A Cross-Domain Question Intention Classification Benchmark for Te… ☆32 Jul 5, 2021 Updated 4 years ago
- ☆54 Aug 25, 2023 Updated 2 years ago
- Open source code and data for AAAI 2022 Oral Paper "Text is no more Enough! A Benchmark for Profile-based Spoken Language Understanding" ☆35 May 26, 2024 Updated last year
- Releasing code for "ReCode: Robustness Evaluation of Code Generation Models" ☆58 Mar 20, 2024 Updated last year
- CRUXEval: Code Reasoning, Understanding, and Execution Evaluation ☆167 Oct 11, 2024 Updated last year
- ☆11 Jul 20, 2021 Updated 4 years ago
- OSWorld-Human: Benchmarking the Efficiency of Computer-Use Agents ☆21 Jan 6, 2026 Updated 2 months ago
- Data and code for "DocPrompting: Generating Code by Retrieving the Docs" @ICLR 2023 ☆251 Dec 15, 2023 Updated 2 years ago
- 🤖ConvRe🤯: An Investigation of LLMs’ Inefficacy in Understanding Converse Relations (EMNLP 2023) ☆24 Oct 10, 2023 Updated 2 years ago
- [SUKI'22] Table Retrieval May Not Necessitate Table-Specific Model Design ☆23 Sep 23, 2022 Updated 3 years ago
- We introduce FixEval, a dataset for competitive programming bug fixing along with a comprehensive test suite and show the necessity of e… ☆26 Aug 31, 2022 Updated 3 years ago
- ☆43 Jan 1, 2025 Updated last year
- CodeUltraFeedback: aligning large language models to coding preferences (TOSEM 2025) ☆73 Jun 25, 2024 Updated last year
- InstructCoder: Instruction Tuning Large Language Models for Code Editing | Oral ACL-2024 srw ☆64 Oct 4, 2024 Updated last year
- Align, a general text alignment function ☆15 Dec 7, 2023 Updated 2 years ago
- Background materials for the article "Productivity Assessment of Neural Code Completion" ☆13 Jul 11, 2023 Updated 2 years ago
- benchmarks for evaluating MT models ☆11 Jun 26, 2024 Updated last year
- Code base of In-Context Learning for Dialogue State tracking