[ICML 2023] Data and code release for the paper "DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation".
☆267 · Oct 30, 2024 · Updated last year
Alternatives and similar repositories for DS-1000
Users that are interested in DS-1000 are comparing it to the libraries listed below.
- ☆34 · Mar 21, 2026 · Updated last week
- [EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation ☆49 · Dec 22, 2023 · Updated 2 years ago
- A framework for the evaluation of autoregressive code generation language models. ☆1,024 · Jul 22, 2025 · Updated 8 months ago
- ☆17 · Dec 9, 2022 · Updated 3 years ago
- [ICLR'25] BigCodeBench: Benchmarking Code Generation Towards AGI ☆488 · Jan 3, 2026 · Updated 2 months ago
- [ICLR 2023] Code for the paper "Binding Language Models in Symbolic Languages" ☆325 · Aug 25, 2023 · Updated 2 years ago
- Rigourous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 2024 ☆1,700 · Oct 2, 2025 · Updated 5 months ago
- 🐙 OctoPack: Instruction Tuning Code Large Language Models ☆479 · Feb 5, 2025 · Updated last year
- [ICLR 2023] Code for our paper "Selective Annotation Makes Language Models Better Few-Shot Learners" ☆109 · Jul 15, 2023 · Updated 2 years ago
- code for "Natural Language to Code Translation with Execution" ☆41 · Nov 2, 2022 · Updated 3 years ago
- CRUXEval: Code Reasoning, Understanding, and Execution Evaluation ☆168 · Oct 11, 2024 · Updated last year
- Code for the paper "Evaluating Large Language Models Trained on Code" ☆3,176 · Jan 17, 2025 · Updated last year
- Code for generating the JuICe dataset. ☆37 · Oct 27, 2021 · Updated 4 years ago
- APPS: Automated Programming Progress Standard (NeurIPS 2021) ☆521 · Jun 19, 2024 · Updated last year
- CliniQG4QA: Generating Diverse Questions for Domain Adaptation of Clinical Question Answering ☆23 · Feb 26, 2021 · Updated 5 years ago
- [EMNLP 2022] Unifying and multi-tasking structured knowledge grounding with language models ☆569 · Aug 22, 2023 · Updated 2 years ago
- A MBTI test on Large Language Model like GPT-3. ☆27 · May 2, 2022 · Updated 3 years ago
- Contests based Dataset for Code Generation ☆13 · Dec 11, 2022 · Updated 3 years ago
- ☆10 · Apr 15, 2023 · Updated 2 years ago
- Paper collections of methods that using language to interact with environment, including interact with real world, simulated world or WWW… ☆129 · Jul 26, 2023 · Updated 2 years ago
- Lyra: A Benchmark for Turducken-Style Code Generation ☆15 · Apr 22, 2022 · Updated 3 years ago
- A multi-programming language benchmark for LLMs ☆299 · Jan 28, 2026 · Updated 2 months ago
- ☆14 · Aug 18, 2022 · Updated 3 years ago
- ☆54 · Aug 25, 2023 · Updated 2 years ago
- Source code for paper: INTERVENOR : Prompt the Coding Ability of Large Language Models with the Interactive Chain of Repairing ☆30 · Nov 25, 2024 · Updated last year
- [ACL 2024] Code for the paper "ALaRM: Align Language Models via Hierarchical Rewards Modeling" ☆25 · Mar 28, 2024 · Updated 2 years ago
- Mapping Language to Code in a Programmatic Context ☆80 · Jan 27, 2021 · Updated 5 years ago
- Official repository of the paper: Marking Code Without Breaking It: Code Watermarking for Detecting LLM-Generated Code (Findings of EACL … ☆12 · Feb 11, 2026 · Updated last month
- The CodeInsight dataset is designed for code generation tasks, providing developers with expert-curated examples that bridge the gap betw… ☆14 · Oct 22, 2024 · Updated last year
- [ICLR 2024] Lemur: Open Foundation Models for Language Agents ☆557 · Oct 28, 2023 · Updated 2 years ago
- The official repo for "AceCoder: Acing Coder RL via Automated Test-Case Synthesis" [ACL25] ☆99 · Apr 9, 2025 · Updated 11 months ago
- A repository containing the Jupyter notebook code generation benchmark. ☆59 · Feb 9, 2022 · Updated 4 years ago
- CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion (NeurIPS 2023) ☆176 · Aug 15, 2025 · Updated 7 months ago
- Official repository for the paper "COAST: Enhancing the Code Debugging Ability of LLMs through Communicative Agent Based Data Synthesis". ☆18 · Feb 19, 2025 · Updated last year
- [EACL'23] MCoNaLa: A Benchmark for Code Generation from Multiple Natural Languages ☆23 · Feb 13, 2023 · Updated 3 years ago
- ☆675 · Nov 1, 2024 · Updated last year
- CodeXGLUE ☆1,810 · Apr 23, 2024 · Updated last year
- 🤖ConvRe🤯: An Investigation of LLMs’ Inefficacy in Understanding Converse Relations (EMNLP 2023) ☆24 · Oct 10, 2023 · Updated 2 years ago
- Official repository for the paper "LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code" ☆823 · Jul 16, 2025 · Updated 8 months ago