openai/human-eval-infilling

Code for the paper "Efficient Training of Language Models to Fill in the Middle"

☆209

Alternatives and similar repositories for human-eval-infilling

Users who are interested in human-eval-infilling are comparing it to the libraries listed below.

gonglinyuan / safim
☆49 · May 6, 2025 · Updated last year
nuprl / MultiPL-E
A multi-programming language benchmark for LLMs
☆314 · Apr 12, 2026 · Updated 3 months ago
amazon-science / cceval
CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion (NeurIPS 2023)
☆182 · Aug 15, 2025 · Updated 11 months ago
bigcode-project / bigcode-evaluation-harness
A framework for the evaluation of autoregressive code generation language models.
☆1,055 · Jul 22, 2025 · Updated last year
Leolty / repobench
✨ RepoBench: Benchmarking Repository-Level Code Auto-Completion Systems - ICLR 2024
☆214 · Aug 16, 2024 · Updated last year
jamesmurdza / humaneval-results
Evaluation results of code generation LLMs
☆32 · Sep 1, 2023 · Updated 2 years ago
openai / human-eval
Code for the paper "Evaluating Large Language Models Trained on Code"
☆3,324 · Jan 17, 2025 · Updated last year
amazon-science / recode
Releasing code for "ReCode: Robustness Evaluation of Code Generation Models"
☆58 · Mar 20, 2024 · Updated 2 years ago
bigcode-project / bigcode-dataset
☆497 · Aug 15, 2024 · Updated last year
nickrosh / evol-teacher
Open Source WizardCoder Dataset
☆166 · Jul 12, 2023 · Updated 3 years ago
Princeton-SysML / kNNLM_privacy
Official implementation of Privacy Implications of Retrieval-Based Language Models (EMNLP 2023). https://arxiv.org/abs/2305.14888
☆37 · Jun 10, 2024 · Updated 2 years ago
aixcoder-plugin / nl2code-dataset
Aix-bench, a Java benchmark for code synthesis problems.
☆52 · Aug 19, 2022 · Updated 3 years ago
ZZR0 / CodeAttack
Adversarial Attack for Pre-trained Code Models
☆10 · Jul 19, 2022 · Updated 4 years ago
amazon-science / mxeval
☆113 · Jul 17, 2024 · Updated 2 years ago
ntunlp / ExecEval
A distributed, extensible, secure solution for evaluating machine generated code with unit tests in multiple programming languages.
☆64 · Oct 21, 2024 · Updated last year
tianyi-lab / Mosaic-IT
[ACL'25] Mosaic-IT: Cost-Free Compositional Data Synthesis for Instruction Tuning
☆20 · Sep 27, 2025 · Updated 10 months ago
reddy-lab-code-research / PPOCoder
Code for the TMLR 2023 paper "PPOCoder: Execution-based Code Generation using Deep Reinforcement Learning"
☆116 · Jan 9, 2024 · Updated 2 years ago
my-other-github-account / llm-humaneval-benchmarks
☆86 · May 15, 2026 · Updated 2 months ago
dpfried / incoder
Generative model for code infilling and synthesis
☆313 · Sep 9, 2023 · Updated 2 years ago
salesforce / CodeRL
This is the official code for the paper CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning (Neur…
☆573 · Jun 2, 2026 · Updated last month
Speakn0w / PlotCraft-Benchmark
☆16 · Dec 10, 2025 · Updated 7 months ago
alibaba / SimCSE-with-CARDS
Source code for SIGIR 2022 paper.
☆16 · Apr 25, 2022 · Updated 4 years ago
openai / code-align-evals-data
☆28 · Jul 21, 2021 · Updated 5 years ago
Etamin / TSED
TSED with Flexible Parser
☆21 · Jan 22, 2026 · Updated 6 months ago
evalplus / evalplus
Rigorous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 2024
☆1,785 · Oct 2, 2025 · Updated 9 months ago
AlphaPav / mem-kk-logic
On Memorization of Large Language Models in Logical Reasoning
☆79 · Mar 29, 2025 · Updated last year
openai / lm-human-preferences
Code for the paper "Fine-Tuning Language Models from Human Preferences"
☆1,393 · Jul 25, 2023 · Updated 3 years ago
shrivastavadisha / repo_level_prompt_generation
☆127 · Apr 22, 2023 · Updated 3 years ago
bigcode-project / pii-lib
Code for PII detection and redaction in code datasets
☆16 · Jan 24, 2023 · Updated 3 years ago
abacaj / code-eval
Run evaluation on LLMs using the human-eval benchmark
☆431 · Sep 12, 2023 · Updated 2 years ago
HKUNLP / ProGen
[EMNLP-2022 Findings] Code for paper “ProGen: Progressive Zero-shot Dataset Generation via In-context Feedback”.
☆27 · Feb 4, 2023 · Updated 3 years ago
swj0419 / in-context-pretraining
☆57 · Apr 11, 2024 · Updated 2 years ago
bigcode-project / selfcodealign
[NeurIPS'24] SelfCodeAlign: Self-Alignment for Code Generation
☆323 · Feb 24, 2025 · Updated last year
Heidelberg-NLP / CCKG
Repository to create CCKGs from the paper "Similarity-weighted Construction of Contextualized Commonsense Knowledge Graphs for Knowledge-…
☆11 · May 23, 2025 · Updated last year
mahimanzum / FixEval
We introduce FixEval, a dataset for competitive programming bug fixing along with a comprehensive test suite and show the necessity of e…
☆26 · Aug 31, 2022 · Updated 3 years ago
CoderEval / CoderEval
A collection of practical code generation tasks and tests in open source projects. Complementary to HumanEval by OpenAI.
☆159 · Dec 25, 2024 · Updated last year
facebookresearch / dmae_st
Directed masked autoencoders
☆14 · Mar 25, 2026 · Updated 4 months ago
bigcode-project / octopack
🐙 OctoPack: Instruction Tuning Code Large Language Models
☆479 · Feb 5, 2025 · Updated last year
Michaelvll / llm-ie-benchmarks
A collection of reproducible inference engine benchmarks
☆38 · Apr 22, 2025 · Updated last year