amazon-science / cceval

CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion (NeurIPS 2023)

☆159

Alternatives and similar repositories for cceval

Users who are interested in cceval are comparing it to the repositories listed below.

Leolty / repobench
✨ RepoBench: Benchmarking Repository-Level Code Auto-Completion Systems - ICLR 2024
☆174 Updated last year
facebookresearch / cruxeval
CRUXEval: Code Reasoning, Understanding, and Execution Evaluation
☆154 Updated last year
ntunlp / ExecEval
A distributed, extensible, secure solution for evaluating machine generated code with unit tests in multiple programming languages.
☆56 Updated last year
xlang-ai / DS-1000
[ICML 2023] Data and code release for the paper "DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation".
☆256 Updated 11 months ago
ntunlp / xCodeEval
xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval
☆86 Updated last year
seketeam / EvoCodeBench
An Evolving Code Generation Benchmark Aligned with Real-world Code Repositories
☆63 Updated last year
thunlp / DebugBench
The repository for the paper "DebugBench: Evaluating Debugging Capability of Large Language Models".
☆83 Updated last year
nuprl / MultiPL-E
A multi-programming language benchmark for LLMs
☆278 Updated 2 months ago
amazon-science / mxeval
☆111 Updated last year
shrivastavadisha / repo_level_prompt_generation
☆126 Updated 2 years ago
CoderEval / CoderEval
A collection of practical code generation tasks and tests in open source projects. Complementary to HumanEval by OpenAI.
☆152 Updated 9 months ago
FudanSELab / ClassEval
Benchmark ClassEval for class-level code generation.
☆145 Updated 11 months ago
multi-swe-bench / multi-swe-bench
Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving
☆262 Updated this week
floatai / HumanEval-XL
[LREC-COLING'24] HumanEval-XL: A Multilingual Code Generation Benchmark for Cross-lingual Natural Language Generalization
☆38 Updated 7 months ago
reddy-lab-code-research / PPOCoder
Code for the TMLR 2023 paper "PPOCoder: Execution-based Code Generation using Deep Reinforcement Learning"
☆117 Updated last year
Zyq-scut / RLTF
Accepted by Transactions on Machine Learning Research (TMLR)
☆132 Updated last year
microsoft / ReACC
Source code for the paper "ReACC: A Retrieval-Augmented Code Completion Framework"
☆63 Updated 3 years ago
reddy-lab-code-research / XLCoST
Code and data for XLCoST: A Benchmark Dataset for Cross-lingual Code Intelligence
☆82 Updated 9 months ago
qishenghu / InstructCoder
InstructCoder: Instruction Tuning Large Language Models for Code Editing | Oral ACL-2024 srw
☆62 Updated last year
allanj / repo-level-codegen-papers
Repo-Level Code generation papers
☆214 Updated 3 months ago
evo-eval / evoeval
EvoEval: Evolving Coding Benchmarks via LLM
☆76 Updated last year
NL2Code / NL2Code.github.io
Large Language Models Meet NL2Code: A Survey
☆35 Updated 11 months ago
evalplus / repoqa
RepoQA: Evaluating Long-Context Code Understanding
☆119 Updated 11 months ago
bigcode-project / the-stack-v2
Code for the curation of The Stack v2 and StarCoder2 training data
☆117 Updated last year
YerbaPage / Awesome-Repo-Level-Code-Generation
Must-read papers on Repository-level Code Generation & Issue Resolution 🔥
☆186 Updated this week
DeepSoftwareAnalytics / RLCoder
Reinforcement Learning for Repository-Level Code Completion
☆40 Updated last year
r2e-project / r2e
[ICML '24] R2E: Turn any GitHub Repository into a Programming Agent Environment
☆133 Updated 6 months ago
ozyyshr / RepoGraph
Enhancing AI Software Engineering with Repository-level Code Graph
☆216 Updated 6 months ago
open-compass / DevEval
A Comprehensive Benchmark for Software Development.
☆115 Updated last year
code-rag-bench / code-rag-bench
CodeRAG-Bench: Can Retrieval Augment Code Generation?
☆156 Updated 11 months ago