[LREC-COLING'24] HumanEval-XL: A Multilingual Code Generation Benchmark for Cross-lingual Natural Language Generalization
☆41 · Mar 7, 2025 · Updated last year
Alternatives and similar repositories for HumanEval-XL
Users interested in HumanEval-XL are comparing it to the libraries listed below.
- [NeurIPS 2024] Self-Optimization Improves the Efficiency of Code Generation ☆14 · May 10, 2025 · Updated 10 months ago
- ☆18 · Aug 11, 2022 · Updated 3 years ago
- For our ICSE23 paper "Impact of Code Language Models on Automated Program Repair" by Nan Jiang, Kevin Liu, Thibaud Lutellier, and Lin Tan ☆63 · Oct 16, 2024 · Updated last year
- Generating Adversarial Examples for Holding Robustness of Source Code Processing Models ☆15 · Dec 2, 2021 · Updated 4 years ago
- This repo is the artifact of FUEL ☆13 · Dec 2, 2025 · Updated 3 months ago
- Official implementation for "MM-Eval: A Multilingual Meta-Evaluation Benchmark for LLM-as-a-Judge and Reward Models" ☆18 · Oct 26, 2024 · Updated last year
- DocChecker: Bootstrapping Code-Text Pretrained Language Model to Detect Inconsistency Between Code and Comment ☆15 · Jan 23, 2024 · Updated 2 years ago
- [COLING25] CodeJudge Eval: Can Large Language Models be Good Judges in Code Understanding? ☆12 · Dec 3, 2024 · Updated last year
- RACE is a multi-dimensional benchmark for code generation that focuses on Readability, mAintainability, Correctness, and Efficiency. ☆12 · Oct 12, 2024 · Updated last year
- Replication Package for "Natural Attack for Pre-trained Models of Code", ICSE 2022 ☆51 · Nov 7, 2025 · Updated 4 months ago
- Evaluate state-of-the-art sparse embedding models on the LIMIT dataset (`limit-small` and `limit`) from Google's paper `On the Theoretica…` ☆15 · Sep 4, 2025 · Updated 6 months ago
- ☆14 · Mar 3, 2022 · Updated 4 years ago
- Contests based Dataset for Code Generation ☆13 · Dec 11, 2022 · Updated 3 years ago
- Dataflow-guided retrieval augmentation for repository-level code completion, ACL 2024 (main)