Evaluation results of code generation LLMs
☆31Sep 1, 2023Updated 2 years ago
Alternatives and similar repositories for humaneval-results
Users who are interested in humaneval-results compare it to the repositories listed below
- Benchmark results from code generation with LLMs☆17Sep 1, 2023Updated 2 years ago
- ☆33Updated this week
- ☆10Apr 15, 2023Updated 2 years ago
- ☆15Nov 12, 2025Updated 3 months ago
- Code for PII detection and redaction in code datasets☆13Jan 24, 2023Updated 3 years ago
- ☆119Jul 17, 2024Updated last year
- A tool for generating random, syntactically-correct Python code. Designed for fuzzing and testing of tools that parse Python code.☆23Sep 22, 2023Updated 2 years ago
- Code and Data Repo for the CoNLL Paper -- Future Lens: Anticipating Subsequent Tokens from a Single Hidden State☆20Oct 24, 2025Updated 4 months ago
- We introduce FixEval, a dataset for competitive programming bug fixing along with a comprehensive test suite and show the necessity of e…☆26Aug 31, 2022Updated 3 years ago
- This is the repository for the paper Static Prediction of Runtime Errors by Learning to Execute Programs with External Resource Descripti…☆25Nov 18, 2022Updated 3 years ago
- We track and analyze the activity and performance of autonomous code agents in the wild☆50Dec 5, 2025Updated 3 months ago
- ☆26Jul 19, 2022Updated 3 years ago
- A multi-programming language benchmark for LLMs☆298Jan 28, 2026Updated last month
- [NAACL 2024] A Synthetic, Scalable and Systematic Evaluation Suite for Large Language Models☆33Jun 10, 2024Updated last year
- Big Data Analysis of Tinder done at Universitat Rovira i Virgili and Universitat Politècnica de Catalunya · BarcelonaTech☆13Jan 3, 2023Updated 3 years ago
- Code and Data for: Reading Between the Lines: Modeling User Behavior and Costs in AI-Assisted Programming☆33Feb 23, 2024Updated 2 years ago
- fine-tuning tutorial☆18Feb 20, 2026Updated 2 weeks ago
- User-friendly viewer for Parquet files☆10Updated this week
- ☆12Feb 18, 2024Updated 2 years ago
- The first OpenSource Mafia Bot!☆10Oct 5, 2023Updated 2 years ago
- Multiprocessing in python☆10Aug 20, 2021Updated 4 years ago
- A semidefinite programming solver for clustered low-rank SDPs☆14Updated this week
- Comparative Study and Implementation of Five Factor Model and Myers-Briggs Type Indicator Model☆11Sep 28, 2023Updated 2 years ago
- ☆11Jul 25, 2020Updated 5 years ago
- YouTube Assistant☆12May 15, 2023Updated 2 years ago
- Formalization of Arithmetization of Mathematics/Metamathematics☆13Mar 8, 2025Updated last year
- ☆10Jan 9, 2024Updated 2 years ago
- ☆13Jul 8, 2024Updated last year
- Show modal in react native apps without using Modal component.☆12Jan 12, 2021Updated 5 years ago
- DNH Werewolf Discord bot☆13Dec 19, 2024Updated last year
- An implementation of MSSRM method☆11Mar 23, 2023Updated 2 years ago
- Learning Copilot-Python with teacher Li Lulu (李鲁鲁). Co-evolving with large language models such as ChatGPT.☆10Jun 3, 2025Updated 9 months ago
- Complexity analysis in Lean☆10Feb 5, 2024Updated 2 years ago
- Dataset and codes for SEntFiN☆10May 31, 2023Updated 2 years ago
- ☆11Dec 6, 2023Updated 2 years ago
- ☆12Jul 4, 2024Updated last year
- LightGBM for handling label-imbalanced data with focal and weighted loss functions in binary and multiclass classification☆21Jan 29, 2026Updated last month
- Alfred🎩 plugin for 小鸡词典🐤. Cluck cluck cluck☆11Apr 19, 2023Updated 2 years ago
- Inspirational post IDs collected from Reddit using pushshift.io and RoBERTa☆10Jan 18, 2024Updated 2 years ago