jamesmurdza / humaneval-results
Evaluation results of code generation LLMs
☆31 Sep 1, 2023 Updated 2 years ago
Alternatives and similar repositories for humaneval-results
Users who are interested in humaneval-results are comparing it to the libraries listed below.
- Benchmark results from code generation with LLMs ☆17 Sep 1, 2023 Updated 2 years ago
- ☆33 Feb 2, 2026 Updated 2 weeks ago
- ☆10 Apr 15, 2023 Updated 2 years ago
- Incremental Python parser for constrained generation of code by LLMs. ☆18 Sep 18, 2024 Updated last year
- A tool for generating random, syntactically-correct Python code. Designed for fuzzing and testing of tools that parse Python code. ☆23 Sep 22, 2023 Updated 2 years ago
- We introduce FixEval, a dataset for competitive programming bug fixing along with a comprehensive test suite and show the necessity of e… ☆26 Aug 31, 2022 Updated 3 years ago
- Code and Data Repo for the CoNLL Paper -- Future Lens: Anticipating Subsequent Tokens from a Single Hidden State ☆20 Oct 24, 2025 Updated 3 months ago
- This is the repository for the paper Static Prediction of Runtime Errors by Learning to Execute Programs with External Resource Descripti… ☆25 Nov 18, 2022 Updated 3 years ago
- We track and analyze the activity and performance of autonomous code agents in the wild ☆49 Dec 5, 2025 Updated 2 months ago
- Big Data Analysis of Tinder done at Universitat Rovira i Virgili and Universitat Politècnica de Catalunya · BarcelonaTech ☆13 Jan 3, 2023 Updated 3 years ago
- [NAACL 2024] A Synthetic, Scalable and Systematic Evaluation Suite for Large Language Models ☆33 Jun 10, 2024 Updated last year
- A tool to paste Excel ranges to Reddit ☆11 Sep 20, 2025 Updated 4 months ago
- Code for paper "SrcMarker: Dual-Channel Source Code Watermarking via Scalable Code Transformations" (IEEE S&P 2024) ☆33 Aug 8, 2024 Updated last year
- Code and Data for: Reading Between the Lines: Modeling User Behavior and Costs in AI-Assisted Programming ☆33 Feb 23, 2024 Updated last year
- ☆11 Feb 18, 2024 Updated last year
- The first OpenSource Mafia Bot! ☆10 Oct 5, 2023 Updated 2 years ago
- Comparative Study and Implementation of Five Factor Model and Myers-Briggs Type Indicator Model ☆11 Sep 28, 2023 Updated 2 years ago
- User-friendly viewer for Parquet files ☆10 Jan 10, 2026 Updated last month
- Multiprocessing in python ☆10 Aug 20, 2021 Updated 4 years ago
- DOS Program Development ☆12 Nov 9, 2022 Updated 3 years ago
- fine-tuning tutorial ☆17 Dec 13, 2025 Updated 2 months ago
- Code and data for paper "A Semantic Invariant Robust Watermark for Large Language Models" accepted by ICLR 2024. ☆36 Nov 13, 2024 Updated last year
- ☆14 Dec 12, 2022 Updated 3 years ago
- Code for Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model ☆13 Feb 15, 2024 Updated 2 years ago
- YouTube Assistant ☆12 May 15, 2023 Updated 2 years ago
- Dataset and codes for SEntFiN ☆10 May 31, 2023 Updated 2 years ago
- ☆10 Oct 11, 2022 Updated 3 years ago
- Complexity analysis in Lean ☆10 Feb 5, 2024 Updated 2 years ago
- DNH Werewolf Discord bot ☆13 Dec 19, 2024 Updated last year
- ☆13 Jul 8, 2024 Updated last year
- Teacher Li Lulu's Copilot-Python study notes. Co-evolving with large language models such as ChatGPT. ☆10 Jun 3, 2025 Updated 8 months ago
- LightGBM for handling label-imbalanced data with focal and weighted loss functions in binary and multiclass classification ☆21 Jan 29, 2026 Updated 2 weeks ago
- ☆11 Oct 31, 2021 Updated 4 years ago
- Alfred 🎩 plugin for 小鸡词典 🐤. Cluck cluck cluck ☆11 Apr 19, 2023 Updated 2 years ago
- ☆11 Sep 15, 2025 Updated 5 months ago
- Inspirational post IDs collected from Reddit using pushshift.io and RoBERTa ☆10 Jan 18, 2024 Updated 2 years ago
- A record of useful Git repos ☆12 Jul 28, 2024 Updated last year
- An implementation of MSSRM method ☆11 Mar 23, 2023 Updated 2 years ago
- Redis distributed lock implementation for Python based on Pub/Sub messaging ☆11 Nov 15, 2025 Updated 3 months ago