dashends / CodeSyntax
Code and dataset for EMNLP 2022 Findings paper "Benchmarking Language Models for Code Syntax Understanding"
☆14 · Updated 2 years ago
Alternatives and similar repositories for CodeSyntax
Users who are interested in CodeSyntax are comparing it to the repositories listed below.
- Releasing code for "ReCode: Robustness Evaluation of Code Generation Models" ☆52 · Updated last year
- ☆28 · Updated this week
- Training and Benchmarking LLMs for Code Preference. ☆33 · Updated 8 months ago
- [EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation ☆48 · Updated last year
- StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback ☆67 · Updated 10 months ago
- Code for the TMLR 2023 paper "PPOCoder: Execution-based Code Generation using Deep Reinforcement Learning" ☆114 · Updated last year
- Code for "Natural Language to Code Translation with Execution" ☆41 · Updated 2 years ago
- XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts ☆33 · Updated last year
- InstructCoder: Instruction Tuning Large Language Models for Code Editing | Oral ACL-2024 srw ☆61 · Updated 9 months ago
- ☆57 · Updated last year
- [ICLR 2021] "Generating Adversarial Computer Programs using Optimized Obfuscations" by Shashank Srikant, Sijia Liu, Tamara Mitrovska, Shi… ☆30 · Updated 3 years ago
- Code for paper "LEVER: Learning to Verify Language-to-Code Generation with Execution" (ICML'23) ☆89 · Updated 2 years ago
- Official implementation of Privacy Implications of Retrieval-Based Language Models (EMNLP 2023). https://arxiv.org/abs/2305.14888 ☆36 · Updated last year
- This repository includes a benchmark and code for the paper "Evaluating LLMs at Detecting Errors in LLM Responses". ☆30 · Updated 11 months ago
- ☆14 · Updated last year
- ☆119 · Updated last year
- ☆78 · Updated 3 months ago
- Replication package for ISSTA2023 paper - Towards Efficient Fine-tuning of Pre-trained Code Models: An Experimental Study and Beyond ☆21 · Updated 2 years ago
- CAT-probing: A Metric-based Approach to Interpret How Pre-trained Models for Programming Language Attend Code Structure, EMNLP 2022 ☆13 · Updated 2 years ago
- [NeurIPS 2024] Evaluation harness for SWT-Bench, a benchmark for evaluating LLM repository-level test-generation ☆51 · Updated last month
- CodeUltraFeedback: aligning large language models to coding preferences ☆71 · Updated last year
- CRUXEval: Code Reasoning, Understanding, and Execution Evaluation ☆149 · Updated 9 months ago
- We have released the code and demo program required for LLM with self-verification ☆60 · Updated last year
- A Lightweight Visual Reasoning Benchmark for Evaluating Large Multimodal Models through Complex Diagrams in Coding Tasks ☆12 · Updated 4 months ago
- ☆44 · Updated 10 months ago
- ☆45 · Updated 3 weeks ago
- ☆27 · Updated 6 months ago
- We introduce FixEval, a dataset for competitive programming bug fixing along with a comprehensive test suite and show the necessity of e… ☆23 · Updated 2 years ago
- ☆110 · Updated last year
- xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval ☆84 · Updated 10 months ago