microsoft / coderec_programming_states
Code and Data for "Reading Between the Lines: Modeling User Behavior and Costs in AI-Assisted Programming"
☆32 · Updated last year
Alternatives and similar repositories for coderec_programming_states
Users interested in coderec_programming_states are comparing it to the repositories listed below.
- ☆124 · Updated 2 years ago
- CodeBERTScore: an automatic metric for code generation, based on BERTScore ☆196 · Updated last year
- [EACL 2024] ICE-Score: Instructing Large Language Models to Evaluate Code ☆76 · Updated last year
- ✨ RepoBench: Benchmarking Repository-Level Code Auto-Completion Systems (ICLR 2024) ☆169 · Updated 11 months ago
- CodeMind is a generic framework for evaluating inductive code reasoning of LLMs. It is equipped with a static analysis component that ena… ☆39 · Updated 3 months ago
- Code release for "ReCode: Robustness Evaluation of Code Generation Models" ☆52 · Updated last year
- ☆110 · Updated last year
- Can It Edit? Evaluating the Ability of Large Language Models to Follow Code Editing Instructions ☆47 · Updated last year
- EvoEval: Evolving Coding Benchmarks via LLM ☆76 · Updated last year
- Source Code Data Augmentation for Deep Learning: A Survey ☆67 · Updated last year
- TDD-Bench-Verified is a new benchmark for generating test cases for test-driven development (TDD) ☆21 · Updated last month
- ☆67 · Updated last year
- Repo for the paper "CodeGen4Libs: A Two-Stage Approach for Library-Oriented Code Generation" ☆14 · Updated last year
- Code for "StructCoder: Structure-Aware Transformer for Code Generation" ☆76 · Updated last year
- CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion (NeurIPS 2023) ☆153 · Updated last year
- ☆22 · Updated last month
- [FORGE 2025] Graph-based method for end-to-end code completion with context awareness on repository ☆64 · Updated 11 months ago
- APIBench is a benchmark for evaluating the performance of API recommendation approaches, released in the paper "Revisiting, Benchmarking a… ☆60 · Updated 2 years ago
- [NeurIPS 2024] Evaluation harness for SWT-Bench, a benchmark for evaluating LLM repository-level test generation ☆52 · Updated last week
- Official code for the paper "CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules" ☆45 · Updated 6 months ago
- CRUXEval: Code Reasoning, Understanding, and Execution Evaluation ☆152 · Updated 9 months ago
- CodeSage: Code Representation Learning at Scale (ICLR 2024) ☆111 · Updated 9 months ago
- A collection of recent papers, benchmarks, and datasets in the AI4Code domain ☆58 · Updated last year
- ☆17 · Updated last year
- ☆28 · Updated 2 years ago
- Data and code for "DocPrompting: Generating Code by Retrieving the Docs" (ICLR 2023) ☆248 · Updated last year
- Training language models to make programs faster ☆91 · Updated last year
- [NeurIPS 2023 D&B] Code repository for the InterCode benchmark (https://arxiv.org/abs/2306.14898) ☆223 · Updated last year
- Open-sourced predictions, execution logs, trajectories, and results from model inference and evaluation runs on the SWE-bench task ☆201 · Updated last month
- ☆20 · Updated 2 years ago