qishenghu / InstructCoder

InstructCoder: Instruction Tuning Large Language Models for Code Editing | Oral ACL-2024 srw

☆62

Alternatives and similar repositories for InstructCoder

Users interested in InstructCoder are comparing it to the repositories listed below.

CodeEditorBench / CodeEditorBench
☆49 Updated last year
crux-eval / eval-arena
☆28 Updated 2 weeks ago
bigcode-project / astraios
Astraios: Parameter-Efficient Instruction Tuning Code Language Models
☆59 Updated last year
facebookresearch / cruxeval
CRUXEval: Code Reasoning, Understanding, and Execution Evaluation
☆151 Updated 9 months ago
WHGTyen / BIG-Bench-Mistake
A dataset of LLM-generated chain-of-thought steps annotated with mistake location.
☆81 Updated 11 months ago
ntunlp / ExecEval
A distributed, extensible, secure solution for evaluating machine generated code with unit tests in multiple programming languages.
☆56 Updated 9 months ago
evalplus / repoqa
RepoQA: Evaluating Long-Context Code Understanding
☆113 Updated 9 months ago
zorazrw / odex
[EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation
☆48 Updated last year
Ablustrund / APPS_Plus
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback
☆67 Updated 11 months ago
amazon-science / llm-code-preference
Training and Benchmarking LLMs for Code Preference.
☆34 Updated 8 months ago
SparksofAGI / MHPP
☆32 Updated last month
rmshin / llm-mcts
☆41 Updated last year
princeton-nlp / LLMBar
[ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following
☆127 Updated last year
OSU-NLP-Group / llm-planning-eval
[ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"
☆54 Updated last year
shunzh / Code-AI-Tree-Search
☆119 Updated last year
microsoft / SWE-bench-Live
🚀 SWE-bench Goes Live!
☆103 Updated last week
R2E-Gym / R2E-Gym
Official repository for R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents
☆136 Updated 2 weeks ago
ntunlp / xCodeEval
xCodeEval: A Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval
☆86 Updated 10 months ago
GAIR-NLP / OlympicArena
[NeurIPS 2024] OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI
☆102 Updated 4 months ago
niansong1996 / lever
Code for paper "LEVER: Learning to Verify Language-to-Code Generation with Execution" (ICML'23)
☆89 Updated 2 years ago
martin-wey / CodeUltraFeedback
CodeUltraFeedback: aligning large language models to coding preferences
☆71 Updated last year
SalesforceAIResearch / swecomm
☆27 Updated 6 months ago
xlang-ai / EVOR
☆67 Updated 7 months ago
Zyq-scut / RLTF
Accepted by Transactions on Machine Learning Research (TMLR)
☆130 Updated 9 months ago
Leolty / repobench
✨ RepoBench: Benchmarking Repository-Level Code Auto-Completion Systems - ICLR 2024
☆168 Updated 11 months ago
open-compass / DevEval
A Comprehensive Benchmark for Software Development.
☆111 Updated last year
hkust-nlp / llm-compression-intelligence
Official GitHub repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024]
☆139 Updated 10 months ago
nyu-mll / ILF-for-code-generation
☆78 Updated 4 months ago
hughbzhang / o1_inference_scaling_laws
Replicating O1 inference-time scaling laws
☆89 Updated 8 months ago
THUDM / NaturalCodeBench
NaturalCodeBench (Findings of ACL 2024)
☆68 Updated 9 months ago