YerbaPage / SWE-Debate
SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution
☆22 Updated last month
Alternatives and similar repositories for SWE-Debate
Users who are interested in SWE-Debate are comparing it to the repositories listed below.
- Reinforcement Learning for Repository-Level Code Completion ☆42 Updated last year
- ☆12 Updated 9 months ago
- Training and Benchmarking LLMs for Code Preference. ☆37 Updated last year
- SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution ☆101 Updated 2 months ago
- Must-read papers on Repository-level Code Generation & Issue Resolution 🔥 ☆223 Updated last week
- The repository for the paper "DebugBench: Evaluating Debugging Capability of Large Language Models". ☆85 Updated last year
- Baselines for all tasks from Long Code Arena benchmarks 🏟️ ☆38 Updated 8 months ago
- Official code implementation for the ACL 2025 paper: 'Dynamic Scaling of Unit Tests for Code Reward Modeling' ☆27 Updated 7 months ago
- An Evolving Code Generation Benchmark Aligned with Real-world Code Repositories ☆67 Updated last year
- A Lightweight Visual Reasoning Benchmark for Evaluating Large Multimodal Models through Complex Diagrams in Coding Tasks ☆13 Updated 9 months ago
- ☆40 Updated last month
- ☆68 Updated last year
- ☆54 Updated last year
- A comprehensive code domain benchmark review of LLM research. ☆175 Updated 3 months ago
- ☆15 Updated last year
- ☆30 Updated last year
- XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts ☆35 Updated last year
- SWE-Exp: Experience-Driven Software Issue Resolution ☆36 Updated 2 months ago
- Code and data release of the paper Enhancing LLM Complex Problem-Solving with Hybrid Thinking and Dynamic Workflows ☆14 Updated last year
- CodeRAG-Bench: Can Retrieval Augment Code Generation? ☆162 Updated last year
- Repo-Level Code generation papers ☆226 Updated this week
- Reproducing R1 for Code with Reliable Rewards ☆278 Updated 7 months ago
- StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback ☆73 Updated last year
- CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion (NeurIPS 2023) ☆164 Updated 4 months ago
- SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks ☆120 Updated last month
- Implementation and datasets for "Training Language Models to Generate Quality Code with Program Analysis Feedback" ☆35 Updated 5 months ago
- Benchmark ClassEval for class-level code generation. ☆145 Updated last year
- ☆44 Updated last month
- Official repository for the paper "COAST: Enhancing the Code Debugging Ability of LLMs through Communicative Agent Based Data Synthesis". ☆17 Updated 10 months ago
- InstructCoder: Instruction Tuning Large Language Models for Code Editing | Oral ACL-2024 srw ☆64 Updated last year