NEUIR / COAST
Official repository for the paper "COAST: Enhancing the Code Debugging Ability of LLMs through Communicative Agent Based Data Synthesis".
☆17Updated 7 months ago
Alternatives and similar repositories for COAST
Users who are interested in COAST are comparing it to the libraries listed below.
- An Evolving Code Generation Benchmark Aligned with Real-world Code Repositories☆63Updated last year
- ☆31Updated 4 months ago
- [NeurIPS 2024] Can Language Models Learn to Skip Steps?☆20Updated 8 months ago
- [EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"☆137Updated last year
- Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs (ACL 2024)☆71Updated 4 months ago
- this is an implementation for the paper Improve Mathematical Reasoning in Language Models by Automated Process Supervision from google de…☆41Updated 2 months ago
- Repo-Level Code generation papers☆211Updated 2 months ago
- [ICLR 2025] Language Imbalance Driven Rewarding for Multilingual Self-improving☆21Updated last month
- [ICLR 2025] This is the code repo for our ICLR’25 paper "RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rew…☆46Updated 7 months ago
- The repository for the paper "DebugBench: Evaluating Debugging Capability of Large Language Models".☆81Updated last year
- ☆38Updated last month
- ☆20Updated 9 months ago
- Must-read papers on Repository-level Code Generation & Issue Resolution 🔥☆163Updated last week
- ☆24Updated 2 years ago
- ☆12Updated last month
- Large Language Models (LLMs) of Code☆18Updated 2 years ago
- [COLM'25] Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?☆34Updated 3 months ago
- Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision".☆56Updated 9 months ago
- Code and data for "ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM" (NeurIPS 2024 Track Datasets and…☆50Updated 4 months ago
- Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains…☆246Updated last month
- The official GitHub repository of the paper "Recent advances in large language model benchmarks against data contamination: From static t…☆45Updated 2 weeks ago
- codes for "Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models"☆10Updated 7 months ago
- [EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing".☆82Updated 8 months ago
- [ACL25] FEA-Bench: A Benchmark for Evaluating Repository-Level Code Generation for Feature Implementation☆26Updated last month
- ☆21Updated last year
- LeetCode Training and Evaluation Dataset☆34Updated 5 months ago
- Official code implementation for the ACL 2025 paper: 'CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis'☆30Updated 4 months ago
- Source code of our paper MIND, ACL 2024 Long Paper☆50Updated last year
- The code and data of DPA-RAG, accepted by WWW 2025 main conference.☆62Updated 8 months ago
- [EMNLP 2024 (Oral)] Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA☆139Updated 10 months ago