peng-weihan / SWE-QA-Bench
☆45Jan 21, 2026Updated 3 weeks ago
Alternatives and similar repositories for SWE-QA-Bench
Users who are interested in SWE-QA-Bench are comparing it to the repositories listed below
- SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution☆24Nov 11, 2025Updated 3 months ago
- Interface for GenAI-Arena [NeurIPS24]☆17Feb 27, 2024Updated last year
- SWE-Exp: Experience-Driven Software Issue Resolution☆35Oct 17, 2025Updated 4 months ago
- LongCodeZip: Compress Long Context for Code Language Models [ASE2025]☆140Feb 5, 2026Updated last week
- ☆47Oct 28, 2025Updated 3 months ago
- [NeurIPS 2024] Evaluation harness for SWT-Bench, a benchmark for evaluating LLM repository-level test-generation☆69Jan 15, 2026Updated last month
- Multi-Granularity LLM Debugger [ICSE2026]☆96Jul 6, 2025Updated 7 months ago
- Repository of IPBench☆19Jan 4, 2026Updated last month
- ☆11Jul 17, 2023Updated 2 years ago
- A framework for few-shot evaluation of autoregressive language models.☆12Jul 14, 2025Updated 7 months ago
- [NeurIPS 2025] Official code for "Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms"☆23Oct 23, 2025Updated 3 months ago
- DOMAINEVAL is an auto-constructed benchmark for multi-domain code generation that consists of 2k+ subjects (i.e., description, reference …☆14Dec 12, 2024Updated last year
- Knowledge base of the vision group of Xi'an Jiaotong University's 笃行 (Duxing) team for the RoboMaster competition, built on Obsidian☆15Nov 5, 2023Updated 2 years ago
- GBM implementation on Legate☆14Jan 28, 2026Updated 2 weeks ago
- A Swedish Natural Language Understanding Benchmark☆11Dec 12, 2025Updated 2 months ago
- ☆12Jan 11, 2026Updated last month
- [CVPR2024] Learning from Synthetic Human Group Activities☆14Feb 24, 2025Updated 11 months ago
- Reinforced Multi-LLM Agents training☆70Jan 18, 2026Updated 3 weeks ago
- [NeurIPS 2024] A comprehensive benchmark for evaluating critique ability of LLMs☆49Nov 29, 2024Updated last year
- This is the implementation of the 4th place solution (yu4u's part) for RSNA 2024 Lumbar Spine Degenerative Classification at Kaggle.☆10Oct 11, 2024Updated last year
- Code for our project CROWN (Conversational Passage Ranking by Reasoning over Word Networks)☆10Jan 11, 2024Updated 2 years ago
- Encoder-decoders for translating different chemical formats.☆18Sep 17, 2025Updated 5 months ago
- ☆12Mar 5, 2025Updated 11 months ago
- ☆13Mar 2, 2025Updated 11 months ago
- Align, a general text alignment function☆15Dec 7, 2023Updated 2 years ago
- The official Python library for Openlayer, the Continuous Model Improvement Platform for AI. 📈☆16Feb 10, 2026Updated last week
- DependEval: a hierarchical benchmark for evaluating LLMs on repository-level code understanding across 8 programming languages.☆15Jul 28, 2025Updated 6 months ago
- JAX Scalify: end-to-end scaled arithmetics☆18Oct 30, 2024Updated last year
- ☆11Oct 15, 2022Updated 3 years ago
- CodeQUEST is a generalizable framework which leverages LLMs to iteratively evaluate and enhance code quality across multiple dimensions f…☆16Updated this week
- opentqa is an open framework for textbook question answering, which includes xtqa, mcan, cmr, mfb, and mutan.☆11Mar 27, 2021Updated 4 years ago
- ☆11Nov 5, 2024Updated last year
- Transformer + GAT for RNA chemical reactivity prediction | Stanford Ribonanza☆11Jan 28, 2026Updated 2 weeks ago
- Mixture of Experts (MoE) techniques for enhancing LLM performance through expert-driven prompt mapping and adapter combinations.☆12Feb 11, 2024Updated 2 years ago
- An LLM inference engine, written in C++☆18Feb 5, 2026Updated last week
- Long Context Research☆26Jan 26, 2026Updated 3 weeks ago
- ☆11Jan 3, 2024Updated 2 years ago
- ☆15May 26, 2025Updated 8 months ago
- Code for the paper "FinRLlama: A Solution to LLM-Engineered Signals Challenge at FinRL Contest 2024"☆13Feb 14, 2025Updated last year