BigComputer-Project/SWE-Arena

SWE Arena

☆36

Alternatives and similar repositories for SWE-Arena

Users that are interested in SWE-Arena are comparing it to the libraries listed below.

bigcode-project / bigcodearena
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution
☆61 · Oct 13, 2025 · Updated 9 months ago
bigcode-project / bigcodebench-annotation
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
☆26 · Aug 8, 2024 · Updated last year
JoshuaPurtell / SmallBench
Small, simple agent task environments for training and evaluation
☆20 · Nov 1, 2024 · Updated last year
arjunguha / BigCodeBench-X
A benchmark of programming tasks for LLMs that supports almost any programming language.
☆13 · Jun 30, 2025 · Updated last year
JunlinHan / CropMix
Code of CropMix: Sampling a Rich Input Distribution via Multi-Scale Cropping
☆17 · Oct 8, 2022 · Updated 3 years ago
chagmgang / pysc2_rl
☆10 · Jul 14, 2018 · Updated 8 years ago
Narsil / whispering
☆20 · Oct 5, 2025 · Updated 9 months ago
bigcode-project / bigcodebench
[ICLR'25] BigCodeBench: Benchmarking Code Generation Towards AGI
☆515 · Jan 3, 2026 · Updated 6 months ago
CodeLLM-Research / CodeJudge-Eval
[COLING25] CodeJudge Eval: Can Large Language Models be Good Judges in Code Understanding?
☆12 · Dec 3, 2024 · Updated last year
phonism / CP-Zero
Based on the R1-Zero method, using rule-based rewards and GRPO on the Code Contests dataset.
☆18 · Apr 22, 2025 · Updated last year
maitrix-org / dynamic-alignment-optimization
[EMNLP'24 (Main)] DRPO (Dynamic Rewarding with Prompt Optimization) is a tuning-free approach for self-alignment. DRPO leverages a search-…
☆24 · Nov 17, 2024 · Updated last year
Hambaobao / SWE-Flow
SWE-Flow: Synthesizing Software Engineering Data in a Test-Driven Manner
☆39 · Jun 29, 2025 · Updated last year
jtonglet / Numerical-Hybrid-QA-Literature
A list of Numerical Multimodal reasoning papers and their implementation
☆11 · May 13, 2024 · Updated 2 years ago
robintyh1 / neurips2021-meta-gradient-offpolicy-evaluation
Code for Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation @ NeurIPS 2021
☆13 · Nov 3, 2021 · Updated 4 years ago
shaman-ai / llambdao
Large Language Agents Modulating Behaviour in Decentralized Autonomous Organizations
☆24 · Jul 14, 2023 · Updated 3 years ago
keeganhines / snowman
☆12 · Jun 24, 2017 · Updated 9 years ago
JetBrains-Research / codegen-metrics
Replication package for evaluation of code generation metrics
☆17 · Nov 24, 2025 · Updated 7 months ago
camel-ai / camel_chat
💬 Minimalistic repository to reproduce and serve CAMEL models.
☆24 · Jun 26, 2023 · Updated 3 years ago
AxiomMath / dead-ends
☆15 · Mar 25, 2026 · Updated 3 months ago
qingsongedu / ICML2022-FEDformer
Source code of ICML'22 paper: FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting
☆10 · Jun 10, 2022 · Updated 4 years ago
anikethjr / promoter_models
Code to build models that effectively predict promoter-driven gene expression
☆12 · May 15, 2025 · Updated last year
flowvqa / flowvqa
The official dataset of the flowvqa project.
☆24 · Mar 26, 2024 · Updated 2 years ago
zhongwanjun / CARP
Code for the table-based open domain question answering project, with paper title: "Reasoning over Hybrid Chain for Table-and-Text Open D…
☆12 · Sep 16, 2022 · Updated 3 years ago
google-deepmind / lm_act
LMAct: A Benchmark for In-Context Imitation Learning with Long Multimodal Demonstrations
☆30 · May 21, 2025 · Updated last year
chhayac / awesome-DGA
Domain Generation Algorithms research papers, datasets and code
☆15 · May 17, 2020 · Updated 6 years ago
terryyz / ice-score
[EACL 2024] ICE-Score: Instructing Large Language Models to Evaluate Code
☆79 · Jun 16, 2024 · Updated 2 years ago
BlinkDL / Agen
Agen is a minimalist language for agent loops and state machines.
☆49 · Mar 30, 2026 · Updated 3 months ago
DSA-MLOPS / main
☆18 · Apr 19, 2023 · Updated 3 years ago
mrbende / what-lives
Source code for "What Lives? A meta-analysis of diverse opinions on the definition of life"
☆19 · Jul 6, 2026 · Updated 2 weeks ago
sail-sg / hloenv
An environment based on XLA for deep learning compiler optimization research.
☆24 · Mar 7, 2023 · Updated 3 years ago
yilunzhao / RobuT
Data and code for ACL 2023 paper "RobuT: A Systematic Study of Table QA Robustness Against Human-Annotated Adversarial Perturbations"
☆15 · Feb 8, 2024 · Updated 2 years ago
Xnhyacinth / NesyCD
[AAAI 2025] Neural-Symbolic Collaborative Distillation: Advancing Small Language Models for Complex Reasoning Tasks
☆12 · Jun 19, 2025 · Updated last year
OpenHands / trajectory-visualizer
☆47 · Jun 11, 2026 · Updated last month
microsoft / TestExplora
Official code for the paper "TestExplora: Benchmarking LLMs for Proactive Bug Discovery via Repository-Level Test Generation"
☆27 · Mar 26, 2026 · Updated 3 months ago
lezhang7 / MOQAGPT
[EMNLP'2023 Findings] MoqaGPT, for zero-shot multimodal question answering with LLMs
☆13 · Dec 28, 2024 · Updated last year
OSU-NLP-Group / Pangu
☆12 · Jul 10, 2023 · Updated 3 years ago
juyongjiang / VFedPCA-VFedAKPCA
Source Code & Datasets for "Vertical Federated Principal Component Analysis and Its Kernel Extension on Feature-wise Distributed Data"
☆12 · May 20, 2022 · Updated 4 years ago
guxd / VariationalSeq2Seq
A PyTorch implementation of "Latent Variable Dialogue Models and their Diversity"
☆18 · Nov 30, 2017 · Updated 8 years ago
serp-ai / Parameter-Efficient-MoE
Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks
☆31 · May 22, 2024 · Updated 2 years ago