bigcode-project/bigcodearena

bigcode-project / bigcodearena

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

☆61

Alternatives and similar repositories for bigcodearena

Users who are interested in bigcodearena are comparing it to the repositories listed below.

BigComputer-Project / SWE-Arena
SWE Arena
☆36 · Jul 6, 2025 · Updated last year
bigcode-project / bigcodebench-annotation
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
☆26 · Aug 8, 2024 · Updated last year
inclusionAI / GroveMoE
☆24 · Aug 20, 2025 · Updated 11 months ago
SWE-Perf / SWE-Perf
☆51 · Oct 28, 2025 · Updated 8 months ago
3B-Group / ConvRe
🤖ConvRe🤯: An Investigation of LLMs’ Inefficacy in Understanding Converse Relations (EMNLP 2023)
☆24 · Oct 10, 2023 · Updated 2 years ago
GavinZhengOI / LiveCodeBench-Pro
☆176 · Dec 13, 2025 · Updated 7 months ago
ise-uiuc / blazedit
Making code editing up to 7.7x faster using multi-layer speculation
☆23 · Feb 20, 2025 · Updated last year
AxiomMath / dead-ends
☆15 · Mar 25, 2026 · Updated 3 months ago
inclusionAI / M2-Reasoning
M2-Reasoning: Empowering MLLMs with Unified General and Spatial Reasoning
☆47 · Jul 17, 2025 · Updated last year
bigcode-project / bigcodebench
[ICLR'25] BigCodeBench: Benchmarking Code Generation Towards AGI
☆515 · Jan 3, 2026 · Updated 6 months ago
richardodliu / OpenCodeEval
☆52 · Mar 9, 2026 · Updated 4 months ago
inclusionAI / Ring-V2
Ring-V2 is a reasoning MoE LLM provided and open-sourced by InclusionAI.
☆98 · Oct 23, 2025 · Updated 8 months ago
timxzz / VML_Examples
Examples of Verbalized Machine Learning (VML)
☆16 · Mar 16, 2025 · Updated last year
InternLM / JanusCoder
[ICLR 2026] JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence
☆78 · May 9, 2026 · Updated 2 months ago
lfy79001 / S3Eval
[NAACL 2024] A Synthetic, Scalable and Systematic Evaluation Suite for Large Language Models
☆33 · Jun 10, 2024 · Updated 2 years ago
MoonshotAI / WorldVQA
☆119 · Feb 4, 2026 · Updated 5 months ago
RUCAIBox / JiuZhang3.0
The code and data for the paper JiuZhang3.0
☆49 · May 26, 2024 · Updated 2 years ago
sail-sg / tty-use
☆15 · Oct 13, 2025 · Updated 9 months ago
bigcode-project / astraios
Astraios: Parameter-Efficient Instruction Tuning Code Language Models
☆63 · Apr 10, 2024 · Updated 2 years ago
R2E-Gym / R2E-Gym
[COLM 2025] Official repository for R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents
☆307 · Jul 13, 2025 · Updated last year
multimodal-art-projection / CodeCriticBench
☆16 · Nov 1, 2025 · Updated 8 months ago
THU-KEG / DeepPrune
🌿 DeepPrune: Parallel Scaling without Inter-trace Redundancy
☆21 · Apr 20, 2026 · Updated 3 months ago
THU-KEG / COPEN
The official code and dataset for EMNLP 2022 paper "COPEN: Probing Conceptual Knowledge in Pre-trained Language Models".
☆21 · Mar 9, 2023 · Updated 3 years ago
facebookresearch / swe-rl
[NeurIPS'25] Official codebase for "SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution"
☆712 · Mar 16, 2025 · Updated last year
multimodal-art-projection / CriticLean
☆50 · Aug 5, 2025 · Updated 11 months ago
THU-KEG / LongWriter-V
[ACM MM25] LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models
☆24 · Mar 29, 2025 · Updated last year
ganler / code-r1
Reproducing R1 for Code with Reliable Rewards
☆313 · May 5, 2025 · Updated last year
mnluzimu / WebGen-Bench
☆53 · Jul 10, 2026 · Updated last week
SWE-EVO / SWE-EVO
☆53 · May 3, 2026 · Updated 2 months ago
CUHK-ARISE / CodeCrash
[NeurIPS 2025] CodeCrash: Exposing LLM Fragility to Misleading Natural Language in Code Reasoning
☆18 · Jan 24, 2026 · Updated 5 months ago
amazon-science / cceval
CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion (NeurIPS 2023)
☆181 · Aug 15, 2025 · Updated 11 months ago
SWE-bench / swe-bench.github.io
Landing page + leaderboard for SWE-Bench benchmark
☆15 · Mar 29, 2026 · Updated 3 months ago
lupantech / ineqmath
Solving Inequality Proofs with Large Language Models.
☆61 · Dec 15, 2025 · Updated 7 months ago
godmoves / reinforcement_learning_collections
A collection of deep reinforcement learning algorithm implementations
☆11 · Jan 9, 2020 · Updated 6 years ago
ltzheng / SimpleTIR
[ICLR 2026] End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
☆401 · Mar 30, 2026 · Updated 3 months ago
suu990901 / KlearReasoner
Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization
☆82 · Dec 25, 2025 · Updated 6 months ago
DSA-MLOPS / DSAA6000I
☆28 · Jun 9, 2024 · Updated 2 years ago
Websail-NU / CODAH
Repository for the CODAH dataset
☆22 · Oct 29, 2022 · Updated 3 years ago
AxiomMath / lattice-triangle
Lean formalizations for the paper "On the paucity of lattice triangles"
☆18 · Mar 26, 2026 · Updated 3 months ago