CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings
☆75Feb 3, 2025Updated last year
Alternatives and similar repositories for CodeElo
Users that are interested in CodeElo are comparing it to the libraries listed below.
- Archer2.0 evolves from its predecessor by introducing ASPO, which overcomes fundamental PPO-Clip limitations to prevent premature converg…☆31Oct 10, 2025Updated 8 months ago
- ☆12Feb 11, 2026Updated 4 months ago
- Official repository for the paper "LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code"☆894Jul 16, 2025Updated 11 months ago
- Code and dataset for Polyglot Prompting: Multilingual Multitask Prompt Training.☆18Dec 7, 2022Updated 3 years ago
- ARI (Abstract Reasoning Induction) is an innovative framework designed to enhance the temporal reasoning capabilities of Large Language M…☆13Dec 29, 2024Updated last year
- ☆15Jul 5, 2024Updated last year
- The evaluation code for MultiIF multi-turn and multi-lingual instruction following☆63Oct 29, 2024Updated last year
- Temporal Knowledge Graph Question Answering via Subgraph Reasoning☆16Mar 23, 2025Updated last year
- ☆24Oct 10, 2025Updated 8 months ago
- ☆72Oct 23, 2025Updated 8 months ago
- SIFT: Grounding LLM Reasoning in Contexts via Stickers☆57Mar 6, 2025Updated last year
- A tool to build a graph from a codebase☆14Feb 19, 2025Updated last year
- Evaluation of LLMs on latest math competitions☆268Jun 23, 2026Updated last week
- MULTITQ is a large-scale dataset featuring ample relevant facts and multiple temporal granularities.☆26Apr 28, 2026Updated 2 months ago
- ☆11Aug 10, 2021Updated 4 years ago
- A stateless password management solution☆10Sep 11, 2018Updated 7 years ago
- [AAAI 2025] Augmenting Math Word Problems via Iterative Question Composing (https://arxiv.org/abs/2401.09003)☆23Oct 2, 2025Updated 8 months ago
- Competitive Programming Code Template☆10Nov 6, 2022Updated 3 years ago
- A lambda calculus parser, evaluator and repl☆11Oct 30, 2021Updated 4 years ago
- Reproducing R1 for Code with Reliable Rewards☆313May 5, 2025Updated last year
- A scrapy crawler that crawls problems and its best solutions on codeforces.com☆12Feb 25, 2016Updated 10 years ago
- ☆241Feb 28, 2026Updated 4 months ago
- Contains the model patches and the eval logs from the passing swe-bench-lite run.☆10Jun 28, 2024Updated 2 years ago
- ☆62Apr 2, 2025Updated last year
- ☆43Mar 26, 2025Updated last year
- An esoteric programming language with just two data types: null and tape☆11Jan 31, 2024Updated 2 years ago
- Data mapping framework for rust stuff☆54Mar 25, 2026Updated 3 months ago
- Implement your own small language model☆11Jun 15, 2024Updated 2 years ago
- Learning to route instances for Human vs AI Feedback (ACL Main '25)☆29Jul 23, 2025Updated 11 months ago
- ☆1,156Jan 10, 2026Updated 5 months ago
- Azure Command-Line Interface☆15Mar 26, 2026Updated 3 months ago
- Explore and Control with Adversarial Surprise☆10Jul 20, 2021Updated 4 years ago
- Code for Bayesian inference for queueing networks with incomplete data☆12Jul 5, 2017Updated 8 years ago
- 🌏 Modular retrievers for zero-shot multilingual IR.☆30Mar 6, 2024Updated 2 years ago
- Organize the Web: Constructing Domains Enhances Pre-Training Data Curation☆81May 2, 2025Updated last year
- Chef cookbooks for managing a Ceph cluster☆12Apr 2, 2023Updated 3 years ago
- Code for paper 'Data-Efficient FineTuning'☆28May 24, 2023Updated 3 years ago
- Information about Wantedly internships and new-graduate hiring☆11Apr 5, 2022Updated 4 years ago
- [ICML 2025🔥] ParallelComp: Parallel Long-Context Compressor for Length Extrapolation☆30Jun 16, 2025Updated last year