richardodliu / OpenCodeEval
★47 · Updated last month
Alternatives and similar repositories for OpenCodeEval
Users who are interested in OpenCodeEval are comparing it to the libraries listed below.
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection ★51 · Updated 11 months ago
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* ★115 · Updated 10 months ago
- Async pipelined version of Verl ★119 · Updated 6 months ago
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)" ★231 · Updated last month
- LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation ★28 · Updated last week
- Reproducing R1 for Code with Reliable Rewards ★259 · Updated 5 months ago
- Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning" ★174 · Updated 4 months ago
- Evaluation utilities based on SymPy. ★20 · Updated 10 months ago
- Repository of LV-Eval Benchmark ★70 · Updated last year
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ★114 · Updated 5 months ago
- ★118 · Updated 4 months ago
- Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton ★32 · Updated 8 months ago
- Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main) ★110 · Updated 6 months ago
- CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings ★55 · Updated 8 months ago
- Based on the R1-Zero method, using rule-based rewards and GRPO on the Code Contests dataset. ★18 · Updated 5 months ago
- The code and data for the paper JiuZhang3.0 ★49 · Updated last year
- ★43 · Updated 5 months ago
- The official repository of the Omni-MATH benchmark. ★88 · Updated 9 months ago
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling ★177 · Updated 2 months ago
- General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25] ★185 · Updated 4 months ago
- LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification ★64 · Updated 3 months ago
- [ACL 2024] Long-Context Language Modeling with Parallel Encodings ★163 · Updated last year
- ★27 · Updated 2 weeks ago
- ★65 · Updated 10 months ago
- Code for the preprint "Cache Me If You Can: How Many KVs Do You Need for Effective Long-Context LMs?" ★46 · Updated 2 months ago
- SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution ★89 · Updated 3 weeks ago
- Code and Data for "Long-context LLMs Struggle with Long In-context Learning" [TMLR2025] ★108 · Updated 7 months ago
- ACL 2024 | LooGLE: Long Context Evaluation for Long-Context Language Models ★184 · Updated last year
- Implementation for FP8/INT8 Rollout for RL training without performance drop. ★260 · Updated 2 weeks ago
- PoC for "SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning" [NeurIPS '25] ★53 · Updated 2 weeks ago