richardodliu / OpenCodeEval
☆48 Updated 4 months ago
Alternatives and similar repositories for OpenCodeEval
Users who are interested in OpenCodeEval are comparing it to the repositories listed below.
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection ☆52 Updated last year
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)" ☆241 Updated 3 months ago
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* ☆119 Updated last year
- Async pipelined version of Verl ☆125 Updated 8 months ago
- LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation ☆33 Updated 2 months ago
- Repository of LV-Eval Benchmark ☆72 Updated last year
- Reproducing R1 for Code with Reliable Rewards ☆278 Updated 7 months ago
- Code and Data for "Long-context LLMs Struggle with Long In-context Learning" [TMLR2025] ☆110 Updated 10 months ago
- ☆124 Updated 6 months ago
- Evaluation utilities based on SymPy. ☆21 Updated last year
- Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning" ☆180 Updated 7 months ago
- Implementation of NAACL 2024 Outstanding Paper "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models" ☆152 Updated 9 months ago
- SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution ☆101 Updated 2 months ago
- Official Implementation of SAM-Decoding: Speculative Decoding via Suffix Automaton ☆39 Updated 10 months ago
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆118 Updated 7 months ago
- ACL 2024 | LooGLE: Long Context Evaluation for Long-Context Language Models ☆193 Updated last year
- Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (EMNLP 2024 main) ☆112 Updated 9 months ago
- ☆48 Updated 7 months ago
- Based on the R1-Zero method, using rule-based rewards and GRPO on the Code Contests dataset. ☆18 Updated 7 months ago
- Code for the preprint "Cache Me If You Can: How Many KVs Do You Need for Effective Long-Context LMs?" ☆47 Updated 4 months ago
- [ACL 2024] Long-Context Language Modeling with Parallel Encodings ☆167 Updated last year
- Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?" ☆107 Updated 2 months ago
- ☆20 Updated 2 months ago
- CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings ☆58 Updated 10 months ago
- The official repository of the Omni-MATH benchmark. ☆88 Updated 11 months ago
- This repo is to demo the concept of lossless compression with Transformers as encoder and decoder. ☆14 Updated last year
- GenRM-CoT: Data release for verification rationales ☆66 Updated last year
- ☆78 Updated 9 months ago
- ☆76 Updated last year
- "Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding" Zhenyu Zhang, Runjin Chen, Shiw… ☆31 Updated last year