The official repo for "CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models"
☆29Updated this week
Alternatives and similar repositories for CodeScaler
Users who are interested in CodeScaler are comparing it to the libraries listed below
- Code repo for FaStfact: Faster, Stronger Long-Form Factuality Evaluations in LLMs.☆32Nov 5, 2025Updated 3 months ago
- [NeurIPS 2025] What Makes a Reward Model a Good Teacher? An Optimization Perspective☆42Sep 18, 2025Updated 5 months ago
- ☆10Nov 8, 2022Updated 3 years ago
- vTPM with SGX protection☆11May 30, 2019Updated 6 years ago
- [ICLR 2026] ParallelBench: Understanding the Tradeoffs of Parallel Decoding in Diffusion LLMs☆30Updated this week
- ☆10Nov 1, 2022Updated 3 years ago
- unsloth-5090-multiple☆60May 21, 2025Updated 9 months ago
- A collection of heat engines, based on the OpenAI Gym environment framework for use with reinforcement learning applications.☆15Dec 20, 2021Updated 4 years ago
- ☆14Mar 21, 2024Updated last year
- ☆22Jan 25, 2026Updated last month
- Teaching a humanoid to walk(ish), then displaying in your browser (using tensorflow.js and reinforcement learning)☆10Sep 7, 2020Updated 5 years ago
- [ICLR 2026] Learning to Parallel: Accelerating Diffusion Large Language Models via Learnable Parallel Decoding☆30Jan 27, 2026Updated last month
- Code release for "Imagination Mechanism: Mesh Information Propagation for Enhancing Data Efficiency in Reinforcement Learning"☆13Oct 7, 2023Updated 2 years ago
- The official implementations of Noise-Informed Diffusion-Generated Image Detection With Anomaly Attention (TIFS 2025)☆18Jun 23, 2025Updated 8 months ago
- The code for the paper "A Bayesian Approach to Online Planning" published in ICML 2024.☆13Jun 17, 2024Updated last year
- code for polite☆11Feb 28, 2024Updated 2 years ago
- ☆16Feb 22, 2025Updated last year
- DreamSmooth: Improving Model-Based RL with Reward Smoothing (ICLR 2024)☆12May 6, 2024Updated last year
- ☆11Jan 11, 2022Updated 4 years ago
- Isaac Gym Reinforcement Learning Environments for humanoid robot Bez☆10Jul 27, 2022Updated 3 years ago
- libtpms / swtpm software emulation of a Trusted Platform Module (TPM 1.2 and TPM 2.0) compile script☆13Sep 16, 2020Updated 5 years ago
- ☆13May 3, 2024Updated last year
- Neural Networks for penetration testing. Part of active research.☆13Jun 21, 2022Updated 3 years ago
- Code for the arxiv paper: Complex Claim Verification with Evidence Retrieved in the Wild☆13Nov 27, 2023Updated 2 years ago
- ☆18Dec 9, 2025Updated 2 months ago
- Finding Camouflaged Needle in a Haystack? Pornographic Products Detection via Berrypicking Tree Model☆10Jul 29, 2019Updated 6 years ago
- ☆11May 29, 2025Updated 9 months ago
- Aggregated project of artificial intelligence and machine learning methods☆11Oct 16, 2017Updated 8 years ago
- Code for the paper "FinRLlama: A Solution to LLM-Engineered Signals Challenge at FinRL Contest 2024"☆13Feb 14, 2025Updated last year
- Neural network sequence labeling model - some sloppy modifications to the original toolkit to enable punctuation restoration in unsegment…☆10Jan 8, 2017Updated 9 years ago
- DEPRECATED: The CoreOS logging package☆11Jun 22, 2022Updated 3 years ago
- My Very Own Deep Multiple Layered Echo State Network☆13Jan 2, 2021Updated 5 years ago
- Official Code Repo for the paper "Learning to Play Atari in a World of Tokens" accepted at ICML, 2024☆11Jun 6, 2024Updated last year
- The official code release for "Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning", ICLR 2025☆13May 28, 2025Updated 9 months ago
- Submission Under Review☆17May 15, 2025Updated 9 months ago
- Recursive Self-Aggregation evals on ARC-AGI☆28Jan 26, 2026Updated last month
- Concise Reasoning via Reinforcement Learning☆13Apr 16, 2025Updated 10 months ago
- paper on dexpilot☆15Oct 14, 2019Updated 6 years ago
- unifloc on python☆15Nov 14, 2020Updated 5 years ago