nick7nlp/FastCuRL

FastCuRL: Curriculum Reinforcement Learning with Stage-wise Context Scaling for Efficient LLM Reasoning (EMNLP 2025)

☆61

Alternatives and similar repositories for FastCuRL

Users who are interested in FastCuRL are comparing it to the repositories listed below.

Linking-ai / SCOPE
(ACL 2025 oral) SCOPE: Optimizing KV Cache Compression in Long-context Generation
☆36 · May 28, 2025 · Updated last year
aeroplanepaper / GRPO-LEAD
☆40 · Nov 18, 2025 · Updated 8 months ago
shuzhangzhong / HybriMoE-Preview
☆17 · Apr 9, 2025 · Updated last year
UKPLab / arxiv2025-inherent-limits-plms
Code repository for the paper "The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Le…
☆14 · Jan 16, 2025 · Updated last year
Optimization-AI / DisCO
NeurIPS 2025: Discriminative Constrained Optimization for Reinforcing Large Reasoning Models
☆53 · Mar 14, 2026 · Updated 4 months ago
Zanette-Labs / SpeculativeRejection
[NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection
☆56 · Oct 29, 2024 · Updated last year
yale-nlp / refdpo
☆16 · Jul 23, 2024 · Updated 2 years ago
psunlpgroup / FoVer
This repository includes code and materials for the paper "Efficient PRM Training Data Synthesis via Formal Verification" (ACL 2026 Findi…
☆18 · Apr 7, 2026 · Updated 3 months ago
junfeng0288 / MathReal
☆15 · Aug 11, 2025 · Updated 11 months ago
mukhal / ThinkPRM
[TMLR] Process Reward Models That Think
☆89 · Nov 29, 2025 · Updated 7 months ago
lblankl / Short-RL
Short RL
☆19 · Apr 16, 2026 · Updated 3 months ago
zwhe99 / DeepMath
A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning
☆294 · Sep 25, 2025 · Updated 9 months ago
nishadsinghi / sc-genrm-scaling
[COLM 2025] Official code for "When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoni…
☆15 · Oct 31, 2025 · Updated 8 months ago
mutonix / pyramidinfer
☆47 · Nov 25, 2024 · Updated last year
Trae1ounG / BuPO
[arXiv: 2512.19673] Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies
☆60 · Feb 6, 2026 · Updated 5 months ago
Tim-Siu / reinforcement-distillation
Code repo for "Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning"
☆33 · Jul 25, 2025 · Updated 11 months ago
sail-sg / ActivePRM
☆21 · Apr 16, 2025 · Updated last year
foreverlasting1202 / QuestA
☆22 · Jan 2, 2026 · Updated 6 months ago
GAIR-NLP / LIMOPro
☆15 · May 27, 2025 · Updated last year
Hritikbansal / sparse_feedback
☆29 · Jan 23, 2024 · Updated 2 years ago
jinzhuoran / RAG-RewardBench
RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment
☆18 · Dec 19, 2024 · Updated last year
RyanLiu112 / GenPRM
[AAAI 2026] Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".
☆102 · Nov 8, 2025 · Updated 8 months ago
GeniusHTX / TALE
☆151 · Sep 12, 2025 · Updated 10 months ago
Kwai-Klear / RLEP
RL with Experience Replay
☆59 · Jul 27, 2025 · Updated 11 months ago
1KE-JI / UPFT
Official resources of "The First Few Tokens Are All You Need: An Efficient and Effective Unsupervised Prefix Fine-Tuning Method for Reaso…
☆20 · Jun 13, 2025 · Updated last year
shiweijiezero / R3L
☆23 · Apr 5, 2026 · Updated 3 months ago
ShadeCloak / ADORA
☆47 · Apr 9, 2025 · Updated last year
martenlienen / bsi
Generative Modeling with Bayesian Sample Inference
☆24 · May 17, 2025 · Updated last year
lzhxmu / CPPO
CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models (NeurIPS 2025)
☆181 · Nov 4, 2025 · Updated 8 months ago
Infini-AI-Lab / M2PO
☆32 · Oct 8, 2025 · Updated 9 months ago
WooooDyy / LLM-Reverse-Curriculum-RL
Implementation of the ICML 2024 paper "Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning" pr…
☆116 · Feb 9, 2024 · Updated 2 years ago
sail-sg / Rigging-ChatbotArena
Improving Your Model Ranking on Chatbot Arena by Vote Rigging (ICML 2025)
☆27 · Feb 25, 2025 · Updated last year
sunjie279 / SimCT-
☆21 · May 22, 2026 · Updated 2 months ago
AMAP-ML / GPG
[ICLR 2026] GPG: A Simple and Strong Reinforcement Learning Baseline for Model Reasoning
☆179 · Jan 29, 2026 · Updated 5 months ago
ZBox1005 / CoT-UQ
[ACL 2025] "CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought"
☆17 · Apr 3, 2025 · Updated last year
JIA-Lab-research / Step-DPO
Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"
☆398 · Jan 19, 2025 · Updated last year
TergelMunkhbat / concise-reasoning
Code for the paper "Self-Training Elicits Concise Reasoning in Large Language Models"
☆44 · Apr 22, 2025 · Updated last year
hustvl / MaTVLM
☆62 · May 13, 2025 · Updated last year
tianyi-lab / C3PO
[COLM 2025] "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing"
☆21 · Apr 9, 2025 · Updated last year