HKUNLP/critic-rl

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/HKUNLP/critic-rl)

HKUNLP / critic-rl

[ICML 2025] Teaching Language Models to Critique via Reinforcement Learning

☆126

Alternatives and similar repositories for critic-rl

Users that are interested in critic-rl are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

kiaia / GIRAFFE
View on GitHub
Extending context length of visual language models
☆12Dec 18, 2024Updated last year
zhuzilin / vllm-group
View on GitHub
☆12Nov 5, 2024Updated last year
koalazf99 / nanoverl
View on GitHub
Collections of RLxLM experiments using minimal codes
☆14Feb 17, 2025Updated last year
HKUNLP / STRING
View on GitHub
[ICLR'25] Data and code for our paper "Why Does the Effective Context Length of LLMs Fall Short?"
☆82Nov 25, 2024Updated last year
ganler / code-r1
View on GitHub
Reproducing R1 for Code with Reliable Rewards
☆313May 5, 2025Updated last year
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
hkust-nlp / B-STaR
View on GitHub
B-STAR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
☆86May 21, 2025Updated last year
HKUNLP / diffusion-of-thoughts
View on GitHub
[NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models"
☆213Mar 4, 2025Updated last year
RLHFlow / Self-rewarding-reasoning-LLM
View on GitHub
Recipes to train the self-rewarding reasoning LLMs.
☆231Mar 2, 2025Updated last year
MasterVito / SwS
View on GitHub
Official Repo for SwS: A Weakness-driven Problem Synthesis Framework in RL for LLM Reasoning
☆41Nov 11, 2025Updated 8 months ago
Shark-NLP / EVALM
View on GitHub
Official codebase for “In-Context Learning with Many Demonstration Examples”
☆16Feb 13, 2023Updated 3 years ago
koalazf99 / Awesome-DataCentric-LLM
View on GitHub
Trending projects & awesome papers about data-centric llm studies.
☆40May 20, 2025Updated last year
zhaoxlpku / SubgoalXL
View on GitHub
☆26Aug 23, 2024Updated last year
chang-github-00 / LLM-Predictive-Decoding
View on GitHub
☆16Jul 9, 2025Updated last year
GAIR-NLP / OctoThinker
View on GitHub
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
☆189Jul 23, 2025Updated 11 months ago
End-to-end encrypted cloud storage - Proton Drive • Ad
Special offer: 40% Off Yearly / 80% Off First Month. Protect your most important files, photos, and documents from prying eyes.
zjunlp / KnowRL
View on GitHub
KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality
☆48May 19, 2026Updated 2 months ago
zhxieml / remiss-jailbreak
View on GitHub
☆33Jun 24, 2024Updated 2 years ago
hkust-nlp / mstar
View on GitHub
[ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning
☆75Jul 13, 2025Updated last year
zitian-gao / SC-MCTS
View on GitHub
Interpretable Contrastive Monte Carlo Tree Search Reasoning
☆52Nov 9, 2024Updated last year
SWE-Gym / SWE-Gym
View on GitHub
Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym [ICML 2025]
☆708Jul 29, 2025Updated 11 months ago
eddycmu / demystify-long-cot
View on GitHub
☆336May 31, 2025Updated last year
PRIME-RL / PRIME
View on GitHub
Scalable RL solution for advanced reasoning of language models
☆1,865Mar 18, 2025Updated last year
chenllliang / ParetoMNMT
View on GitHub
Source code for paper "On the Pareto Front of Multilingual Neural Machine Translation" @ NeurIPS 2023
☆17Sep 27, 2023Updated 2 years ago
GAIR-NLP / LIMR
View on GitHub
☆220Feb 20, 2025Updated last year
Serverless GPU API endpoints on Runpod - Get Bonus Credits • Ad
Skip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
RUCAIBox / Slow_Thinking_with_LLMs
View on GitHub
A series of technical report on Slow Thinking with LLM
☆767Aug 13, 2025Updated 11 months ago
TIGER-AI-Lab / CritiqueFineTuning
View on GitHub
Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate" [COLM 2025]
☆182Jul 8, 2025Updated last year
all-the-noises / eval-arena
View on GitHub
☆34Mar 21, 2026Updated 4 months ago
Open-Reasoner-Zero / Open-Reasoner-Zero
View on GitHub
Official Repo for Open-Reasoner-Zero
☆2,096Jun 2, 2025Updated last year
byronBBL / Context-DPO
View on GitHub
Official repository of paper "Context-DPO: Aligning Language Models for Context-Faithfulness"
☆23Feb 17, 2025Updated last year
GAIR-NLP / weak-to-strong-reasoning
View on GitHub
☆59Sep 2, 2024Updated last year
hkust-nlp / simpleRL-reason
View on GitHub
Simple RL training for reasoning
☆3,868Dec 23, 2025Updated 6 months ago
hughbzhang / o1_inference_scaling_laws
View on GitHub
Replicating O1 inference-time scaling laws
☆94Dec 1, 2024Updated last year
hanningzhang / prm
View on GitHub
☆17Nov 3, 2024Updated last year
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
HKUNLP / ChunkLlama
View on GitHub
[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"
☆450Oct 16, 2024Updated last year
NineAbyss / S2R
View on GitHub
This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning"
☆76Apr 22, 2025Updated last year
PRIME-RL / ImplicitPRM
View on GitHub
Repo of paper "Free Process Rewards without Process Labels"
☆172Mar 14, 2025Updated last year
DreamLM / DreamOn
View on GitHub
Diffusion Language Models For Code Infilling Beyond Fixed-size Canvas
☆118Feb 3, 2026Updated 5 months ago
RUCAIBox / JiuZhang3.0
View on GitHub
The code and data for the paper JiuZhang3.0
☆49May 26, 2024Updated 2 years ago
uservan / ThinkPO
View on GitHub
☆17Aug 1, 2025Updated 11 months ago
Timothyxxx / KVCachePapers
View on GitHub
☆20May 24, 2024Updated 2 years ago