satrams/rent-rl

RENT (Reinforcement Learning via Entropy Minimization) is an unsupervised method for training reasoning LLMs.

☆42

Alternatives and similar repositories for rent-rl

Users who are interested in rent-rl are comparing it to the libraries listed below.

THU-KEG / LRM-FactEval
☆17 · Jun 25, 2025 · Updated last year
sunblaze-ucb / Intuitor
[ICLR 2026] Learning to Reason without External Rewards
☆418 · Jan 26, 2026 · Updated 5 months ago
Infini-AI-Lab / GRESO
☆81 · Jun 8, 2026 · Updated last month
violetxi / ExpRL
☆22 · Jun 16, 2026 · Updated last month
zjunlp / KnowRL
KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality
☆48 · May 19, 2026 · Updated 2 months ago
AlphaLab-USTC / LRM-plans-CoT
[NeurIPS 2025] The implementation of paper "On Reasoning Strength Planning in Large Reasoning Models"
☆31 · Jul 6, 2025 · Updated last year
armingh2000 / FactScoreLite
FactScoreLite is an implementation of the FactScore metric, designed for detailed accuracy assessment in text generation. This package bu…
☆14 · Apr 25, 2024 · Updated 2 years ago
Line-Kite / GraphLayoutLM
☆14 · Sep 6, 2024 · Updated last year
sail-sg / VeriFree
Reinforcing General Reasoning without Verifiers
☆102 · Jun 24, 2025 · Updated last year
chenjianhuii / Mechanistic-Data-Attribution
☆16 · May 25, 2026 · Updated 2 months ago
zwhong714 / Hybrid-Policy-Distillation
[ICML 2026] Hybrid Policy Distillation (HPD) is a practical distillation framework for reasoning-oriented language models. This repositor…
☆24 · Apr 24, 2026 · Updated 3 months ago
fjzzq2002 / WeightWatch
Official Repository of Paper "Watch the Weights: Unsupervised monitoring and control of fine-tuned LLMs"
☆15 · Sep 25, 2025 · Updated 10 months ago
liziniu / cold_start_rl
Code for Blog Post: Can Better Cold-Start Strategies Improve RL Training for LLMs?
☆20 · Mar 9, 2025 · Updated last year
ellaneeman / disent_qa
This code accompanies the paper DisentQA: Disentangling Parametric and Contextual Knowledge with Counterfactual Question Answering.
☆16 · Mar 20, 2023 · Updated 3 years ago
MetricsDI / DIMetrics
☆10 · May 25, 2022 · Updated 4 years ago
ylsung / rsq
Code for "RSQ: Learning from Important Tokens Leads to Better Quantized LLMs"
☆23 · Mar 25, 2026 · Updated 4 months ago
tor4z / awesome-confidence-calibration
awesome confidence calibration paper list
☆25 · Oct 21, 2021 · Updated 4 years ago
ibisbill / Transferability-of-LLM-Reasoning
☆111 · Jul 6, 2026 · Updated 2 weeks ago
xiwenc1 / DRA-GRPO
Official code for the paper: DRA-GRPO: Exploring Diversity-Aware Reward Adjustment for R1-Zero-Like Training of Large Language Models
☆24 · Jan 6, 2026 · Updated 6 months ago
jwhj / OREO
☆116 · Jan 21, 2025 · Updated last year
goombalab / Gather-and-Aggregate
Experiments Notebook of "Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism"
☆16 · Apr 30, 2025 · Updated last year
test-time-interaction / TTI
☆76 · Jun 10, 2025 · Updated last year
AmourWaltz / Awesome-Reliable-LLM
☆193 · Mar 8, 2026 · Updated 4 months ago
TianHongZXY / RLVR-Decomposed
[NeurIPS 2025] Implementation for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning"
☆166 · Mar 2, 2026 · Updated 4 months ago
janphilippfranken / sami
Self-Supervised Alignment with Mutual Information
☆20 · May 24, 2024 · Updated 2 years ago
yueyu1030 / actune
[NAACL 2022] This is the code repo for our paper `ACTUNE: Uncertainty-based Active Self-Training for Active Fine-Tuning of Pretrained Lan…
☆15 · Nov 16, 2022 · Updated 3 years ago
Zanette-Labs / speed-rl
☆18 · Feb 2, 2026 · Updated 5 months ago
IanYHWu / rc
Public-facing codebase accompanying: "Reasoning Cache: Continual Improvement Over Long Horizons via Short-Horizon RL"
☆36 · Feb 6, 2026 · Updated 5 months ago
NJUNLP / PATS
☆46 · May 27, 2025 · Updated last year
HKUNLP / critic-rl
[ICML 2025] Teaching Language Models to Critique via Reinforcement Learning
☆127 · May 6, 2025 · Updated last year
Jamesding000 / MemGen-GR
The code implementation for our KDD 2026 Oral paper "How Well Does Generative Recommendation Generalize?"
☆40 · Jun 2, 2026 · Updated last month
zorazrw / agent-skill-induction
Agent Skill Induction: "Inducing Programmatic Skills for Agentic Tasks"
☆42 · Apr 24, 2025 · Updated last year
EvanZhuang / mixinputs
Official implementation for Text Generation Beyond Discrete Token Sampling
☆26 · Aug 11, 2025 · Updated 11 months ago
QingyangZhang / Label-Free-RLVR
☆311 · Jul 6, 2025 · Updated last year
JoshEngels / SAE-Probes
Code for reproducing our paper "Are Sparse Autoencoders Useful? A Case Study in Sparse Probing"
☆33 · Mar 31, 2025 · Updated last year
THUKElab / MixEdit
The repository of EMNLP 2023 "MixEdit: Revisiting Data Augmentation and Beyond for Grammatical Error Correction"
☆12 · Nov 25, 2023 · Updated 2 years ago
William030422 / Video-Sycophancy
Implementation for paper Flattery in Motion: Benchmarking and Analyzing Sycophancy in Video-LLMs, which is accepted by ACL 2026 (main con…
☆16 · Oct 10, 2025 · Updated 9 months ago
icip-cas / AutoAlign
A toolkit for automated alignment research.
☆15 · Jul 3, 2026 · Updated 3 weeks ago
ZhangXJ199 / EDGE-GRPO
Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity
☆22 · Aug 28, 2025 · Updated 10 months ago