sunblaze-ucb/rl-grok-recipe

Code repository for "RL Grokking Recipe: How RL Unlocks and Transfers New Algorithms in LLMs"

☆35

Alternatives and similar repositories for rl-grok-recipe

Users interested in rl-grok-recipe are comparing it to the repositories listed below.

rdi-berkeley / awesome-RLVR-boundary
A curated list of resources on Reinforcement Learning with Verifiable Rewards (RLVR) and the reasoning capability boundary of Large Langu…
☆89 · Dec 12, 2025 · Updated 7 months ago
bethgelab / delta-belief-rl
Official implementation of the ΔBelief-RL method.
☆31 · Feb 28, 2026 · Updated 5 months ago
dtch1997 / steering-bench
Official codebase for "Analyzing the Generalization and Reliability of Steering Vectors"
☆22 · Dec 14, 2024 · Updated last year
neulab / SWE-Playground
Official Repository for "Training Versatile Coding Agents in Synthetic Environments"
☆22 · Jan 11, 2026 · Updated 6 months ago
OPTML-Group / Unlearn-Smooth
[ICML25] Official repo for "Towards LLM Unlearning Resilient to Relearning Attacks: A Sharpness-Aware Minimization Perspective and Beyond…
☆24 · Sep 27, 2025 · Updated 10 months ago
runchu-tian / LongPiBench
The repository for the paper "Distance between Relevant Information Pieces Causes Bias in Long-Context LLMs"
☆14 · Dec 16, 2024 · Updated last year
VITA-Group / Junk_DNA_Hypothesis
[ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So…
☆16 · Apr 21, 2025 · Updated last year
mandyyyyii / east
☆19 · Aug 4, 2025 · Updated 11 months ago
rycolab / kl-rb
This repository contains code for the paper "Better Estimation of the KL Divergence Between Language Models"
☆19 · May 30, 2025 · Updated last year
RUCAIBox / SWE-World
☆49 · Mar 6, 2026 · Updated 4 months ago
tmlr-group / G-effect
[ICLR 2025] "Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond"
☆16 · Feb 27, 2025 · Updated last year
answers111 / alpha-research
Repo for "AlphaResearch: Accelerating New Algorithm Discovery with Language Models"
☆58 · Nov 12, 2025 · Updated 8 months ago
willccbb / localchat
☆13 · Apr 16, 2025 · Updated last year
brucewlee / self-incrimination
Code used for "Training Agents to Self-Report Misbehavior"
☆18 · Feb 27, 2026 · Updated 5 months ago
EvanZhuang / knowledge_flow
Official Implementation of Knowledge Flow Prompting
☆35 · Oct 20, 2025 · Updated 9 months ago
ml-postech / selective-generation
☆11 · Dec 8, 2024 · Updated last year
jkatzsam / woods_ood
☆16 · May 25, 2022 · Updated 4 years ago
zaydzuhri / flame
Fork of Flame repo for training of some new stuff in development
☆20 · Jul 15, 2026 · Updated 2 weeks ago
abdelfattah-lab / smcsd
Sequential Monte Carlo Speculative Decoding
☆52 · Updated this week
horizon-llm / OpenKimi
[ICML2026] Reproduce Kimi K1.5/K2 RL algorithm and rollout system
☆19 · Apr 9, 2026 · Updated 3 months ago
LYang-666 / TRGP
[ICLR 2022] Official Code Repository for "TRGP: TRUST REGION GRADIENT PROJECTION FOR CONTINUAL LEARNING"
☆22 · Oct 5, 2022 · Updated 3 years ago
jthickstun / watermark
Code for watermarking language models
☆88 · Sep 7, 2024 · Updated last year
slavachalnev / SAE-TS
Improving Steering Vectors by Targeting Sparse Autoencoder Features
☆29 · Nov 20, 2024 · Updated last year
GONGSHUKAI / USPTO_LLM
[WWW 25] USPTO-LLM: A Large Language Model-Assisted Information-enriched Chemical Reaction Dataset
☆19 · Dec 12, 2024 · Updated last year
sparkle-reasoning / sparkle
[NeurIPS'25] Beyond Accuracy: Dissecting Mathematical Reasoning for LLMs Under Reinforcement Learning
☆16 · Dec 12, 2025 · Updated 7 months ago
seamoke / DPH-RL
This is the official implementation of paper "The Choice of Divergence: A Neglected Key to Mitigating Diversity Collapse in Reinforcement…
☆20 · Feb 10, 2026 · Updated 5 months ago
genlm / genlm-backend
High-performance backend for language model probabilistic programs
☆17 · Updated this week
aypan17 / latentqa
☆34 · Nov 16, 2025 · Updated 8 months ago
ordavid-s / snmf-mlp-decomposition
☆16 · Jul 7, 2026 · Updated 3 weeks ago
tilde-research / nitrobrew-release
Fused KL divergence from hidden states for knowledge distillation
☆20 · Apr 28, 2026 · Updated 3 months ago
ypwang61 / One-Shot-RLVR
[NeurIPS 2025] Reinforcement Learning for Reasoning in Large Language Models with One Training Example
☆444 · Mar 11, 2026 · Updated 4 months ago
michaelbzhu / lora-without-regret
☆47 · Oct 23, 2025 · Updated 9 months ago
facebookresearch / dualformer
implementation of dualformer
☆25 · Mar 1, 2025 · Updated last year
noanabeshima / matryoshka-saes
☆33 · Nov 28, 2024 · Updated last year
jaechan-repo / muse_bench
☆33 · Aug 9, 2024 · Updated last year
goodfire-ai / memorization_kfac
☆29 · Nov 6, 2025 · Updated 8 months ago
Interplay-LM-Reasoning / Interplay-LM-Reasoning
[ICML 2026 Spotlight] On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models
☆164 · Jun 8, 2026 · Updated last month
alex-damian / EOS
☆15 · Sep 29, 2022 · Updated 3 years ago
maszhongming / ReactionMiner
Repository for the EMNLP 2023 Demo Paper "Reaction Miner: An Integrated System for Chemical Reaction Extraction from Textual Data"
☆19 · Jan 27, 2025 · Updated last year