Joshua-Ren/Learning_dynamics_LLM

☆225

Alternatives and similar repositories for Learning_dynamics_LLM

Users who are interested in Learning_dynamics_LLM are comparing it to the libraries listed below.

jianghoucheng / AlphaEdit
AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models, ICLR 2025 (Outstanding Paper)
☆453 · Oct 15, 2025 · Updated 9 months ago
ablghtianyi / ICL_Modular_Arithmetic
☆19 · Mar 25, 2025 · Updated last year
Simplified-Reasoning / LUFFY
Official Repository of "Learning to Reason under Off-Policy Guidance"
☆461 · Mar 20, 2026 · Updated 4 months ago
czp16 / Bridge-LLM-reasoning
Behavior Injection: Preparing Language Models for Reinforcement Learning (NeurIPS 2025)
☆17 · Jul 1, 2025 · Updated last year
princeton-nlp / LESS
[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
☆532 · Oct 20, 2024 · Updated last year
yu-rp / NeuralLineage
Code for CVPR 2024 Oral "Neural Lineage"
☆17 · Jun 18, 2024 · Updated 2 years ago
ruixin31 / Spurious_Rewards
☆361 · Jul 29, 2025 · Updated 11 months ago
liutianlin0121 / decoding-time-realignment
Implementation of "Decoding-time Realignment of Language Models", ICML 2024.
☆21 · Jun 17, 2024 · Updated 2 years ago
lasgroup / SDPO
Reinforcement Learning via Self-Distillation (SDPO)
☆1,027 · Jul 1, 2026 · Updated 3 weeks ago
hkust-nlp / simpleRL-reason
Simple RL training for reasoning
☆3,871 · Dec 23, 2025 · Updated 7 months ago
locuslab / massive-activations
Code accompanying the paper "Massive Activations in Large Language Models"
☆202 · Mar 4, 2024 · Updated 2 years ago
VainF / Remix-DiT
☆18 · Dec 11, 2024 · Updated last year
poloclub / llm-landscape
NeurIPS'24 - LLM Safety Landscape
☆40 · Oct 21, 2025 · Updated 9 months ago
facebookresearch / iGSM
The code for creating the iGSM datasets in papers "Physics of Language Models Part 2.1, Grade-School Math and the Hidden Reasoning Proces…
☆88 · Jan 12, 2025 · Updated last year
LiuAmber / RAHF
[ACL 2024 main] Aligning Large Language Models with Human Preferences through Representation Engineering (https://aclanthology.org/2024.…
☆28 · Sep 25, 2024 · Updated last year
verl-project / verl
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
☆22,699 · Updated this week
Huage001 / StyDeSty
PyTorch implementation of paper "StyDeSty: Min-Max Stylization and Destylization for Single Domain Generalization" in ICML 2024.
☆16 · Jun 4, 2024 · Updated 2 years ago
StigLidu / CodeGym
[ICLR 2026] The official repository for the CodeGym project: "Generalizable End-to-End Tool-Use RL with Synthetic CodeGym"
☆35 · Oct 14, 2025 · Updated 9 months ago
ypwang61 / One-Shot-RLVR
[NeurIPS 2025] Reinforcement Learning for Reasoning in Large Language Models with One Training Example
☆444 · Mar 11, 2026 · Updated 4 months ago
PRIME-RL / Entropy-Mechanism-of-RL
The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning.
☆446 · Jul 11, 2025 · Updated last year
ZhentingWang / DUMP
☆33 · May 9, 2025 · Updated last year
AngelaZZZ-611 / reasoning_models_probing
☆22 · May 14, 2026 · Updated 2 months ago
runopti / SpatialEvalLLM
Code for "Evaluating Spatial Understanding of Large Language Models" TMLR 2024.
☆16 · Feb 22, 2024 · Updated 2 years ago
RLHFlow / Minimal-RL
☆275 · May 14, 2025 · Updated last year
y0mingzhang / diffuse-distributions
Forcing Diffuse Distributions out of Language Models
☆18 · Sep 10, 2024 · Updated last year
mlwu22 / RED
Implementation code for ACL 2024: Advancing Parameter Efficiency in Fine-tuning via Representation Editing
☆15 · Apr 20, 2024 · Updated 2 years ago
GraySwanAI / circuit-breakers
Improving Alignment and Robustness with Circuit Breakers
☆266 · Sep 24, 2024 · Updated last year
ZBox1005 / CoT-UQ
[ACL 2025] "CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought"
☆17 · Apr 3, 2025 · Updated last year
princetonvisualai / icons
☆22 · Apr 24, 2025 · Updated last year
Interplay-LM-Reasoning / Interplay-LM-Reasoning
[ICML 2026 Spotlight] On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models
☆164 · Jun 8, 2026 · Updated last month
florinshen / PlaneDreamer
DreamGaussian with 2D-GS
☆12 · Oct 10, 2024 · Updated last year
hkust-nlp / dart-math
[NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving*
☆120 · Dec 10, 2024 · Updated last year
facebookresearch / coconut
Training Large Language Model to Reason in a Continuous Latent Space
☆1,667 · Jul 2, 2026 · Updated 3 weeks ago
wenzhe-li / Self-MoA
☆17 · Feb 4, 2025 · Updated last year
BunsenFeng / AbstainQA
AbstainQA, ACL 2024
☆29 · Feb 4, 2026 · Updated 5 months ago
PRIME-RL / TTRL
[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning
☆1,103 · Apr 15, 2026 · Updated 3 months ago
BytedTsinghua-SIA / DAPO
An Open-source RL System from ByteDance Seed and Tsinghua AIR
☆1,849 · May 11, 2025 · Updated last year
gl-ybnbxb / BoNBoN
☆19 · Jun 3, 2024 · Updated 2 years ago
OpenRLHF / OpenRLHF
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Asy…
☆9,855 · Jul 14, 2026 · Updated 2 weeks ago