shangshang-wang/Tina

[ICLR 2026] Tina: Tiny Reasoning Models via LoRA

☆338

Alternatives and similar repositories for Tina

Users who are interested in Tina are comparing it to the repositories listed below.

sail-sg / VeriFree
Reinforcing General Reasoning without Verifiers
☆102 · Jun 24, 2025 · Updated last year
GAIR-NLP / OctoThinker
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
☆189 · Jul 23, 2025 · Updated 11 months ago
shangshang-wang / Tora
Tora: Torchtune-LoRA for RL
☆87 · Dec 2, 2025 · Updated 7 months ago
TIGER-AI-Lab / General-Reasoner
General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25]
☆228 · Nov 27, 2025 · Updated 7 months ago
sail-sg / understand-r1-zero
Understanding R1-Zero-Like Training: A Critical Perspective
☆1,267 · Aug 27, 2025 · Updated 10 months ago
ypwang61 / One-Shot-RLVR
[NeurIPS 2025] Reinforcement Learning for Reasoning in Large Language Models with One Training Example
☆444 · Mar 11, 2026 · Updated 4 months ago
PRIME-RL / TTRL
[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning
☆1,100 · Apr 15, 2026 · Updated 3 months ago
MikaStars39 / PeRL
PeRL: Parameter-Efficient Reinforcement Learning
☆82 · May 20, 2026 · Updated 2 months ago
shengliu66 / FractionalReason
Official github repo for "Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute"
☆17 · Jun 30, 2025 · Updated last year
RyanLiu112 / GenPRM
[AAAI 2026] Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".
☆102 · Nov 8, 2025 · Updated 8 months ago
sunblaze-ucb / Intuitor
[ICLR 2026] Learning to Reason without External Rewards
☆417 · Jan 26, 2026 · Updated 5 months ago
brendanhogan / completion_tree_view
☆15 · Apr 26, 2025 · Updated last year
sail-sg / oat
🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.
☆667 · Jan 29, 2026 · Updated 5 months ago
sail-sg / AnytimeReasoner
Optimizing Anytime Reasoning via Budget Relative Policy Optimization
☆54 · Jul 15, 2025 · Updated last year
ServiceNow / PipelineRL
A scalable asynchronous reinforcement learning implementation with in-flight weight updates.
☆427 · Updated this week
peijunallin / alphalora
☆19 · Nov 10, 2024 · Updated last year
rllm-org / rllm
Democratizing Reinforcement Learning for LLMs
☆5,708 · Updated this week
open-thought / reasoning-gym
[NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards
☆1,463 · Apr 17, 2026 · Updated 3 months ago
RUCAIBox / Slow_Thinking_with_LLMs
A series of technical report on Slow Thinking with LLM
☆767 · Aug 13, 2025 · Updated 11 months ago
hkust-nlp / simpleRL-reason
Simple RL training for reasoning
☆3,868 · Dec 23, 2025 · Updated 6 months ago
knoveleng / open-rs
[AAAI 2026] - Official repo for paper: "Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't"
☆291 · Mar 11, 2026 · Updated 4 months ago
Parallel-Reasoning / APR
[COLM 2025] Code for Paper: Learning Adaptive Parallel Reasoning with Language Models
☆144 · Dec 17, 2025 · Updated 7 months ago
shangshang-wang / Resa
Resa: Transparent Reasoning Models via SAEs
☆50 · Sep 23, 2025 · Updated 9 months ago
OpenPipe / deductive-reasoning
Train your own SOTA deductive reasoning model
☆111 · Mar 6, 2025 · Updated last year
microsoft / x-reasoner
X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains
☆49 · Feb 4, 2026 · Updated 5 months ago
Linn3a / siren
Official implementation of Selective Entropy Regularization (SIREN), proposed by paper 'Rethinking Entropy Regularization in Large Reason…
☆32 · Dec 10, 2025 · Updated 7 months ago
TIGER-AI-Lab / CritiqueFineTuning
Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate" [COLM 2025]
☆182 · Jul 8, 2025 · Updated last year
Nardien / agent-distillation
Official Code Repository for the paper "Distilling LLM Agent into Small Models with Retrieval and Code Tools"
☆250 · Oct 22, 2025 · Updated 8 months ago
YefanZhou / TempBalance
[NeurIPS 2023 Spotlight] Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training
☆37 · Apr 7, 2025 · Updated last year
THU-KEG / PairJudgeRM
☆15 · Apr 14, 2025 · Updated last year
PeterGriffinJin / Search-R1
Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL
☆5,130 · Nov 13, 2025 · Updated 8 months ago
Simplified-Reasoning / LUFFY
Official Repository of "Learning to Reason under Off-Policy Guidance"
☆459 · Mar 20, 2026 · Updated 4 months ago
Open-Reasoner-Zero / Open-Reasoner-Zero
Official Repo for Open-Reasoner-Zero
☆2,096 · Jun 2, 2025 · Updated last year
allenai / open-instruct
AllenAI's post-training codebase
☆3,803 · Updated this week
bigai-nlco / RuleReasoner
[ICLR 2026] RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling
☆39 · Feb 25, 2026 · Updated 4 months ago
brendanhogan / picoDeepResearch
☆69 · May 23, 2025 · Updated last year
axon-rl / gem
A Gym for Agentic LLMs
☆502 · Jan 21, 2026 · Updated 6 months ago
kurakurai / Luth
Luth is a state-of-the-art series of fine-tuned LLMs for French
☆46 · Oct 12, 2025 · Updated 9 months ago
NovaSky-AI / SkyRL
SkyRL: A Modular Full-stack RL Library for LLMs
☆2,085 · Updated this week