tajwarfahim/srt



Official implementation for the paper "Can Large Reasoning Models Self-Train?"

☆76

Alternatives and similar repositories for srt

Users who are interested in srt are comparing it to the repositories listed below.


sunblaze-ucb / Intuitor
[ICLR 2026] Learning to Reason without External Rewards
☆420 · Jan 26, 2026 · Updated 6 months ago
waltonfuture / MM-UPT
[NeurIPS 2025] First SFT, Second RL, Third UPT: Continual Improving Multi-Modal LLM Reasoning via Unsupervised Post-Training
☆88 · Oct 29, 2025 · Updated 9 months ago
Levi-Ackman / LiNo
Official implementation of paper: LiNo: Advancing Recursive Residual Decomposition of Linear and Nonlinear Patterns for Robust Time Serie…
☆18 · Dec 19, 2025 · Updated 7 months ago
PRIME-RL / TTRL
[NeurIPS 2025] TTRL: Test-Time Reinforcement Learning
☆1,104 · Apr 15, 2026 · Updated 3 months ago
mbzuai-oryx / EvoLMM
Self Evolving Large Multimodal Models with Continuous Rewards
☆25 · Jun 9, 2026 · Updated last month
ypwang61 / One-Shot-RLVR
[NeurIPS 2025] Reinforcement Learning for Reasoning in Large Language Models with One Training Example
☆444 · Mar 11, 2026 · Updated 4 months ago
zhaosnw / evo_mem
☆18 · Dec 21, 2025 · Updated 7 months ago
waterhorse1 / Natural-language-RL
Natural Language Reinforcement Learning
☆101 · Jul 30, 2025 · Updated 11 months ago
chentong0 / rl-binary-rar
Official repo for "Binary Retrieval-augmented Reward Mitigates Hallucinations"
☆15 · Nov 13, 2025 · Updated 8 months ago
sail-sg / VeriFree
Reinforcing General Reasoning without Verifiers
☆102 · Jun 24, 2025 · Updated last year
TianHongZXY / RLVR-Decomposed
[NeurIPS 2025] Implementation for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning"
☆166 · Mar 2, 2026 · Updated 4 months ago
OmkarThawakar / Self-Learning-Robot
Reinforcement Training of Robot
☆11 · Dec 1, 2019 · Updated 6 years ago
wangxu0820 / NegativePrompt
The official GitHub page for paper "NegativePrompt: Leveraging Psychology for Large Language Models Enhancement via Negative Emotional St…
☆25 · May 10, 2024 · Updated 2 years ago
satori-reasoning / Satori-SWE
☆21 · May 30, 2025 · Updated last year
OoDBag / VisTA
VisualToolAgent (VisTA): A Reinforcement Learning Framework for Visual Tool Selection
☆27 · May 31, 2025 · Updated last year
CommissarSilver / CVT
This repository contains the replication package of our paper "Assessing the Security of GitHub Copilot’s Generated Code - A Targeted Rep…
☆10 · Nov 16, 2023 · Updated 2 years ago
stanfordnlp / multi-distribution-retrieval
Code for our paper Resources and Evaluations for Multi-Distribution Dense Information Retrieval
☆17 · Jan 16, 2024 · Updated 2 years ago
WEIRDLabUW / dispo
Distributional Successor Features Enable Zero-Shot Policy Optimization
☆15 · Apr 11, 2025 · Updated last year
ChnQ / TracingLLM
☆30 · May 22, 2024 · Updated 2 years ago
ruixin31 / Spurious_Rewards
☆361 · Jul 29, 2025 · Updated last year
microsoft / MM-WebAgent
Build coherent and visually polished multimodal webpages with hierarchical planning, AIGC tools, and iterative reflection.
☆15 · May 17, 2026 · Updated 2 months ago
MixLabPro / userbank
☆15 · Jun 7, 2025 · Updated last year
wassname / prob_jsonformer
Generate Structured JSON with probs from Language Models
☆17 · Mar 23, 2025 · Updated last year
GaryStack / Trustworthy-Evaluation
Repository of paper "Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis" (ACL 2025 Main)
☆19 · Jul 19, 2025 · Updated last year
samkaufman / morello
☆17 · Jul 7, 2026 · Updated 3 weeks ago
testtimescaling / testtimescaling.github.io
"what, how, where, and how well? a survey on test-time scaling in large language models" repository
☆110 · Jul 19, 2026 · Updated last week
keikeiqi / MGTTA
AAAI2025
☆13 · Apr 18, 2025 · Updated last year
OpenBMB / RLPR
Extrapolating RLVR to General Domains without Verifiers
☆205 · Aug 12, 2025 · Updated 11 months ago
wlzhang2020 / LLMTreeRec
The implement of LLMTreeRec
☆14 · Dec 9, 2024 · Updated last year
suu990901 / KlearReasoner
Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization
☆82 · Dec 25, 2025 · Updated 7 months ago
sparkle-reasoning / sparkle
[NeurIPS'25] Beyond Accuracy: Dissecting Mathematical Reasoning for LLMs Under Reinforcement Learning
☆16 · Dec 12, 2025 · Updated 7 months ago
Job-Bench / job-bench-eval
Official eval scripts for JobBench
☆32 · Jul 18, 2026 · Updated last week
aghaeifar / SpinWalk
SpinWalk, a framework for Monte-Carlo simulation to model spins random walk within a network. SpinWalk paper:
☆14 · Mar 11, 2026 · Updated 4 months ago
kailas-v / human-ai-interactions
☆11 · Oct 28, 2022 · Updated 3 years ago
SalesforceAIResearch / UserRL
The raw UserRL repo under construction
☆114 · Jun 2, 2026 · Updated last month
HazyResearch / scaling-verification
☆26 · Sep 4, 2025 · Updated 10 months ago
mustansarfiaz / PS-ARM
Abstract. Person search is a challenging problem with various real-world applications, that aims at joint person detection and re-identi…
☆13 · Feb 28, 2024 · Updated 2 years ago
Time-Search / TimeSearch-R
[ICLR 2026] Official code for paper: TimeSearch-R: Adaptive Temporal Search for Long-Form Video Understanding via Self-Verification Reinf…
☆27 · Jan 29, 2026 · Updated 6 months ago
plm-team / PLM
PLM: Efficient Peripheral Language Models Hardware-Co-Designed for Ubiquitous Computing
☆21 · Mar 18, 2025 · Updated last year