MasterVito/SvS

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/MasterVito/SvS)

MasterVito / SvS

Official Repo for SvS: A Self-play with Variational Problem Synthesis strategy for RLVR training

☆54

Alternatives and similar repositories for SvS

Users that are interested in SvS are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

MasterVito / DAC-RL
View on GitHub
Official Repo for DAC-RL: Training LLMs for Divide-and-Conquer Reasoning Elevates Test-Time Scalability
☆16Feb 26, 2026Updated 4 months ago
yangzhch6 / DARS
View on GitHub
The official implemention of "Depth-Breadth Synergy in RLVR: Unlocking LLM Reasoning Gains with Adaptive Exploration" (ICML 2026)
☆24Feb 4, 2026Updated 5 months ago
MasterVito / SwS
View on GitHub
Official Repo for SwS: A Weakness-driven Problem Synthesis Framework in RL for LLM Reasoning
☆42Nov 11, 2025Updated 8 months ago
NJUNLP / AdaR
View on GitHub
☆15Dec 8, 2025Updated 7 months ago
UCSB-NLP-Chang / ThinkPrune
View on GitHub
☆46Sep 27, 2025Updated 9 months ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
zzli2022 / TLDR
View on GitHub
Code for Research Project TLDR
☆26Jul 28, 2025Updated 11 months ago
OpenIXCLab / CODA
View on GitHub
CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning
☆37Aug 28, 2025Updated 10 months ago
JingMog / THOR
View on GitHub
[ICLR-2026] Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning".
☆33Feb 26, 2026Updated 4 months ago
LARK-AI-Lab / CodeScaler
View on GitHub
The official repo for "CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models"
☆35Mar 26, 2026Updated 3 months ago
Simplified-Reasoning / LUFFY
View on GitHub
Official Repository of "Learning to Reason under Off-Policy Guidance"
☆459Mar 20, 2026Updated 4 months ago
sail-sg / feedback-conditional-policy
View on GitHub
Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"
☆65Jan 5, 2026Updated 6 months ago
wizard-III / ArcherCodeR
View on GitHub
ArcherCodeR is an open-source initiative enhancing code reasoning in large language models through scalable, rule-governed reinforcement …
☆44Aug 6, 2025Updated 11 months ago
Zcchill / Value-Residual-Learning
View on GitHub
☆15Mar 20, 2025Updated last year
yayayacc / MUR
View on GitHub
☆49May 14, 2026Updated 2 months ago
Open source password manager - Proton Pass • Ad
Securely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
chen-judge / UniGeo
View on GitHub
[EMNLP 22] UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression
☆34Dec 7, 2022Updated 3 years ago
uw-nsl / TinyV
View on GitHub
Your efficient and accurate answer verification system for RL training.
☆42Jun 23, 2025Updated last year
rookie-joe / AutoPSV
View on GitHub
☆50Oct 28, 2024Updated last year
ibisbill / Transferability-of-LLM-Reasoning
View on GitHub
☆111Jul 6, 2026Updated 2 weeks ago
kokolerk / TON
View on GitHub
[NeurIPS 2025] Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models
☆58Sep 29, 2025Updated 9 months ago
LightChen233 / reasoning-boundary
View on GitHub
☆71Jun 18, 2025Updated last year
bigai-nlco / LatentSeek
View on GitHub
Official Repository of LatentSeek
☆85Jun 6, 2025Updated last year
THUDM / TreeRL
View on GitHub
TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25
☆97Jun 16, 2025Updated last year
1229095296 / ResRL
View on GitHub
This repository includes code for our paper: ResRL: Boosting LLM Reasoning via Negative Sample Projection Residual Reinforcement Learning…
☆15May 2, 2026Updated 2 months ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
THU-KEG / PairJudgeRM
View on GitHub
☆15Apr 14, 2025Updated last year
yjywdzh / ACE
View on GitHub
This repository refers to the codes of paper ACE: Attribution-Controlled Knowledge Editing for Multi-hop Factual Recall
☆15Jan 31, 2026Updated 5 months ago
AI4fun / DQ-LoRe
View on GitHub
☆13Jun 26, 2024Updated 2 years ago
sail-sg / variational-reasoning
View on GitHub
Code for "Variational Reasoning for Language Models"
☆60Sep 29, 2025Updated 9 months ago
intervention-training / int
View on GitHub
☆16Feb 4, 2026Updated 5 months ago
CSSLab / ThinkTwice
View on GitHub
Jointly Optimizing Large Language Models for Reasoning and Self-Refinement
☆15Apr 22, 2026Updated 3 months ago
YujunZhou / EVOL-RL
View on GitHub
Code for Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation (EVOL-RL).
☆51Mar 31, 2026Updated 3 months ago
sparkle-reasoning / sparkle
View on GitHub
[NeurIPS'25] Beyond Accuracy: Dissecting Mathematical Reasoning for LLMs Under Reinforcement Learning
☆16Dec 12, 2025Updated 7 months ago
RUCAIBox / Passk_Training
View on GitHub
The official repository of paper "Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models''
☆113Aug 15, 2025Updated 11 months ago
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
LINs-lab / ELICIT
View on GitHub
[ICLR 2025] ELICIT: LLM Augmentation Via External In-context Capability
☆14Mar 11, 2025Updated last year
GAIR-NLP / OctoThinker
View on GitHub
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
☆189Jul 23, 2025Updated last year
facebookresearch / reasoning-memory
View on GitHub
Procedural Knowledge at Scale Improves ReasoningThis repository contains the minimal, end-to-end pipeline for reproducing the paper resul…
☆15Apr 1, 2026Updated 3 months ago
xufangzhi / ENVISIONS
View on GitHub
[ACL 2025] A Neural-Symbolic Self-Training Framework
☆117Jun 1, 2025Updated last year
QwenQKing / Prompt-R1
View on GitHub
Prompt-R1: Collaborative Automatic Prompting Framework via End-to-end Reinforcement Learning
☆60Feb 24, 2026Updated 4 months ago
wizard-III / Archer2.0
View on GitHub
Archer2.0 evolves from its predecessor by introducing ASPO, which overcomes fundamental PPO-Clip limitations to prevent premature converg…
☆31Oct 10, 2025Updated 9 months ago
DoubtedSteam / DyVTE
View on GitHub
The official implement of "Accelerating Multimodal Large Language Models via Dynamic Visual-Token Exit and the Empirical Findings"
☆18Dec 5, 2024Updated last year