MasterVito/SwS

Official Repo for SwS: A Weakness-driven Problem Synthesis Framework in RL for LLM Reasoning

☆42

Alternatives and similar repositories for SwS

Users who are interested in SwS are comparing it to the repositories listed below.

MasterVito / SvS
Official Repo for SvS: A Self-play with Variational Problem Synthesis strategy for RLVR training
☆54 · Dec 13, 2025 · Updated 7 months ago
zhaoxlpku / PromptCoT
☆17 · Apr 10, 2025 · Updated last year
BaohaoLiao / frac-cot
[COLM 2026] An efficient 3D sampling method for long-CoT LLM.
☆16 · May 25, 2025 · Updated last year
zhangxy-2019 / critique-GRPO
[ICML 2026 Spotlight] Critique-GRPO: Advancing LLM Reasoning with Natural Language and Numerical Feedback
☆70 · Jun 3, 2026 · Updated last month
yangzhch6 / DARS
The official implementation of "Depth-Breadth Synergy in RLVR: Unlocking LLM Reasoning Gains with Adaptive Exploration" (ICML 2026)
☆24 · Feb 4, 2026 · Updated 5 months ago
psunlpgroup / FoVer
This repository includes code and materials for the paper "Efficient PRM Training Data Synthesis via Formal Verification" (ACL 2026 Findi…
☆18 · Apr 7, 2026 · Updated 3 months ago
LARK-AI-Lab / CodeScaler
The official repo for "CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models"
☆35 · Mar 26, 2026 · Updated 3 months ago
Zanette-Labs / speed-rl
☆18 · Feb 2, 2026 · Updated 5 months ago
zzli2022 / TLDR
Code for Research Project TLDR
☆26 · Jul 28, 2025 · Updated 11 months ago
HKUNLP / critic-rl
[ICML 2025] Teaching Language Models to Critique via Reinforcement Learning
☆127 · May 6, 2025 · Updated last year
THUDM / TreeRL
TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25
☆97 · Jun 16, 2025 · Updated last year
yongchao98 / R1-Code-Interpreter
R1-Code-Interpreter: Training LLMs to Reason with Code via Supervised and Reinforcement Learning
☆44 · Feb 9, 2026 · Updated 5 months ago
YujunZhou / EVOL-RL
Code for Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation (EVOL-RL).
☆51 · Mar 31, 2026 · Updated 3 months ago
Zcchill / Value-Residual-Learning
☆15 · Mar 20, 2025 · Updated last year
yayayacc / MUR
☆49 · May 14, 2026 · Updated 2 months ago
StarDewXXX / AdaR1
The official repository of NeurIPS'25 paper "Ada-R1: From Long-CoT to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization"
☆24 · May 6, 2026 · Updated 2 months ago
LARK-AI-Lab / EnvFactory
The official paper for EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL.
☆85 · Jun 5, 2026 · Updated last month
liushulinle / UloRL
An Ultra-Long Output Reinforcement Learning Approach
☆23 · Jul 31, 2025 · Updated 11 months ago
sunblaze-ucb / omega
☆47 · Jun 24, 2025 · Updated last year
MasterVito / DAC-RL
Official Repo for DAC-RL: Training LLMs for Divide-and-Conquer Reasoning Elevates Test-Time Scalability
☆16 · Feb 26, 2026 · Updated 4 months ago
Jiahao004 / DeepTheorem
☆26 · Jun 10, 2025 · Updated last year
multimodal-art-projection / CodeCriticBench
☆16 · Nov 1, 2025 · Updated 8 months ago
weiyifan1023 / AutoTIR
Code and Data for Paper "AutoTIR: Autonomous Tools Integrated Reasoning via Reinforcement Learning"
☆54 · Sep 4, 2025 · Updated 10 months ago
xufangzhi / Genius
[ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework
☆72 · Jun 1, 2025 · Updated last year
tianyi-lab / MiP-Overthinking
[COLM'25] Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?
☆39 · Jun 5, 2025 · Updated last year
TIGER-AI-Lab / Hierarchical-Reasoner
Emergent Hierarchical Reasoning in LLMs/VLMs through Reinforcement Learning [ICLR26]
☆64 · Apr 11, 2026 · Updated 3 months ago
WujiangXu / EPO
The code for paper "EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning"
☆40 · Jul 13, 2026 · Updated last week
TIGER-AI-Lab / General-Reasoner
General Reasoner: Advancing LLM Reasoning Across All Domains [NeurIPS25]
☆228 · Nov 27, 2025 · Updated 7 months ago
cometeme / funcoder
Implementation for NeurIPS 2024 oral paper: Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation
☆16 · Jan 27, 2025 · Updated last year
JingMog / THOR
[ICLR-2026] Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning".
☆33 · Feb 26, 2026 · Updated 4 months ago
GAIR-NLP / ToRL
☆352 · May 24, 2025 · Updated last year
TianHongZXY / RLVR-Decomposed
[NeurIPS 2025] Implementation for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning"
☆165 · Mar 2, 2026 · Updated 4 months ago
satori-reasoning / Satori
[ICML 2025] Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search
☆114 · Jun 3, 2025 · Updated last year
TIGER-AI-Lab / AceCoder
The official repo for "AceCoder: Acing Coder RL via Automated Test-Case Synthesis" [ACL25]
☆100 · Apr 9, 2025 · Updated last year
inclusionAI / PromptCoT
A unified suite for generating elite reasoning problems and training high-performance LLMs, including pioneering attention-free architect…
☆132 · Jan 31, 2026 · Updated 5 months ago
uw-nsl / TinyV
Your efficient and accurate answer verification system for RL training.
☆42 · Jun 23, 2025 · Updated last year
ltzheng / SimpleTIR
[ICLR 2026] End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning
☆401 · Mar 30, 2026 · Updated 3 months ago
LINs-lab / ELICIT
[ICLR 2025] ELICIT: LLM Augmentation Via External In-context Capability
☆14 · Mar 11, 2025 · Updated last year
TheRoadQaQ / ReLIFT
Official Repository of "Learning what reinforcement learning can't"
☆84 · Dec 30, 2025 · Updated 6 months ago