Official Repo for SvS: A Self-play with Variational Problem Synthesis strategy for RLVR training
☆54Dec 13, 2025Updated 3 months ago
Alternatives and similar repositories for SvS
Users that are interested in SvS are comparing it to the libraries listed below
Sorting:
- [NeurIPS'25] The official code of "PeRL: Permutation-Enhanced Reinforcement Learning for Interleaved Vision-Language Reasoning"☆30Jan 12, 2026Updated 2 months ago
- Official Repository of LatentSeek☆78Jun 6, 2025Updated 9 months ago
- Your efficient and accurate answer verification system for RL training.☆41Jun 23, 2025Updated 9 months ago
- Offcial Repo of Paper "Eliminating Position Bias of Language Models: A Mechanistic Approach""☆21Jun 13, 2025Updated 9 months ago
- Interpretable Contrastive Monte Carlo Tree Search Reasoning☆51Nov 9, 2024Updated last year
- ☆70Jun 18, 2025Updated 9 months ago
- LoPA: Scaling dLLM Inference via Lookahead Parallel Decoding☆36Jan 16, 2026Updated 2 months ago
- ☆35Jan 25, 2026Updated last month
- [ICLR 26] Visual Multi-Agent System: Mitigating Hallucination Snowballing via Visual Flow☆36Oct 3, 2025Updated 5 months ago
- Source code for SWIFT, an efficient reward model.☆19Jan 13, 2026Updated 2 months ago
- CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning☆35Aug 28, 2025Updated 6 months ago
- [ICLR 2026] Learning to Parallel: Accelerating Diffusion Large Language Models via Learnable Parallel Decoding☆31Jan 27, 2026Updated last month
- Official codebase for the paper "Reasoning Within the Mind: Dynamic Multimodal Interleaving in Latent Space"☆69Dec 17, 2025Updated 3 months ago
- ViLoMem: Agentic Learner with Grow-and-Refine Multimodal Semantic Memory☆57Nov 27, 2025Updated 3 months ago
- The official implement of "Accelerating Multimodal Large Language Models via Dynamic Visual-Token Exit and the Empirical Findings"☆18Dec 5, 2024Updated last year
- Official Code For EMNLP2025 Findings: {DLPO : Towards a Robust, Efficient, and Generalizable Prompt Optimization Framework from a Deep-Le…☆10Dec 25, 2025Updated 2 months ago
- [ACM MM25] LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models☆23Mar 29, 2025Updated 11 months ago
- ABC: Achieving Better Control of Multimodal Embeddings using VLMs [TMLR2025]☆21Aug 21, 2025Updated 7 months ago
- Bayes-Adaptive RL for LLM Reasoning