sail-sg/variational-reasoning

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/sail-sg/variational-reasoning)

sail-sg / variational-reasoning

Code for "Variational Reasoning for Language Models"

☆60

Alternatives and similar repositories for variational-reasoning

Users that are interested in variational-reasoning are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

sail-sg / feedback-conditional-policy
View on GitHub
Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"
☆65Jan 5, 2026Updated 6 months ago
sail-sg / ActivePRM
View on GitHub
☆21Apr 16, 2025Updated last year
sail-sg / AnytimeReasoner
View on GitHub
Optimizing Anytime Reasoning via Budget Relative Policy Optimization
☆54Jul 15, 2025Updated last year
haonan3 / V1
View on GitHub
V1: Toward Multimodal Reasoning by Designing Auxiliary Task
☆36Apr 14, 2025Updated last year
sail-sg / P-DoS
View on GitHub
[ArXiv 2025] Denial-of-Service Poisoning Attacks on Large Language Models
☆23Oct 22, 2024Updated last year
End-to-end encrypted cloud storage - Proton Drive • Ad
Special offer: 40% Off Yearly / 80% Off First Month. Protect your most important files, photos, and documents from prying eyes.
sail-sg / VeriFree
View on GitHub
Reinforcing General Reasoning without Verifiers
☆102Jun 24, 2025Updated last year
real-absolute-AI / Unnatural_Language
View on GitHub
The official repository of 'Unnatural Language Are Not Bugs but Features for LLMs'
☆24May 20, 2025Updated last year
JinjieNi / Quokka
View on GitHub
The official github repo for "Training Optimal Large Diffusion Language Models", the first-ever large-scale diffusion language models sca…
☆46Nov 6, 2025Updated 8 months ago
sail-sg / Video-Next-Event-Prediction
View on GitHub
☆28Aug 9, 2025Updated 11 months ago
sail-sg / LightTrans
View on GitHub
The official implementation of "LightTransfer: Your Long-Context LLM is Secretly a Hybrid Model with Effortless Adaptation"
☆22Apr 22, 2025Updated last year
sail-sg / SkyLadder
View on GitHub
The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling
☆43Dec 29, 2025Updated 6 months ago
YisongMiao / DiSQ-Score
View on GitHub
The Dataset and Official Implementation for <Discursive Socratic Questioning: Evaluating the Faithfulness of Language Models’ Understandi…
☆18Aug 7, 2024Updated last year
sail-sg / imperceptible-jailbreaks
View on GitHub
[ArXiv 2025] Imperceptible Jailbreaking against Large Language Models
☆25Oct 7, 2025Updated 9 months ago
sail-sg / D-TRAK
View on GitHub
Intriguing Properties of Data Attribution on Diffusion Models (ICLR 2024)
☆39Jan 23, 2024Updated 2 years ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
complex-reasoning / RPG
View on GitHub
[ICLR 2026] RPG: KL-Regularized Policy Gradient (https://arxiv.org/abs/2505.17508)
☆76Jun 29, 2026Updated 3 weeks ago
sail-sg / Cheating-LLM-Benchmarks
View on GitHub
[ICLR 2025] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates (Oral)
☆86Oct 23, 2024Updated last year
sail-sg / Precision-RL
View on GitHub
Defeating the Training-Inference Mismatch via FP16
☆197Nov 14, 2025Updated 8 months ago
sail-sg / I-FSJ
View on GitHub
Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses (NeurIPS 2024)
☆65Jan 11, 2025Updated last year
real-absolute-AI / NoisyRollout
View on GitHub
[NeurIPS 2025] NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation
☆112Sep 18, 2025Updated 10 months ago
sail-sg / Agent-Smith
View on GitHub
[ICML 2024] Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast
☆123Mar 26, 2024Updated 2 years ago
sail-sg / dice
View on GitHub
Official implementation of Bootstrapping Language Models via DPO Implicit Rewards
☆47Apr 15, 2025Updated last year
real-absolute-AI / SynthRL
View on GitHub
SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis
☆70Jul 24, 2025Updated last year
sail-sg / tty-use
View on GitHub
☆15Oct 13, 2025Updated 9 months ago
Deploy to Railway using AI coding agents - Free Credits Offer • Ad
Use Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
sail-sg / Stable-RL
View on GitHub
Rethinking the Trust Region in LLM Reinforcement Learning
☆62Mar 2, 2026Updated 4 months ago
sail-sg / DiffMemorize
View on GitHub
[TMLR 2025] On Memorization in Diffusion Models
☆33Oct 5, 2023Updated 2 years ago
shengliu66 / FractionalReason
View on GitHub
Official github repo for "Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute"
☆17Jun 30, 2025Updated last year
vincentamato / mlx-esm-2
View on GitHub
An MLX implementation of Meta AI's ESM-2 protein language model
☆16Aug 16, 2025Updated 11 months ago
Simplified-Reasoning / LUFFY
View on GitHub
Official Repository of "Learning to Reason under Off-Policy Guidance"
☆460Mar 20, 2026Updated 4 months ago
BaohaoLiao / frac-cot
View on GitHub
[COLM 2026] An efficient 3D sampling method for long-CoT LLM.
☆16May 25, 2025Updated last year
MasterVito / SvS
View on GitHub
Official Repo for SvS: A Self-play with Variational Problem Synthesis strategy for RLVR training
☆54Dec 13, 2025Updated 7 months ago
mukhal / ThinkPRM
View on GitHub
[TMLR] Process Reward Models That Think
☆89Nov 29, 2025Updated 7 months ago
axon-rl / gem
View on GitHub
A Gym for Agentic LLMs
☆502Jan 21, 2026Updated 6 months ago
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
sail-sg / sailor2
View on GitHub
🔱 Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs
☆73Mar 21, 2025Updated last year
TsinghuaC3I / Unify-Post-Training
View on GitHub
Towards a Unified View of Large Language Model Post-Training
☆211Sep 8, 2025Updated 10 months ago
ritzz-ai / PACS
View on GitHub
☆31Sep 12, 2025Updated 10 months ago
anpaure / cp_eval
View on GitHub
Tiny evaluation of leading LLMs on competitive programming problems
☆14Apr 10, 2026Updated 3 months ago
duchao0726 / Conditional-SWF
View on GitHub
☆13Jul 25, 2023Updated 3 years ago
sparkle-reasoning / sparkle
View on GitHub
[NeurIPS'25] Beyond Accuracy: Dissecting Mathematical Reasoning for LLMs Under Reinforcement Learning
☆16Dec 12, 2025Updated 7 months ago
terminal-agent / reptile
View on GitHub
💻 Terminal-Agent with Human-in-the-Loop Learning
☆41Jan 16, 2026Updated 6 months ago