NVlabs / NFT
Implementation of the Negative-aware Finetuning (NFT) algorithm for "Bridging Supervised Learning and Reinforcement Learning in Math Reasoning"
☆44 Updated last month
Alternatives and similar repositories for NFT
Users who are interested in NFT are comparing it to the libraries listed below
- TraceRL & TraDo-8B: Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models ☆289 Updated 2 weeks ago
- ☆61 Updated last week
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization ☆47 Updated 3 months ago
- ☆100 Updated last month
- ☆103 Updated this week
- ☆51 Updated 4 months ago
- SIFT: Grounding LLM Reasoning in Contexts via Stickers ☆58 Updated 7 months ago
- G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning ☆86 Updated 5 months ago
- ☆262 Updated 2 weeks ago
- Code for "Reasoning to Learn from Latent Thoughts" ☆121 Updated 7 months ago
- official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning" ☆36 Updated 9 months ago
- Remasking Discrete Diffusion Models with Inference-Time Scaling ☆51 Updated 7 months ago
- A Collection of Papers on Diffusion Language Models ☆137 Updated last month
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight) ☆132 Updated 3 months ago
- Implementation of "Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models" ☆59 Updated 3 months ago
- Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching" (dLLM-Cache… ☆172 Updated last month
- ☆104 Updated last month
- paper list, tutorial, and nano code snippet for Diffusion Large Language Models. ☆123 Updated 4 months ago
- Multimodal RewardBench ☆54 Updated 8 months ago
- [NeurIPS'25] dKV-Cache: The Cache for Diffusion Language Models ☆112 Updated 5 months ago
- ☆35 Updated 6 months ago
- Official Repository of LatentSeek ☆65 Updated 4 months ago
- The official github repo for "Training Optimal Large Diffusion Language Models", the first-ever large-scale diffusion language models sca… ☆33 Updated this week
- ☆61 Updated 3 months ago
- Geometric-Mean Policy Optimization ☆88 Updated 2 weeks ago
- V1: Toward Multimodal Reasoning by Designing Auxiliary Task ☆36 Updated 6 months ago
- The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25] ☆162 Updated 4 months ago
- ☆45 Updated last month
- ✈️ [ICCV 2025] Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints ☆75 Updated 3 months ago
- [NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models" ☆183 Updated 7 months ago