NVlabs / NFT
Implementation of Negative-aware Finetuning (NFT) algorithm for "Bridging Supervised Learning and Reinforcement Learning in Math Reasoning"
☆50 Updated 2 months ago
Alternatives and similar repositories for NFT
Users who are interested in NFT compare it to the libraries listed below.
- ☆62 Updated last month
- TraceRL & TraDo-8B: Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models ☆317 Updated last week
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization ☆47 Updated 4 months ago
- Official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning" ☆36 Updated 10 months ago
- ☆103 Updated 2 months ago
- ☆35 Updated 7 months ago
- ☆278 Updated last month
- Multimodal RewardBench ☆54 Updated 9 months ago
- P1: Mastering Physics Olympiads with Reinforcement Learning ☆36 Updated last week
- SIFT: Grounding LLM Reasoning in Contexts via Stickers ☆58 Updated 8 months ago
- MDPO: Overcoming the Training-Inference Divide of Masked Diffusion Language Models ☆36 Updated last month
- Official PyTorch implementation and models for paper "Diffusion Beats Autoregressive in Data-Constrained Settings". We find diffusion mod… ☆108 Updated 3 weeks ago
- G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning ☆88 Updated 6 months ago
- ☆53 Updated 5 months ago
- ☆112 Updated this week
- Official implementation of "Diffusion Language Models Know the Answer Before Decoding" ☆39 Updated 2 months ago
- Official Repository of LatentSeek ☆68 Updated 5 months ago
- A Collection of Papers on Diffusion Language Models ☆145 Updated 2 months ago
- Geometric-Mean Policy Optimization ☆92 Updated this week
- The official GitHub repo for "Diffusion Language Models are Super Data Learners". ☆200 Updated 2 weeks ago
- Code for "Reasoning to Learn from Latent Thoughts" ☆122 Updated 7 months ago
- ☆63 Updated 6 months ago
- Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching" (dLLM-Cache… ☆185 Updated this week
- [NeurIPS'25] The official code implementation for paper "R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Tok… ☆57 Updated 2 weeks ago
- ☆78 Updated 5 months ago
- ✈️ [ICCV 2025] Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints ☆76 Updated 4 months ago
- MiroTrain is an efficient and algorithm-first framework for post-training large agentic models. ☆93 Updated 2 months ago
- Uni-CoT: Towards Unified Chain-of-Thought Reasoning Across Text and Vision ☆170 Updated last week
- ☆106 Updated 2 months ago
- The official code of "VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning" [NeurIPS25] ☆166 Updated 5 months ago