NVlabs / Long-RL
Long-RL: Scaling RL to Long Sequences
☆568 · Updated this week
Alternatives and similar repositories for Long-RL
Users who are interested in Long-RL are comparing it to the libraries listed below.
- [ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation ☆373 · Updated 3 months ago
- Code for MetaMorph Multimodal Understanding and Generation via Instruction Tuning ☆199 · Updated 3 months ago
- Official implementation of UnifiedReward & UnifiedReward-Think ☆493 · Updated this week
- ☆188 · Updated this week
- [Survey] Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey ☆446 · Updated 6 months ago
- [CVPR 2025] 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation". ☆364 · Updated 2 weeks ago
- Official repository of "GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing" ☆272 · Updated 3 months ago
- Video-R1: Reinforcing Video Reasoning in MLLMs [🔥the first paper to explore R1 for video] ☆646 · Updated last week
- Pytorch implementation for the paper titled "SimpleAR: Pushing the Frontier of Autoregressive Visual Generation" ☆390 · Updated last month
- EVE Series: Encoder-Free Vision-Language Models from BAAI ☆342 · Updated last week
- [ICLR 2024 Spotlight] DreamLLM: Synergistic Multimodal Comprehension and Creation ☆452 · Updated 8 months ago
- ✈️ [ICCV 2025] Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints ☆72 · Updated 3 weeks ago
- Selftok: Discrete Visual Tokens of Autoregression, by Diffusion, and for Reasoning ☆198 · Updated 2 months ago
- The official repo of One RL to See Them All: Visual Triple Unified Reinforcement Learning ☆308 · Updated 2 months ago
- [ICLR 2025] Autoregressive Video Generation without Vector Quantization ☆553 · Updated 3 weeks ago
- Visual Planning: Let's Think Only with Images ☆262 · Updated 2 months ago
- A Unified Tokenizer for Visual Generation and Understanding ☆371 · Updated this week
- A Collection of Papers on Diffusion Language Models ☆97 · Updated last month
- The official repository for our paper, "Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning". ☆125 · Updated 3 weeks ago
- Pixel-Level Reasoning Model trained with RL ☆180 · Updated last month
- 📖 This is a repository for organizing papers, codes, and other resources related to unified multimodal models. ☆268 · Updated this week
- An official implementation of DanceGRPO: Unleashing GRPO on Visual Generation ☆537 · Updated this week
- Empowering Unified MLLM with Multi-granular Visual Generation ☆127 · Updated 6 months ago
- Long Context Transfer from Language to Vision ☆388 · Updated 4 months ago
- Official implementation of the Law of Vision Representation in MLLMs ☆163 · Updated 8 months ago
- Official repo and evaluation implementation of VSI-Bench ☆555 · Updated last month
- [ECCV 2024 Oral] Code for paper: An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Langua… ☆468 · Updated 7 months ago
- Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding" ☆320 · Updated this week
- Survey: https://arxiv.org/pdf/2507.20198 ☆46 · Updated last week
- Explore the Multimodal “Aha Moment” on 2B Model ☆605 · Updated 4 months ago