Minimalist RL for Diffusion LLMs with SOTA reasoning performance (89.1% GSM8K). Official implementation of "The Flexibility Trap".
☆126 · Jan 24, 2026 · Updated last month
Alternatives and similar repositories for JustGRPO
Users who are interested in JustGRPO are comparing it to the libraries listed below.
- ☆14 · Dec 19, 2024 · Updated last year
- ☆19 · Mar 5, 2025 · Updated 11 months ago
- [ECCV 2024] AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation ☆35 · Sep 12, 2024 · Updated last year
- MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence ☆55 · Feb 10, 2026 · Updated 3 weeks ago
- [CVPR 2026] Official repository of Vision Test-Time Training ☆54 · Feb 21, 2026 · Updated last week
- Image Tokenizer Needs Post-Training ☆24 · Oct 4, 2025 · Updated 4 months ago
- ☆53 · Jan 2, 2025 · Updated last year
- DeepTrace: A lightweight, scalable real-time diagnostic and analysis tool for distributed training tasks. ☆18 · Nov 4, 2025 · Updated 3 months ago
- IMG: Calibrating Diffusion Models via Implicit Multimodal Guidance, ICCV 2025 ☆30 · Oct 1, 2025 · Updated 5 months ago
- A Recipe for Building LLM Reasoners to Solve Complex Instructions ☆29 · Oct 9, 2025 · Updated 4 months ago
- Official code of paper "DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution" ☆125 · Feb 14, 2025 · Updated last year
- Repository of "Train Once, Get a Family: State-Adaptive Balances for Offline-to-Online Reinforcement Learning" (NeurIPS 2023 Spotlight) ☆40 · Oct 30, 2023 · Updated 2 years ago
- [NAACL 2025] Representing Rule-based Chatbots with Transformers ☆23 · Feb 9, 2025 · Updated last year
- Official implementation of Dynamic Perceiver ☆43 · Nov 16, 2023 · Updated 2 years ago
- Code release for Deep Incubation (https://arxiv.org/abs/2212.04129) ☆92 · Mar 16, 2023 · Updated 2 years ago
- [IEEE TPAMI] Latency-aware Unified Dynamic Networks for Efficient Image Recognition ☆53 · Mar 20, 2025 · Updated 11 months ago
- ☆147 · Jan 20, 2026 · Updated last month
- CODA: Repurposing Continuous VAEs for Discrete Tokenization ☆35 · Jul 4, 2025 · Updated 7 months ago
- [AAAI 2026 Oral] The official code of "UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning" ☆64 · Dec 8, 2025 · Updated 2 months ago
- Sequential Diffusion Language Model (SDLM) enhances pre-trained autoregressive language models by adaptively determining generation lengt… ☆90 · Dec 27, 2025 · Updated 2 months ago
- Jittor implementation of Vision Transformer with Deformable Attention ☆32 · Mar 1, 2022 · Updated 4 years ago
- Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types ☆32 · Jul 16, 2025 · Updated 7 months ago
- [ICLR 2026] dParallel: Learnable Parallel Decoding for dLLMs ☆59 · Feb 22, 2026 · Updated last week
- Mitigating Shortcuts in Visual Reasoning with Reinforcement Learning ☆45 · Jul 2, 2025 · Updated 8 months ago
- 📚 A curated list of Awesome Efficient dLLMs Papers with Codes ☆110 · Updated this week
- ☆179 · Jun 27, 2025 · Updated 8 months ago
- [ICML 2024] SimPro: A Simple Probabilistic Framework Towards Realistic Long-Tailed Semi-Supervised Learning ☆32 · Sep 30, 2024 · Updated last year
- Repository of GridMix (ICLR 2025) ☆35 · Mar 18, 2025 · Updated 11 months ago
- [Arxiv 2024] Official code for MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions ☆33 · Feb 6, 2025 · Updated last year
- [ICLR 2025] A Comprehensive Framework for Developing and Evaluating Multimodal Role-Playing Agents ☆90 · Feb 2, 2026 · Updated last month
- Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding" ☆852 · Jan 28, 2026 · Updated last month
- MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment ☆35 · Jul 1, 2024 · Updated last year
- WorldSense: Evaluating Real-world Omnimodal Understanding for Multimodal LLMs ☆41 · Jan 26, 2026 · Updated last month
- Official implementation of FRAPPE: Infusing World Modeling into Generalist Policies via Multiple Future Representation Alignment ☆28 · Updated this week
- ☆14 · Feb 13, 2026 · Updated 2 weeks ago
- DiffThinker: Towards Generative Multimodal Reasoning with Diffusion Models ☆170 · Jan 4, 2026 · Updated last month
- The code repository of UniRL ☆51 · May 30, 2025 · Updated 9 months ago
- Defeating the Training-Inference Mismatch via FP16 ☆182 · Nov 14, 2025 · Updated 3 months ago
- Code for paper "Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning" ☆84 · Jan 24, 2024 · Updated 2 years ago