Minimalist RL for Diffusion LLMs with SOTA reasoning performance (89.1% GSM8K). Official implementation of "The Flexibility Trap".
☆130 · Jan 24, 2026 · Updated 2 months ago
Alternatives and similar repositories for JustGRPO
Users that are interested in JustGRPO are comparing it to the libraries listed below.
- ☆14 · Dec 19, 2024 · Updated last year
- ☆19 · Mar 5, 2025 · Updated last year
- ☆54 · Jan 2, 2025 · Updated last year
- IMG: Calibrating Diffusion Models via Implicit Multimodal Guidance, ICCV 2025 ☆30 · Oct 1, 2025 · Updated 5 months ago
- [ECCV 2024] AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation ☆35 · Sep 12, 2024 · Updated last year
- Official Implementation of wd1 ☆24 · Sep 25, 2025 · Updated 6 months ago
- Repository of "Train Once, Get a Family: State-Adaptive Balances for Offline-to-Online Reinforcement Learning" (NeurIPS 2023 Spotlight) ☆41 · Oct 30, 2023 · Updated 2 years ago
- Stable-DiffCoder is a family of lightweight open-source code DLLMs (diffusion large language models) comprising base and instruct models, … ☆80 · Mar 9, 2026 · Updated 2 weeks ago
- Code release for Deep Incubation (https://arxiv.org/abs/2212.04129) ☆92 · Mar 16, 2023 · Updated 3 years ago
- Official implementation of A Mixture of Surprises for Unsupervised Reinforcement Learning ☆23 · Nov 16, 2022 · Updated 3 years ago
- Official implementation of Dynamic Perceiver ☆43 · Nov 16, 2023 · Updated 2 years ago
- Official implementation of BandPO: Bridging Trust Regions and Ratio Clipping via Probability-Aware Bounds for LLM Reinforcement Learning.… ☆44 · Mar 9, 2026 · Updated 2 weeks ago
- [IEEE TPAMI] Latency-aware Unified Dynamic Networks for Efficient Image Recognition ☆53 · Mar 20, 2025 · Updated last year
- MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence ☆56 · Mar 11, 2026 · Updated 2 weeks ago
- CODA: Repurposing Continuous VAEs for Discrete Tokenization ☆35 · Jul 4, 2025 · Updated 8 months ago
- Official code for the paper "HEXA-MoE: Efficient and Heterogeneous-Aware MoE Acceleration with Zero Computation Redundancy" ☆15 · Mar 6, 2025 · Updated last year
- Jittor implementation of Vision Transformer with Deformable Attention ☆32 · Mar 1, 2022 · Updated 4 years ago
- The open-source code for the NeurIPS 2025 paper, "Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learn… ☆49 · Jan 5, 2026 · Updated 2 months ago
- Repository of GridMix (ICLR 2025) ☆35 · Mar 18, 2025 · Updated last year
- [ICML 2024] SimPro: A Simple Probabilistic Framework Towards Realistic Long-Tailed Semi-Supervised Learning ☆32 · Sep 30, 2024 · Updated last year
- Explore Inter-layer Expert Affinity in MoE Model Inference ☆16 · May 6, 2024 · Updated last year
- [AAAI 2026 Oral] SpatialActor: Exploring Disentangled Spatial Representations for Robust Robotic Manipulation ☆61 · Jan 14, 2026 · Updated 2 months ago
- [TPAMI 2024] Probabilistic Contrastive Learning for Long-Tailed Visual Recognition ☆93 · Sep 30, 2024 · Updated last year
- A PyTorch implementation of the paper "Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis" ☆47 · Jun 13, 2024 · Updated last year
- Source code for paper "Empirical Analysis of Decoding Biases in Masked Diffusion Models" ☆39 · Jan 11, 2026 · Updated 2 months ago
- Official repository of Uni-AdaFocus (TPAMI 2024). ☆61 · Dec 17, 2024 · Updated last year
- Code for the paper Offline Prioritized Experience Replay ☆12 · Jun 13, 2023 · Updated 2 years ago
- Implementation of PQMass from Lemos et al. 2024 ☆20 · Apr 23, 2025 · Updated 11 months ago
- Summary notes for the Harbin Institute of Technology Software Construction course ☆19 · Jul 16, 2018 · Updated 7 years ago
- The code repository of UniRL ☆51 · May 30, 2025 · Updated 9 months ago
- [ICLR 2026] Official Implementation of ProxyThinker: Test-Time Guidance through Small Visual Reasoners. ☆20 · Sep 24, 2025 · Updated 6 months ago
- ☆65 · Mar 7, 2026 · Updated 2 weeks ago
- Understand what physics/algorithms transformers learn internally when trained on planetary motion ☆39 · Feb 9, 2026 · Updated last month
- [NeurIPS 2024] Official repository of InLine attention ☆59 · Dec 22, 2024 · Updated last year
- [NeurIPS 2025] Beyond Masked and Unmasked: Discrete Diffusion Models via Partial Masking ☆25 · Mar 18, 2026 · Updated last week
- ☆14 · Sep 11, 2025 · Updated 6 months ago
- ☆213 · Nov 26, 2025 · Updated 4 months ago
- MMaDA - Open-Sourced Multimodal Large Diffusion Language Models (dLLMs with block diffusion, mixed-CoT, unified RL) ☆1,615 · Feb 14, 2026 · Updated last month
- Sequential Diffusion Language Model (SDLM) enhances pre-trained autoregressive language models by adaptively determining generation lengt… ☆93 · Dec 27, 2025 · Updated 3 months ago