☆147 · Jan 20, 2026 · Updated last month
Alternatives and similar repositories for DiRL
Users who are interested in DiRL are comparing it to the repositories listed below
- ☆24 · May 23, 2025 · Updated 9 months ago
- [ASPLOS'26] Taming the Long-Tail: Efficient Reasoning RL Training with Adaptive Drafter ☆138 · Dec 5, 2025 · Updated 2 months ago
- Symphony: A decentralized multi-agent framework that enables intelligent agents to collaborate seamlessly across heterogeneous edge devi… ☆30 · Oct 30, 2025 · Updated 4 months ago
- MetaAgent: Toward Self-Evolving Agent via Tool Meta-Learning ☆42 · Sep 3, 2025 · Updated 5 months ago
- MOSS-Audio-Tokenizer is a Causal Transformer-based audio tokenizer built on the CAT architecture. Trained on 3M hours of diverse audio, i… ☆126 · Feb 13, 2026 · Updated 2 weeks ago
- Modality Gap–Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models ☆51 · Updated this week
- [ICML 2025] LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models ☆17 · Nov 4, 2025 · Updated 3 months ago
- [ICLR 26] The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions". ☆15 · Feb 9, 2026 · Updated 2 weeks ago
- [NeurIPS 2025] ScaleKV: Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression ☆50 · Nov 4, 2025 · Updated 3 months ago
- Mixture-of-Basis-Experts for Compressing MoE-based LLMs ☆29 · Dec 24, 2025 · Updated 2 months ago
- Aligning Agentic World Models via Knowledgeable Experience Learning ☆31 · Jan 25, 2026 · Updated last month
- [ICLR 2026] Official repo for "Spotlight on Token Perception for Multimodal Reinforcement Learning" ☆49 · Jan 30, 2026 · Updated last month
- Codes for our paper "AgentMonitor: A Plug-and-Play Framework for Predictive and Secure Multi-Agent Systems" ☆13 · Dec 13, 2024 · Updated last year
- The official github repo for "Training Optimal Large Diffusion Language Models", the first-ever large-scale diffusion language models sca… ☆45 · Nov 6, 2025 · Updated 3 months ago
- Sequential Diffusion Language Model (SDLM) enhances pre-trained autoregressive language models by adaptively determining generation lengt… ☆90 · Dec 27, 2025 · Updated 2 months ago
- Dimple, the first Discrete Diffusion Multimodal Large Language Model ☆115 · Jul 9, 2025 · Updated 7 months ago
- CoV: Chain-of-View Prompting for Spatial Reasoning ☆51 · Jan 23, 2026 · Updated last month
- ☆44 · Feb 12, 2026 · Updated 2 weeks ago
- An End-to-End Model with Adaptive Filtering for Retrieval-Augmented Generation ☆16 · Oct 27, 2024 · Updated last year
- Agent-RRM: Exploring Reasoning Reward Model for Agents ☆44 · Feb 4, 2026 · Updated 3 weeks ago
- Fast, memory-efficient attention column reduction (e.g., sum, mean, max) ☆37 · Feb 10, 2026 · Updated 2 weeks ago
- Official Codebase For paper "One-step Language Modeling via Continuous Denoising" ☆48 · Updated this week
- Open diffusion language model for code generation, releasing pretraining, evaluation, inference, and checkpoints. ☆523 · Nov 11, 2025 · Updated 3 months ago
- Code and Data for "FaithfulRAG: Fact-Level Conflict Modeling for Context-Faithful Retrieval-Augmented Generation" (ACL25) ☆29 · Oct 26, 2025 · Updated 4 months ago
- Code for paper: Unified Text-to-Image Generation and Retrieval ☆16 · Jul 6, 2024 · Updated last year
- SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward ☆92 · Aug 8, 2025 · Updated 6 months ago
- ☆87 · Aug 16, 2025 · Updated 6 months ago
- [NeurIPS 2025] IEAP: Image Editing As Programs with Diffusion Models ☆113 · Sep 27, 2025 · Updated 5 months ago
- The official implementation of Preference Data Reward-Augmentation. ☆18 · May 1, 2025 · Updated 9 months ago
- Official PyTorch implementation of RACRO (https://www.arxiv.org/abs/2506.04559) ☆19 · Jul 1, 2025 · Updated 8 months ago
- ☆41 · Jan 4, 2026 · Updated last month
- [ICLR 2026] Official code for TraceRL: Revolutionizing post-training for Diffusion LLMs, powering the SOTA TraDo series. ☆435 · Jan 28, 2026 · Updated last month
- MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head (ICLR 2026) ☆123 · Feb 6, 2026 · Updated 3 weeks ago
- [NeurIPS 2025] Think Silently, Think Fast: Dynamic Latent Compression of LLM Reasoning Chains ☆79 · Jul 29, 2025 · Updated 7 months ago
- [ICLR2025] γ-MOD: Mixture-of-Depth Adaptation for Multimodal Large Language Models ☆42 · Oct 28, 2025 · Updated 4 months ago
- Minimalist RL for Diffusion LLMs with SOTA reasoning performance (89.1% GSM8K). Official implementation of "The Flexibility Trap". ☆119 · Jan 24, 2026 · Updated last month
- ☆20 · May 7, 2025 · Updated 9 months ago
- Accelerate LLM preference tuning via prefix sharing with a single line of code ☆51 · Jul 4, 2025 · Updated 7 months ago
- ☆16 · Jul 23, 2024 · Updated last year