MiroRL is an MCP-first reinforcement learning framework for deep research agents.
☆233Aug 27, 2025Updated 6 months ago
Alternatives and similar repositories for MiroRL
Users that are interested in MiroRL are comparing it to the libraries listed below
- MiroTrain is an efficient and algorithm-first framework research agent.☆133Aug 27, 2025Updated 6 months ago
- MiroMind-M1 is a fully open-source series of reasoning language models built on Qwen-2.5, focused on advancing mathematical reasoning.☆256Aug 12, 2025Updated 6 months ago
- ☆335May 24, 2025Updated 9 months ago
- ☆87Aug 16, 2025Updated 6 months ago
- A version of verl to support diverse tool use☆879Feb 19, 2026Updated last week
- ☆34Nov 26, 2025Updated 3 months ago
- [ICLR 2026] End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning☆358Jan 12, 2026Updated last month
- Async pipelined version of Verl☆124Apr 8, 2025Updated 10 months ago
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates.☆368Feb 19, 2026Updated last week
- Vortex: A Flexible and Efficient Sparse Attention Framework☆48Jan 21, 2026Updated last month
- Prompt-R1: Collaborative Automatic Prompting Framework via End-to-end Reinforcement Learning☆54Updated this week
- Asynchronous pipeline parallel optimization☆19Feb 2, 2026Updated last month
- Scaling Long-Horizon LLM Agent via Context-Folding☆117Jan 26, 2026Updated last month
- AI model training on heterogeneous, geo-distributed resources☆37Nov 24, 2025Updated 3 months ago
- [Archived] For the latest updates and community contribution, please visit: https://github.com/Ascend/TransferQueue or https://gitcode.co…☆13Jan 16, 2026Updated last month
- DLSlime: Flexible & Efficient Heterogeneous Transfer Toolkit☆92Jan 26, 2026Updated last month
- Prefix-Aware Attention for LLM Decoding☆29Jan 23, 2026Updated last month
- ☆13Feb 22, 2023Updated 3 years ago
- NexRL is an ultra-loosely-coupled LLM post-training framework.☆98Feb 14, 2026Updated 2 weeks ago
- MemGen: Weaving Generative Latent Memory for Self-Evolving Agents☆314Feb 3, 2026Updated 3 weeks ago
- Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.☆3,586Updated this week
- Scaling Deep Research via Reinforcement Learning in Real-world Environments.☆705Oct 15, 2025Updated 4 months ago
- [COLM 2025] Official repository for R2E-Gym: Procedural Environment Generation and Hybrid Verifiers for Scaling Open-Weights SWE Agents☆243Jul 13, 2025Updated 7 months ago
- Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.☆841May 14, 2025Updated 9 months ago
- [ICDCS 2023] Evaluation and Optimization of Gradient Compression for Distributed Deep Learning☆10Apr 28, 2023Updated 2 years ago
- Accelerate LLM preference tuning via prefix sharing with a single line of code☆51Jul 4, 2025Updated 7 months ago
- Resources for the Enigmata Project.☆77Aug 13, 2025Updated 6 months ago
- Scalable RL solution for advanced reasoning of language models☆1,809Mar 18, 2025Updated 11 months ago
- BytePS examples (Vision, NLP, GAN, etc)☆19Nov 24, 2022Updated 3 years ago
- Super-Efficient RLHF Training of LLMs with Parameter Reallocation☆331Apr 24, 2025Updated 10 months ago
- Efficient Long-context Language Model Training by Core Attention Disaggregation☆91Feb 23, 2026Updated last week
- Surrogate-based Hyperparameter Tuning System☆28Jun 29, 2023Updated 2 years ago
- A flexible and efficient training framework for large-scale alignment tasks☆450Oct 23, 2025Updated 4 months ago
- [ICLR 26] The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions".☆15Feb 9, 2026Updated 2 weeks ago
- The official repository for the CodeGym project: "Generalizable End-to-End Tool-Use RL with Synthetic CodeGym"☆23Oct 14, 2025Updated 4 months ago
- A tool for cross-checking Verilog compilers☆14Apr 16, 2025Updated 10 months ago
- A series of technical reports on Slow Thinking with LLM☆760Aug 13, 2025Updated 6 months ago
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning☆123May 6, 2025Updated 9 months ago
- Large Reasoning Models☆807Dec 3, 2024Updated last year