MiroMindAI / MiroRL
MiroRL is an MCP-first reinforcement learning framework for deep research agents.
☆74 Updated this week
Alternatives and similar repositories for MiroRL
Users who are interested in MiroRL are comparing it to the libraries listed below.
- End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning ☆162 Updated last week
- A lightweight reinforcement learning framework that integrates seamlessly into your codebase, empowering developers to focus on algorithm… ☆34 Updated this week
- ☆96 Updated 3 months ago
- A version of verl to support tool use ☆315 Updated this week
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection ☆49 Updated 9 months ago
- Async pipelined version of Verl ☆112 Updated 4 months ago
- A Comprehensive Survey on Long Context Language Modeling ☆170 Updated last month
- ☆46 Updated 2 months ago
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme ☆138 Updated 4 months ago
- ☆323 Updated last week
- Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning ☆188 Updated 4 months ago
- [ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning ☆64 Updated 3 weeks ago
- ☆114 Updated 2 months ago
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)" ☆218 Updated 5 months ago
- Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning". ☆81 Updated 2 months ago
- LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification ☆61 Updated 3 weeks ago
- Super-Efficient RLHF Training of LLMs with Parameter Reallocation ☆307 Updated 3 months ago
- Resources for the Enigmata Project. ☆59 Updated 2 months ago
- A repo for open research on building large reasoning models ☆87 Updated this week
- ☆263 Updated 2 months ago
- This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context" ☆36 Updated last year
- RM-R1: Unleashing the Reasoning Potential of Reward Models ☆120 Updated last month
- MiroMind-M1 is a fully open-source series of reasoning language models built on Qwen-2.5, focused on advancing mathematical reasoning. ☆170 Updated last week
- ☆206 Updated 5 months ago
- qwen-nsa ☆71 Updated 4 months ago
- siiRL: Shanghai Innovation Institute RL Framework for Advanced LLMs and Multi-Agent Systems ☆160 Updated this week
- Code for the preprint "Cache Me If You Can: How Many KVs Do You Need for Effective Long-Context LMs?" ☆41 Updated last week
- ☆197 Updated last week
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling ☆163 Updated 2 weeks ago
- ☆25 Updated 4 months ago