stepfun-ai / StepDeepResearch
Step-DeepResearch
☆211 Updated this week
Alternatives and similar repositories for StepDeepResearch
Users who are interested in StepDeepResearch are comparing it to the repositories listed below
- MiroTrain is an efficient and algorithm-first framework for post-training large agentic models. ☆99 Updated 4 months ago
- ☆190 Updated last week
- MiroMind-M1 is a fully open-source series of reasoning language models built on Qwen-2.5, focused on advancing mathematical reasoning. ☆246 Updated 4 months ago
- [NeurIPS 2025] The official repo of SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond ☆187 Updated 5 months ago
- Deep Research Agent CognitiveKernel-Pro from Tencent AI Lab. Paper: https://arxiv.org/pdf/2508.00414 ☆476 Updated 2 months ago
- PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning ☆222 Updated 2 weeks ago
- Pre-trained, Scalable, High-performance Reward Models via Policy Discriminative Learning. ☆163 Updated 3 months ago
- The official repo of One RL to See Them All: Visual Triple Unified Reinforcement Learning ☆329 Updated 6 months ago
- MiroRL is an MCP-first reinforcement learning framework for deep research agents. ☆184 Updated 4 months ago
- Implementation for OAgents: An Empirical Study of Building Effective Agents ☆297 Updated 2 months ago
- MiroThinker is a series of open-source agentic models trained for deep research and complex tool use scenarios. ☆1,352 Updated last week
- Towards a Unified View of Large Language Model Post-Training ☆197 Updated 3 months ago
- The official implementation of the ICML 2024 paper "MemoryLLM: Towards Self-Updatable Large Language Models" and "M+: Extending MemoryLLM… ☆280 Updated 5 months ago
- [FSE'2026] SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks ☆122 Updated last month
- ☆75 Updated 6 months ago
- Efficient Agent Training for Computer Use ☆134 Updated 3 months ago
- MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search too… ☆368 Updated 4 months ago
- ☆331 Updated 4 months ago
- CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models (NeurIPS 2025) ☆169 Updated last month
- Feedback-Driven Tool-Use Improvements in Large Language Models via Automated Build Environments ☆46 Updated 3 months ago
- MemGen: Weaving Generative Latent Memory for Self-Evolving Agents ☆256 Updated last month
- ✨✨R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning ☆272 Updated 7 months ago
- ☆85 Updated 8 months ago
- ☆93 Updated 7 months ago
- [NeurIPS'25 Spotlight] ARM: Adaptive Reasoning Model ☆62 Updated 2 months ago
- A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning ☆280 Updated 3 months ago
- End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning ☆345 Updated 3 months ago
- REverse-Engineered Reasoning for Open-Ended Generation ☆84 Updated 3 months ago
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling ☆182 Updated 5 months ago
- Scaling Preference Data Curation via Human-AI Synergy ☆133 Updated 5 months ago