Tencent / digitalhuman
☆150Updated 2 weeks ago
Alternatives and similar repositories for digitalhuman
Users who are interested in digitalhuman are comparing it to the libraries listed below
- MiroMind-M1 is a fully open-source series of reasoning language models built on Qwen-2.5, focused on advancing mathematical reasoning.☆236Updated 2 months ago
- Pre-trained, Scalable, High-performance Reward Models via Policy Discriminative Learning.☆158Updated last month
- Efficient Agent Training for Computer Use☆131Updated last month
- Towards a Unified View of Large Language Model Post-Training☆167Updated last month
- ☆107Updated 5 months ago
- ☆169Updated 5 months ago
- Implementation for OAgents: An Empirical Study of Building Effective Agents☆277Updated 2 weeks ago
- [NeurIPS 2025] The official repo of SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond☆170Updated 3 months ago
- AutoCoA (Automatic generation of Chain-of-Action) is an agent model framework that enhances the multi-turn tool usage capability of reaso…☆127Updated 7 months ago
- Official implementation for "ScoreFlow: Mastering LLM Agent Workflows via Score-based Preference Optimization"☆87Updated 5 months ago
- REverse-Engineered Reasoning for Open-Ended Generation☆75Updated last month
- ☆84Updated 6 months ago
- A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning☆263Updated last month
- 🔧Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning☆279Updated this week
- MiroThinker: open-source agentic models trained for deep research and complex tool-use scenarios.☆467Updated last week
- RM-R1: Unleashing the Reasoning Potential of Reward Models☆145Updated 4 months ago
- MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search too…☆335Updated 2 months ago
- SSRL: Self-Search Reinforcement Learning☆147Updated 2 months ago
- PSFT is a trust-region–inspired fine-tuning objective that views SFT as a policy gradient method with constant advantages, constraining p…☆27Updated last month
- R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning☆65Updated 5 months ago
- ☆73Updated 4 months ago
- End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning☆308Updated last month
- [ICML 2025] TokenSwift: Lossless Acceleration of Ultra Long Sequence Generation☆115Updated 5 months ago
- MPO: Boosting LLM Agents with Meta Plan Optimization (EMNLP 2025 Findings)☆72Updated 2 months ago
- ☆82Updated 2 months ago
- ☆89Updated 5 months ago
- MiroTrain is an efficient and algorithm-first framework for post-training large agentic models.☆88Updated last month
- Scaling Preference Data Curation via Human-AI Synergy☆116Updated 3 months ago
- [ICLR 2025] Benchmarking Agentic Workflow Generation☆129Updated 8 months ago
- [ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis☆163Updated 2 weeks ago