pangu-tech / pangu-ultra
☆67Updated 2 months ago
Alternatives and similar repositories for pangu-ultra
Users that are interested in pangu-ultra are comparing it to the libraries listed below
- ☆78Updated 4 months ago
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models☆136Updated last year
- MiroMind-M1 is a fully open-source series of reasoning language models built on Qwen-2.5, focused on advancing mathematical reasoning.☆106Updated this week
- [ICML 2025] |TokenSwift: Lossless Acceleration of Ultra Long Sequence Generation☆113Updated 2 months ago
- ☆288Updated 2 months ago
- Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs☆183Updated last month
- ☆72Updated this week
- Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning☆188Updated 4 months ago
- ☆68Updated last month
- Repo for "Z1: Efficient Test-time Scaling with Code"☆63Updated 3 months ago
- A highly capable 2.4B lightweight LLM using only 1T pre-training data with all details.☆207Updated 2 weeks ago
- Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme☆138Updated 4 months ago
- ☆81Updated last week
- siiRL: Shanghai Innovation Institute RL Framework for Advanced LLMs and Multi-Agent Systems☆152Updated this week
- Simple extension on vLLM to help you speed up reasoning model without training.☆174Updated 2 months ago
- Parallel Scaling Law for Language Model — Beyond Parameter and Inference Time Scaling☆428Updated 2 months ago
- Efficient Agent Training for Computer Use☆122Updated 2 months ago
- ☆159Updated 3 months ago
- A lightweight reinforcement learning framework that integrates seamlessly into your codebase, empowering developers to focus on algorithm…☆34Updated this week
- Revisiting Mid-training in the Era of Reinforcement Learning Scaling☆161Updated 2 weeks ago
- Implementation for OAgents: An Empirical Study of Building Effective Agents☆101Updated last week
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper☆33Updated last year
- Chain of Experts (CoE) enables communication between experts within Mixture-of-Experts (MoE) models☆220Updated last month
- ☆94Updated 8 months ago
- The official repo of One RL to See Them All: Visual Triple Unified Reinforcement Learning☆309Updated 2 months ago
- ☆90Updated 2 months ago
- ICML2025: Forest-of-Thought: Scaling Test-Time Compute for Enhancing LLM Reasoning☆46Updated 3 months ago
- ARM: Adaptive Reasoning Model☆45Updated last week
- The official repo of SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond☆161Updated last month
- The code for paper: Decoupled Planning and Execution: A Hierarchical Reasoning Framework for Deep Search☆51Updated last month