A lightweight, AI-native training framework for large language models. Designed for fast iteration, reproducible experiments, and modular configuration across SFT, RLVR, and evaluation workflows.
☆329Mar 4, 2026Updated this week
Alternatives and similar repositories for SteptronOss
Users who are interested in SteptronOss are comparing it to the libraries listed below
- My attempt to improve the speed of the Newton-Schulz algorithm, starting from the Dion implementation.☆32Dec 5, 2025Updated 3 months ago
- VisualToolAgent (VisTA): A Reinforcement Learning Framework for Visual Tool Selection☆25May 31, 2025Updated 9 months ago
- An Android Application for GLCC☆11Sep 30, 2022Updated 3 years ago
- An experimental communicating attention kernel based on DeepEP.☆35Jul 29, 2025Updated 7 months ago
- An official implementation of Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards☆37Oct 3, 2025Updated 5 months ago
- The official implementation of "Helen: Optimizing CTR Prediction Models with Frequency-wise Hessian Eigenvalue Regularization"☆16Mar 14, 2024Updated last year
- Personal solutions to the Triton Puzzles☆20Jul 18, 2024Updated last year
- ☆49Sep 26, 2025Updated 5 months ago
- PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning☆328Feb 5, 2026Updated last month
- Pile Deduplication Code☆18May 15, 2023Updated 2 years ago
- An NCCL extension library, designed to efficiently offload GPU memory allocated by the NCCL communication library.☆98Dec 17, 2025Updated 2 months ago
- MegEngine build with cu11x☆17Mar 13, 2023Updated 2 years ago
- Implementation of Dualformer☆24Mar 1, 2025Updated last year
- Large language models designed for formal theorem proving through tool-integrated reasoning.☆33Aug 13, 2025Updated 6 months ago
- ☆107Feb 25, 2025Updated last year
- Suri: Multi-constraint instruction following for long-form text generation (EMNLP’24)☆27Oct 3, 2025Updated 5 months ago
- Training library for Megatron-based models with bidirectional Hugging Face conversion capability☆481Updated this week
- Scalable toolkit for efficient model reinforcement☆1,372Updated this week
- ☆42Dec 16, 2025Updated 2 months ago
- 🚀 Efficient implementations of state-of-the-art linear attention models☆4,474Updated this week
- ring-attention experiments☆166Oct 17, 2024Updated last year
- Ring attention implementation with flash attention☆987Sep 10, 2025Updated 5 months ago
- Distributed Compiler based on Triton for Parallel Systems☆1,371Feb 13, 2026Updated 3 weeks ago
- Jacobi Forcing: Fast and Accurate Diffusion-style Decoding☆142Feb 20, 2026Updated 2 weeks ago
- Muon is Scalable for LLM Training☆1,440Aug 3, 2025Updated 7 months ago
- An efficient GRPO training util.☆54Jun 13, 2025Updated 8 months ago
- ☆129Jun 6, 2025Updated 9 months ago
- Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels☆5,284Feb 28, 2026Updated last week
- Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective (ACL 2024)☆57Oct 28, 2024Updated last year
- ☆136Jan 26, 2026Updated last month
- ☆42Aug 3, 2025Updated 7 months ago
- slime is an LLM post-training framework for RL Scaling.☆4,536Updated this week
- EvaByte: Efficient Byte-level Language Models at Scale☆115Apr 22, 2025Updated 10 months ago
- Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning"☆184May 20, 2025Updated 9 months ago
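- A super-simple reproduction of Deepseek-R1-Zero and Deepseek-R1, using the "24 game" as an example. Uses zero-RL, SFT, and SFT+RL to elicit the LLM's capacity for self-verification and reflection. About Clean, minimal, accessible reproduction of Dee…☆34Apr 5, 2025Updated 11 months ago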
- ☆114Sep 13, 2025Updated 5 months ago
- Reproducible, flexible LLM evaluations☆347Jan 28, 2026Updated last month
- Code for Understanding Architectures Learnt by Cell-based Neural Architecture Search☆28Feb 6, 2020Updated 6 years ago
- Open-Pandora: On-the-fly Control Video Generation☆35Nov 28, 2024Updated last year