A lightweight, AI-native training framework for large language models. Designed for fast iteration, reproducible experiments, and modular configuration across SFT, RLVR, and evaluation workflows.
☆507 · Mar 18, 2026 · Updated last week
Alternatives and similar repositories for SteptronOss
Users that are interested in SteptronOss are comparing it to the libraries listed below.
- My attempt to improve the speed of the Newton-Schulz algorithm, starting from the Dion implementation. ☆33 · Dec 5, 2025 · Updated 3 months ago
- Toolathlon-Gym for testing AI agents' real-world tool-use capabilities across diverse MCP servers. ☆100 · Mar 21, 2026 · Updated last week
- An NCCL extension library, designed to efficiently offload GPU memory allocated by the NCCL communication library. ☆100 · Dec 17, 2025 · Updated 3 months ago
- A codebase & model zoo for pretrained backbones based on MegEngine. ☆32 · Mar 6, 2023 · Updated 3 years ago
- Large language models designed for formal theorem proving through tool-integrated reasoning. ☆33 · Aug 13, 2025 · Updated 7 months ago
- Python SDK for dataset generation on LightningRod platform ⚡ ☆40 · Updated this week
- An experimental communicating attention kernel based on DeepEP. ☆35 · Jul 29, 2025 · Updated 8 months ago
- Fully open reproduction of DeepSeek-R1 ☆11 · Mar 24, 2025 · Updated last year
- ☆453 · Aug 10, 2025 · Updated 7 months ago
- An official implementation of Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards ☆37 · Oct 3, 2025 · Updated 5 months ago
- PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning ☆335 · Feb 5, 2026 · Updated last month
- ☆48 · Aug 29, 2024 · Updated last year
- MegEngine Official Documentation ☆39 · Dec 4, 2024 · Updated last year
- The official implementation of "Helen: Optimizing CTR Prediction Models with Frequency-wise Hessian Eigenvalue Regularization" ☆16 · Mar 14, 2024 · Updated 2 years ago
- [🚀 ICLR 2026 Oral] NextStep-1: SOTA Autoregressive Image Generation with Continuous Tokens. A research project developed by the StepFun’s … ☆652 · Feb 27, 2026 · Updated last month
- An object detection codebase based on MegEngine. ☆28 · Dec 14, 2022 · Updated 3 years ago
- PyTorch Distributed native training library for LLMs/VLMs with OOTB Hugging Face support ☆395 · Updated this week
- An Android Application for GLCC ☆11 · Sep 30, 2022 · Updated 3 years ago
- ☆51 · Sep 26, 2025 · Updated 6 months ago
- Distributed Compiler based on Triton for Parallel Systems ☆1,398 · Mar 11, 2026 · Updated 2 weeks ago
- PyTorch building blocks for the OLMo ecosystem ☆1,000 · Updated this week
- ☆12 · Apr 29, 2024 · Updated last year
- An efficient implementation of the NSA (Native Sparse Attention) kernel ☆132 · Jun 24, 2025 · Updated 9 months ago
- The training code for Jasper-Token-Compression-600M ☆19 · Nov 19, 2025 · Updated 4 months ago
- STEP-GUI: The top GUI agent solution in the galaxy. Developed by the StepFun-GELab team and powered by StepFun’s cutting-edge research c… ☆2,101 · Mar 14, 2026 · Updated 2 weeks ago
- ☆31 · Dec 31, 2025 · Updated 2 months ago
- EvaByte: Efficient Byte-level Language Models at Scale ☆117 · Apr 22, 2025 · Updated 11 months ago
- A Deep Learning Project about cats. ☆11 · Aug 8, 2022 · Updated 3 years ago
- Improving Math reasoning through Direct Preference Optimization with Verifiable Pairs ☆19 · Mar 20, 2025 · Updated last year
- A minimal CLI tool for piping anything into an LLM. ☆20 · Jan 1, 2026 · Updated 2 months ago
- Code for "Efficient Offline Policy Optimization with a Learned Model", ICLR 2023 ☆30 · Jul 18, 2023 · Updated 2 years ago
- mperf is an operator performance tuning toolbox for mobile/embedded platforms. ☆193 · Aug 17, 2023 · Updated 2 years ago
- Official Repo for Open-Reasoner-Zero ☆2,088 · Jun 2, 2025 · Updated 9 months ago
- Ring attention implementation with flash attention ☆998 · Sep 10, 2025 · Updated 6 months ago
- Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels ☆5,432 · Updated this week
- Training library for Megatron-based models with bidirectional Hugging Face conversion capability ☆531 · Updated this week
- A set of examples around MegEngine ☆31 · Dec 8, 2023 · Updated 2 years ago
- Dotfiles for frontend developers and Python users, including vim (with support for Vue files and Python pylint), tmux, and zsh (with oh-my-zsh) ☆11 · Mar 23, 2026 · Updated last week
- Jacobi Forcing: Fast and Accurate Diffusion-style Decoding ☆152 · Feb 20, 2026 · Updated last month