ByteDance-Seed / VeOmni
VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo
☆1,374 · Updated this week
Alternatives and similar repositories for VeOmni
Users interested in VeOmni are also comparing it to the libraries listed below.
- USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference ☆605 · Updated last month
- Ring attention implementation with flash attention ☆923 · Updated 2 months ago
- ☆439 · Updated 3 months ago
- Official implementation of "Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding" ☆713 · Updated last week
- 🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention" ☆928 · Updated 8 months ago
- ☆819 · Updated 5 months ago
- A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training ☆570 · Updated this week
- An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models ☆2,414 · Updated this week
- The official repo of Pai-Megatron-Patch for LLM & VLM large-scale training, developed by Alibaba Cloud. ☆1,457 · Updated 3 weeks ago
- Muon is Scalable for LLM Training ☆1,372 · Updated 4 months ago
- Super-Efficient RLHF Training of LLMs with Parameter Reallocation ☆326 · Updated 7 months ago
- slime is an LLM post-training framework for RL Scaling. ☆2,612 · Updated last week
- FlagScale is a large model toolkit built on open-source projects. ☆416 · Updated last week
- ☆207 · Updated last month
- [Preprint] On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification. ☆503 · Updated last month
- Official Repo for Open-Reasoner-Zero ☆2,069 · Updated 6 months ago
- InternEvo is an open-source, lightweight training framework that aims to support model pre-training without the need for extensive dependencie… ☆414 · Updated 3 months ago
- Train speculative decoding models effortlessly and port them smoothly to SGLang serving. ☆523 · Updated this week
- Fast inference from large language models via speculative decoding ☆859 · Updated last year
- Kimi-VL: Mixture-of-Experts Vision-Language Model for Multimodal Reasoning, Long-Context Understanding, and Strong Agent Capabilities ☆1,119 · Updated 4 months ago
- ByteCheckpoint: A Unified Checkpointing Library for LFMs ☆254 · Updated this week
- [ICML 2025] SpargeAttention: A training-free sparse attention that accelerates any model inference. ☆798 · Updated 2 weeks ago
- An Open-source RL System from ByteDance Seed and Tsinghua AIR ☆1,653 · Updated 6 months ago
- Long-RL: Scaling RL to Long Sequences (NeurIPS 2025) ☆669 · Updated 2 months ago
- [ICLR 2025] DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads ☆507 · Updated 9 months ago
- 📚 A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc. 🎉 ☆454 · Updated last week
- Byted PyTorch Distributed for Hyperscale Training of LLMs and RLs ☆893 · Updated last week
- A fast communication-overlapping library for tensor/expert parallelism on GPUs. ☆1,182 · Updated 3 months ago
- A framework for efficient model inference with omni-modality models ☆466 · Updated this week
- A fork to add multimodal model training to open-r1 ☆1,423 · Updated 9 months ago