The official implementation for the intra-stage fusion technique introduced in https://arxiv.org/abs/2409.13221
☆31Apr 22, 2025Updated 11 months ago
Alternatives and similar repositories for FlexFusion
Users who are interested in FlexFusion are comparing it to the libraries listed below.
- To pioneer training long-context multi-modal transformer models☆73Aug 8, 2025Updated 7 months ago
- ☆32Updated this week
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling☆21Updated this week
- ☆55Feb 5, 2026Updated last month
- Paper reading and discussion notes, covering AI frameworks, distributed systems, cluster management, etc.☆57Mar 4, 2026Updated 2 weeks ago
- NVSHMEM‑Tutorial: Build a DeepEP‑like GPU Buffer☆170Feb 11, 2026Updated last month
- DeepXTrace is a lightweight tool for precisely diagnosing slow ranks in DeepEP-based environments.☆95Jan 16, 2026Updated 2 months ago
- [Archived] For the latest updates and community contribution, please visit: https://github.com/Ascend/TransferQueue or https://gitcode.co…☆13Jan 16, 2026Updated 2 months ago
- Cute layout visualization☆33Jan 18, 2026Updated 2 months ago
- GenericMusicClient: a generic music NuGet library that integrates QQMusic, Netease, KuGou, and others☆12Feb 2, 2023Updated 3 years ago
- ☆17May 28, 2024Updated last year
- Sample Codes using NVSHMEM on Multi-GPU☆30Jan 22, 2023Updated 3 years ago
- Ever wondered how popular your GitHub repo is compared to others?☆16Feb 14, 2026Updated last month
- Vortex: A Flexible and Efficient Sparse Attention Framework☆49Jan 21, 2026Updated 2 months ago
- UniVid: The Open-Source Unified Video Model☆30Oct 13, 2025Updated 5 months ago
- ☆21Oct 2, 2024Updated last year
- patches for huggingface transformers to save memory☆35Jun 2, 2025Updated 9 months ago
- Dynamic resources changes for multi-dimensional parallelism training☆30Aug 22, 2025Updated 7 months ago
- ☆13Jan 28, 2026Updated last month
- [USENIX Security '25] My ZIP isn’t your ZIP: Identifying and Exploiting Semantic Gaps Between ZIP Parsers☆38Aug 22, 2025Updated 7 months ago
- NexRL is an ultra-loosely-coupled LLM post-training framework.☆104Updated this week
- gLLM: Global Balanced Pipeline Parallelism System for Distributed LLM Serving with Token Throttling☆55Updated this week
- Pipeline Parallelism Emulation and Visualization☆81Jan 8, 2026Updated 2 months ago
- ☆11Mar 9, 2026Updated 2 weeks ago
- ☆26Feb 17, 2025Updated last year
- Asynchronous pipeline parallel optimization☆19Feb 2, 2026Updated last month
- ☆52Apr 30, 2025Updated 10 months ago
- TaskMet Task-driven Metric Learning for Model Learning☆20Feb 9, 2024Updated 2 years ago
- ArXiv Today: Get arXiv daily papers right in your Lark (飞书) via bot.☆32Sep 17, 2025Updated 6 months ago
- GPU-accelerated LLM Training Simulator☆51Jun 26, 2025Updated 8 months ago
- Benchmark workloads of Boki☆11Sep 8, 2021Updated 4 years ago
- Inferix: A Block-Diffusion based Next-Generation Inference Engine for World Simulation☆118Feb 27, 2026Updated 3 weeks ago
- ☆29Feb 3, 2026Updated last month
- An eye-friendly dark theme for Typora.☆12Apr 22, 2022Updated 3 years ago
- Project code for the final assignment of the graduate course "Network Big Data Management: Theory and Applications"☆13Dec 31, 2022Updated 3 years ago
- Tilus is a tile-level kernel programming language with explicit control over shared memory and registers.☆456Updated this week
- DLSlime: Flexible & Efficient Heterogeneous Transfer Toolkit☆93Jan 26, 2026Updated last month
- ☆20Jul 16, 2024Updated last year
- Triton-based Symmetric Memory operators and examples☆94Jan 15, 2026Updated 2 months ago