To pioneer training long-context multi-modal transformer models
☆73Aug 8, 2025Updated 10 months ago
Alternatives and similar repositories for TeleTron
Users that are interested in TeleTron are comparing it to the libraries listed below.
- Paper reading and discussion notes, covering AI frameworks, distributed systems, cluster management, etc.☆65Mar 4, 2026Updated 3 months ago
- ☆18May 28, 2024Updated 2 years ago
- An experimental communicating attention kernel based on DeepEP.☆34Jul 29, 2025Updated 10 months ago
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling☆25Updated this week
- Ongoing research training transformer models at scale☆18Updated this week
- ☆37Updated this week
- Inferix: A Block-Diffusion based Next-Generation Inference Engine for World Simulation☆132Apr 28, 2026Updated last month
- The official implementation of "Sparse-vDiT: Unleashing the Power of Sparse Attention to Accelerate Video Diffusion Transformers" (arXiv …☆51Jun 6, 2025Updated last year
- Code for the paper "Interpreting and Improving Diffusion Models from an Optimization Perspective", appearing in ICML 2024☆15Sep 30, 2024Updated last year
- [EuroSys'25] Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization☆23Apr 13, 2026Updated last month
- ☆66Apr 26, 2025Updated last year
- Official PyTorch implementation for "Effective and Efficient Masked Image Generation Models"☆34Apr 8, 2025Updated last year
- DeepXTrace is a lightweight tool for precisely diagnosing slow ranks in DeepEP-based environments.☆99Jan 16, 2026Updated 4 months ago
- This is a python library. Install with "python3 -m pip install rp" then run with "python3 -m rp" or just "rp". Requires python≥3.5☆13Jun 3, 2026Updated last week
- DeeperGEMM: crazy optimized version☆86May 5, 2025Updated last year
- A lightweight Inference Engine built for block diffusion models☆46Apr 12, 2026Updated 2 months ago
- Toolchain built around the Megatron-LM for Distributed Training☆95May 20, 2026Updated 3 weeks ago
- Code for D-DiT☆67Apr 1, 2025Updated last year
- ☆14May 17, 2022Updated 4 years ago
- Tiny-FSDP, a minimalistic re-implementation of the PyTorch FSDP☆108Aug 20, 2025Updated 9 months ago
- A PyTorch implementation of EMANet based on ICCV 2019 paper "Expectation-Maximization Attention Networks for Semantic Segmentation"☆18Feb 21, 2020Updated 6 years ago
- [ICLR 2025] OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation☆443May 30, 2025Updated last year
- Unofficial implementation of Face0 with SDXL☆12Sep 1, 2023Updated 2 years ago
- Dynamic resources changes for multi-dimensional parallelism training☆31Aug 22, 2025Updated 9 months ago
- A LLaMA1/LLaMA2 Megatron implementation.☆28Dec 13, 2023Updated 2 years ago
- [Archived] For the latest updates and community contribution, please visit: https://github.com/Ascend/TransferQueue or https://gitcode.co…☆16Jan 16, 2026Updated 4 months ago
- Complete simulation of IEEE 754 fixed and floating point specification to any precision