Tele-AI/TeleTron

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/Tele-AI/TeleTron)

Tele-AI / TeleTron

To pioneer training long-context multi-modal transformer models

☆75

Alternatives and similar repositories for TeleTron

Users that are interested in TeleTron are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

AI-Infra-Team / awesome-papers
View on GitHub
Paper reading and discussion notes, covering AI frameworks, distributed systems, cluster management, etc.
☆69Mar 4, 2026Updated 4 months ago
FlexFusion / FlexFusion
View on GitHub
The official implementation for the intra-stage fusion technique introduced in https://arxiv.org/abs/2409.13221
☆31Apr 22, 2025Updated last year
lhb8125 / Megatron-LM
View on GitHub
Ongoing research training transformer models at scale
☆19Updated this week
Tele-AI / Fluxon
View on GitHub
An AI-native distributed data plane built in Rust for high performance RPC, KV Cache, Message Queue, and File & Object Acceleration
☆116Updated this week
KuangjuX / AttnLink
View on GitHub
An experimental communicating attention kernel based on DeepEP.
☆34Jul 29, 2025Updated last year
Serverless GPU API endpoints on Runpod - Get Bonus Credits • Ad
Skip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
CGCL-codes / streambox
View on GitHub
☆18May 28, 2024Updated 2 years ago
Infrasys-AI / aiinfra-docs
View on GitHub
☆21Nov 6, 2025Updated 8 months ago
chenyu-jiang / dcp
View on GitHub
Code repository for the SOSP'25 paper DCP: Addressing Input Dynamism In Long-Context Training via Dynamic Context Parallelism.
☆21Nov 28, 2025Updated 8 months ago
Peyton-Chen / Sparse-vDiT
View on GitHub
The official implementation of "Sparse-vDiT: Unleashing the Power of Sparse Attention to Accelerate Video Diffusion Transformers" (arXiv …
☆52Jun 6, 2025Updated last year
OpenSQZ / MegatronApp
View on GitHub
Toolchain built around the Megatron-LM for Distributed Training
☆97May 20, 2026Updated 2 months ago
Tele-AI / MMPL
View on GitHub
Macro-from-Micro Planning for High-Quality and Parallelized Autoregressive Long Video Generation
☆40Oct 31, 2025Updated 8 months ago
MingXiangL / Teacache-xDiT
View on GitHub
Combining Teacache with xDiT to Accelerate Visual Generation Models
☆33Apr 21, 2025Updated last year
CentML / Mist
View on GitHub
[EuroSys'25] Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization
☆24Apr 13, 2026Updated 3 months ago
HuangShiqing / memory_viz_plus
View on GitHub
☆18Jun 14, 2025Updated last year
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
shiyi-zh0408 / Meta-CoT
View on GitHub
[CVPR 2026] Official code of the paper "Meta-CoT: Enhancing Granularity and Generalization in Image Editing"
☆79May 6, 2026Updated 2 months ago
TIGER-AI-Lab / Context-Forcing
View on GitHub
Context Forcing: Consistent Autoregressive Video Generation with Long Context [ICML26]
☆106Jun 29, 2026Updated 3 weeks ago
Shi-qingyu / RecTok
View on GitHub
[CVPR 26] Official PyTorch Implementation of RecTok
☆23Feb 24, 2026Updated 5 months ago
KlingAIResearch / VMoBA
View on GitHub
Official implementation of paper "VMoBA: Mixture-of-Block Attention for Video Diffusion Models"
☆64Jul 1, 2025Updated last year
NJU-PCALab / OpenVid-1M
View on GitHub
[ICLR 2025] OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation
☆452May 30, 2025Updated last year
wangf3014 / VTok
View on GitHub
Official implementation of VTok: A Unified Video Tokenizer with Decoupled Spatial-Temporal Latents
☆15Feb 5, 2026Updated 5 months ago
antgroup / DeepXTrace
View on GitHub
DeepXTrace is a lightweight tool for precisely diagnosing slow ranks in DeepEP-based environments.
☆101Jan 16, 2026Updated 6 months ago
svg-project / Sparse-VideoGen
View on GitHub
[ICML2025, NeurIPS2025 Spotlight] Sparse VideoGen 1 & 2: Accelerating Video Diffusion Transformers with Sparse Attention
☆697Jul 4, 2026Updated 3 weeks ago
viiika / HumanEdit
View on GitHub
[CVPR 2025 AI4CC Workshop] Official Implementation of HumanEdit: A High-Quality Human-Rewarded Dataset for Instruction-based Image Editin…
☆36May 8, 2025Updated last year
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
Yaofang-Liu / Mochi-Full-Finetuner
View on GitHub
Code for full fintuing Mochi model with FSDP (and CP)
☆29Jul 15, 2025Updated last year
Ephemeral182 / Empirical-Study-of-GPT-4o-Image-Gen
View on GitHub
An Empirical Study of GPT-4o Image Generation Capabilities
☆29Apr 16, 2025Updated last year
wusize / Harmon
View on GitHub
[ICCV2025]Code Release of Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
☆192May 21, 2025Updated last year
ToyotaResearchInstitute / gradient-estimation-sampler
View on GitHub
Code for the paper "Interpreting and Improving Diffusion Models from an Optimization Perspective", appearing in ICML 2024
☆15Sep 30, 2024Updated last year
furiosa-ai / uncage
View on GitHub
UNCAGE: Contrastive Attention Guidance for Masked Generative Transformers in Text-to-Image Generation
☆17Aug 12, 2025Updated 11 months ago
GoatWu / CausVid-Plus
View on GitHub
Unofficial extension implementation of CausVid
☆79Apr 28, 2025Updated last year
Junchao-cs / LIVE
View on GitHub
[ICML 2026] "LIVE: Long-horizon Interactive Video World ModEling"
☆35Jul 15, 2026Updated 2 weeks ago
RiseAI-Sys / DAX
View on GitHub
High performance inference engine for diffusion models
☆107Sep 5, 2025Updated 10 months ago
alibaba-damo-academy / Inferix
View on GitHub
Inferix: A Block-Diffusion based Next-Generation Inference Engine for World Simulation
☆133Apr 28, 2026Updated 3 months ago
Bare Metal GPUs on DigitalOcean Gradient AI • Ad
Purpose-built for serious AI teams training foundational models, running large-scale inference, and pushing the boundaries of what's possible.
KlingAIResearch / MemFlow
View on GitHub
Official Implementation of "MemFlow: Flowing Adaptive Memory for Consistent and Efficient Long Video Narratives"
☆216Dec 29, 2025Updated 7 months ago
tang-bd / fuse-dit
View on GitHub
[CVPR 2025] Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis
☆140May 16, 2025Updated last year
Xinjie-Q / Distributed-DVC
View on GitHub
🏠[ICME 2023] Low-complexity Deep Video Compression with A Distributed Coding Architecture
☆36May 29, 2023Updated 3 years ago
zijieli-Jlee / Dual-Diffusion
View on GitHub
Code for D-DiT
☆69Apr 1, 2025Updated last year
Baichenjia / COPO
View on GitHub
Online Preference Alignment for Language Models via Count-based Exploration
☆21Jan 14, 2025Updated last year
CIntellifusion / VideoDPO
View on GitHub
Official Implementation of VideoDPO
☆169Jun 1, 2025Updated last year
blitz-serving / blitz-scale
View on GitHub
The official implementation of OSDI'25 paper BlitzScale
☆48Apr 15, 2026Updated 3 months ago