RobinWu218 / ToST
[ICLR 2025 Spotlight] Official Implementation for ToST (Token Statistics Transformer)
☆78 Updated last month
Alternatives and similar repositories for ToST:
Users who are interested in ToST are comparing it to the libraries listed below.
- Official PyTorch Implementation of "Diffusion Autoencoders are Scalable Image Tokenizers" ☆109 Updated 2 months ago
- Implementation of the proposed MaskBit from Bytedance AI ☆75 Updated 4 months ago
- [ICLR 2025] Official PyTorch implementation of "Forgetting Transformer: Softmax Attention with a Forget Gate" ☆80 Updated this week
- Official PyTorch Implementation of "Scalable Autoregressive Image Generation with Mamba" ☆127 Updated 2 months ago
- Triton implementation of bi-directional (non-causal) linear attention ☆44 Updated last month
- Implementation of a multimodal diffusion transformer in PyTorch ☆101 Updated 9 months ago
- [ICLR 2025] Official PyTorch Implementation of Gated Delta Networks: Improving Mamba2 with Delta Rule ☆143 Updated last week
- The official implementation of OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows ☆57 Updated 2 weeks ago
- Inference-only implementation of "One-Step Diffusion Distillation through Score Implicit Matching" [NeurIPS 2024] ☆77 Updated 4 months ago
- The codebase of our paper "Improving the Training of Rectified Flows", NeurIPS 2024 ☆106 Updated 5 months ago
- TerDiT: Ternary Diffusion Models with Transformers ☆69 Updated 9 months ago
- Scaling Diffusion Transformers with Mixture of Experts ☆300 Updated 6 months ago
- Implementation of TiTok, proposed by Bytedance in "An Image is Worth 32 Tokens for Reconstruction and Generation" ☆170 Updated 9 months ago
- FORA introduces a simple yet effective caching mechanism in the Diffusion Transformer architecture for faster inference sampling. ☆41 Updated 8 months ago
- [NeurIPS 2024] Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching ☆98 Updated 8 months ago
- Implementation of the proposed DeepCrossAttention by Heddes et al. at Google Research, in PyTorch ☆81 Updated last month
- ☆140 Updated this week
- ☆68 Updated 4 months ago
- ☆147 Updated 3 months ago
- Official PyTorch and Diffusers Implementation of "LinFusion: 1 GPU, 1 Minute, 16K Image" ☆297 Updated 3 months ago
- Implementation of SmoothCache, a project aimed at speeding up Diffusion Transformer (DiT) based GenAI models with error-guided caching. ☆41 Updated last week
- Scaling RWKV-Like Architectures for Diffusion Models ☆126 Updated 11 months ago
- This repository includes the official implementation of our paper "Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generat… ☆152 Updated 3 weeks ago
- The official repo of continuous speculative decoding ☆25 Updated this week
- A general framework for inference-time scaling and steering of diffusion models with arbitrary rewards. ☆112 Updated last month
- Adaptive Caching for Faster Video Generation with Diffusion Transformers ☆142 Updated 4 months ago
- ☆65 Updated last month
- [NeurIPS 2024] Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective ☆66 Updated 4 months ago
- HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation ☆51 Updated last month
- Official PyTorch implementation of TokenSet. ☆104 Updated last week