Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders
☆227 · Feb 13, 2026 · Updated last month
Alternatives and similar repositories for Scale-RAE
Users that are interested in Scale-RAE are comparing it to the libraries listed below.
- ☆41 · Oct 29, 2025 · Updated 4 months ago
- Code and website for Self-Flow: Self-Supervised Flow Matching for Scalable Multi-Modal Synthesis ☆378 · Mar 15, 2026 · Updated last week
- ☆12 · Jul 18, 2024 · Updated last year
- ECCV2024, LAPT: Label-driven Automated Prompt Tuning for OOD Detection with Vision-Language Models ☆18 · Aug 9, 2024 · Updated last year
- Official PyTorch Implementation of "Latent Denoising Makes Good Visual Tokenizers" ☆180 · Feb 24, 2026 · Updated last month
- [CVPR 2025] Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis ☆131 · May 16, 2025 · Updated 10 months ago
- RePlan: Reasoning-Guided Region Planning for Complex Instruction-Based Image Editing ☆59 · Mar 19, 2026 · Updated last week
- LatentMorph: Morphing Latent Reasoning into Image Generation ☆37 · Feb 3, 2026 · Updated last month
- [ICLR 2026] PixNerd: Pixel Neural Field Diffusion ☆174 · Dec 10, 2025 · Updated 3 months ago
- [ICLR 2026] UniEdit-Flow: Unleashing Inversion and Editing in the Era of Flow Models ☆43 · Aug 4, 2025 · Updated 7 months ago
- This repo contains the code for 1D tokenizer and generator ☆1,134 · Mar 20, 2025 · Updated last year
- Official code for "Rethinking Chain-of-Thought Reasoning for Videos" ☆20 · Dec 14, 2025 · Updated 3 months ago
- Code release for "PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop" (ICML 2025) ☆54 · May 8, 2025 · Updated 10 months ago
- [ICLR 2026] OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling ☆444 · Feb 25, 2026 · Updated last month
- [CVPR 2025 Oral] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models ☆1,418 · Dec 16, 2025 · Updated 3 months ago
- Code for MetaMorph Multimodal Understanding and Generation via Instruction Tuning ☆235 · Jan 22, 2026 · Updated 2 months ago
- Official PyTorch codes for "Open Vocabulary 3D Scene Understanding via Geometry Guided Self-Distillation", ECCV2024 ☆30 · Jul 19, 2024 · Updated last year
- ☆15 · Nov 11, 2024 · Updated last year
- SyncNoise: Geometrically Consistent Noise Prediction for Text-based 3D Scene Editing ☆19 · Dec 28, 2024 · Updated last year
- [ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think ☆1,585 · Mar 16, 2025 · Updated last year
- Official Implementation of "UniFlow: A Unified Pixel Flow Tokenizer for Visual Understanding and Generation" ☆139 · Oct 17, 2025 · Updated 5 months ago
- Adapting Self-Supervised Representations as a Latent Space for Efficient Generation ☆40 · Oct 17, 2025 · Updated 5 months ago
- Official PyTorch implementation of paper "InsViE-1M: Effective Instruction-based Video Editing with Elaborate Dataset Construction" ☆33 · Jul 28, 2025 · Updated 7 months ago
- SEED-Voken: A Series of Powerful Visual Tokenizers ☆999 · Nov 25, 2025 · Updated 4 months ago
- PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838 ☆1,883 · Feb 20, 2026 · Updated last month
- [ECCV'24] A novel weakly supervised framework for 3D object detection from 2D bounding boxes. It can easily extend to novel scenarios and… ☆36 · Jul 26, 2024 · Updated last year
- Official Implementation of Paper Transfer between Modalities with MetaQueries ☆310 · Oct 12, 2025 · Updated 5 months ago
- Pixio: a capable vision encoder dedicated to dense prediction, simply by pixel reconstruction ☆361 · Jan 22, 2026 · Updated 2 months ago
- [Preprint] UCGM: Unified Continuous Generative Models ☆183 · May 27, 2025 · Updated 10 months ago
- [NeurIPS 2024] OmniTokenizer: one model and one weight for image-video joint tokenization. ☆323 · Jul 9, 2024 · Updated last year
- Official Implementation of pMF https://arxiv.org/abs/2601.22158 ☆193 · Feb 19, 2026 · Updated last month
- Make self forcing endless. Add cache purging. Add prompt controllability. ☆70 · Sep 9, 2025 · Updated 6 months ago
- code for "EMS: 3D Eyebrow Modeling from Single-view Images" (SIGGRAPH Asia 2023) ☆13 · May 3, 2025 · Updated 10 months ago
- Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence ☆305 · Mar 2, 2026 · Updated 3 weeks ago
- code for "TVG: A Training-free Transition Video Generation Method with Diffusion Models" ☆49 · Aug 19, 2024 · Updated last year
- Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders" ☆1,816 · Feb 25, 2026 · Updated last month
- JointDiT: Enhancing RGB-Depth Joint Modeling with Diffusion Transformers ☆17 · Jul 21, 2025 · Updated 8 months ago
- [🚀 ICLR 2026 Oral] NextStep-1: SOTA Autoregressive Image Generation with Continuous Tokens. A research project developed by the StepFun’s … ☆649 · Feb 27, 2026 · Updated last month
- PyTorch implementation for the paper titled "SimpleAR: Pushing the Frontier of Autoregressive Visual Generation" ☆427 · Jun 20, 2025 · Updated 9 months ago