[Preprint] Efficient Generative Model Training via Embedded Representation Warmup
☆36 · Oct 15, 2025 · Updated 5 months ago
Alternatives and similar repositories for ERW
Users who are interested in ERW are comparing it to the libraries listed below.
- Research-focused SDXL training framework exploring novel optimization approaches. Goals include enhanced image quality, training stabi… ☆21 · Jun 7, 2025 · Updated 9 months ago
- ☆23 · Nov 26, 2024 · Updated last year
- Official implementation for the AAAI2025 paper "PIXELS - Progressive Image Xemplar-based Editing with Latent Surgery" ☆11 · Dec 17, 2024 · Updated last year
- ☆30 · Mar 4, 2025 · Updated last year
- [AAAI 2026] Turbo-VAED: Fast and Stable Transfer of Video-VAEs to Mobile Devices ☆95 · Nov 30, 2025 · Updated 3 months ago
- ☆26 · Mar 4, 2025 · Updated last year
- Official Implementation of wd1 ☆24 · Sep 25, 2025 · Updated 5 months ago
- [ICLR 2026] Code for "gen2seg: Generative Models Enable Generalizable Instance Segmentation" ☆68 · Feb 9, 2026 · Updated last month
- This repository includes the official implementation of our paper "Grouping First, Attending Smartly: Training-Free Acceleration for Diff… ☆55 · May 21, 2025 · Updated 10 months ago
- ☆19 · Jun 4, 2025 · Updated 9 months ago
- [ICCV 2025] Official Implementation of Contrastive Flow Matching ☆166 · Jun 25, 2025 · Updated 8 months ago
- StableWorld: Towards Stable and Consistent Long Interactive Video Generation ☆86 · Updated this week
- [ICML'25] EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling. ☆175 · Updated this week
- Nitro-T is a family of text-to-image diffusion models focused on highly efficient training. ☆40 · Jul 10, 2025 · Updated 8 months ago
- ☆23 · Jul 20, 2025 · Updated 8 months ago
- ☆30 · Jan 18, 2026 · Updated 2 months ago
- DMM: Building a Versatile Image Generation Model via Distillation-Based Model Merging ☆47 · Apr 27, 2025 · Updated 10 months ago
- Official implementation of paper "VMoBA: Mixture-of-Block Attention for Video Diffusion Models" ☆62 · Jul 1, 2025 · Updated 8 months ago
- Code, Resources - Personal project - Llama Paper Summary - October 14, 2024. ☆11 · Oct 15, 2024 · Updated last year
- ☆65 · May 3, 2025 · Updated 10 months ago
- This is a repository for paper titled, PlaSma: Making Small Language Models Better Procedural Knowledge Models for (Counterfactual) Plann… ☆14 · Nov 3, 2023 · Updated 2 years ago
- [CVPR 2025 GMCV] Test-Time Frequency Scaling: Instant Frequency Control for Any Diffusion Model ☆55 · May 31, 2025 · Updated 9 months ago
- Official implementation of the paper "Delving into Latent Spectral Biasing of Video VAEs for Superior Diffusability". ☆51 · Dec 25, 2025 · Updated 2 months ago
- This repo contains the official implementation of the ICLR 2026 paper "Attention, Please! Revisiting Attentive Probing Through the Lens o… ☆30 · Feb 23, 2026 · Updated 3 weeks ago
- Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Models (ICLR 2026) ☆45 · Mar 3, 2026 · Updated 2 weeks ago
- ☆110 · Sep 3, 2025 · Updated 6 months ago
- ☆19 · Aug 19, 2024 · Updated last year
- [ICLR 2026] Lumos Project: Frontier video unified model research by Alibaba DAMO Academy. ☆154 · Updated this week
- The open-source materials for paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity". ☆30 · Nov 12, 2024 · Updated last year
- Description for MV-MATH ☆15 · Jul 20, 2025 · Updated 8 months ago
- [NeurIPS'25 Spotlight] Boosting Generative Image Modeling via Joint Image-Feature Synthesis ☆117 · Nov 3, 2025 · Updated 4 months ago
- [TPAMI 2026] Enhancing MMDiT-Based Text-to-Image Models for Similar Subject Generation ☆11 · Mar 7, 2026 · Updated 2 weeks ago
- The first comprehensive multimodal language analysis benchmark for evaluating foundation models ☆29 · Sep 22, 2025 · Updated 6 months ago
- A comprehensive framework for benchmarking single and multi-agent systems across a wide range of tasks, evaluating performance, accuracy, … ☆36 · Nov 11, 2025 · Updated 4 months ago
- This repo contains evaluation code for the paper "AV-Odyssey: Can Your Multimodal LLMs Really Understand Audio-Visual Information?" ☆31 · Dec 23, 2024 · Updated last year
- Transforming Video Diffusion with Temporal Sparse Attention ☆46 · Updated this week
- Quick Long Video Understanding [TMLR2025] ☆76 · Oct 27, 2025 · Updated 4 months ago
- Code for "APTBench: Benchmarking Agentic Potential of Base LLMs During Pre-Training" ☆39 · Dec 23, 2025 · Updated 2 months ago
- DanceTogether! Identity-Preserving Multi-Person Interactive Video Generation ☆39 · Aug 3, 2025 · Updated 7 months ago