Finetune your VAE on private datasets!
☆37Jun 20, 2024Updated last year
Alternatives and similar repositories for Train_SD_VAE
Users who are interested in Train_SD_VAE are comparing it to the repositories listed below
- This repository contains the code for training FLUX.1-Kontext-dev, a powerful image editing model.☆33Jul 18, 2025Updated 7 months ago
- Official implementation of "Towards One-Step Causal Video Generation via Adversarial Self-Distillation" (arXiv 2025). A novel framework f…☆25Nov 4, 2025Updated 4 months ago
- ☆56Updated this week
- Official implementations for paper: PS-Diffusion: Photorealistic Subject-Driven Image Editing with Disentangled Control and Attention☆19Oct 20, 2025Updated 4 months ago
- ☆53Dec 10, 2025Updated 3 months ago
- ☆23Oct 15, 2024Updated last year
- [ECCV 2024] 3DPE: Real-time 3D-aware Portrait Editing from a Single Image☆22Sep 15, 2025Updated 5 months ago
- ☆27Aug 14, 2024Updated last year
- Visualization of DiT self attention features☆236Aug 12, 2024Updated last year
- Implementation code of the paper MIGE: A Unified Framework for Multimodal Instruction-Based Image Generation and Editing☆72Jul 13, 2025Updated 7 months ago
- [AAAI 2026] Few-step Flow for 3D Generation via Marginal-Data Transport Distillation☆50Jan 9, 2026Updated 2 months ago
- ☆21Dec 14, 2025Updated 2 months ago
- This is the official implementation of our Señorita-2M [Weights and Dataset]: A High-Quality Instruction-based Dataset for General Video…☆105Apr 9, 2025Updated 11 months ago
- A collection of diffusion models based on FLUX/DiT for image/video generation, editing, reconstruction, inpainting, etc.☆85Jun 20, 2025Updated 8 months ago
- Official repository for CVPR 2025 paper: OpenSDI: Spotting Diffusion-Generated Images in the Open World☆41Jul 8, 2025Updated 8 months ago
- [CVPR 2025] PatchDPO: Patch-level DPO for Finetuning-free Personalized Image Generation☆47Jul 1, 2025Updated 8 months ago
- Diffusion Model as a Noise-Aware Latent Reward Model for Step-Level Preference Optimization☆62Sep 19, 2025Updated 5 months ago
- This is the official training code of OmniSVG☆30Jan 19, 2026Updated last month
- A free and open-source focus stacking software that supports multi-focus image alignment and fusion.☆20Feb 5, 2026Updated last month
- [ICLR 2026] GenCompositor: Generative Video Compositing with Diffusion Transformer☆151Feb 18, 2026Updated 2 weeks ago
- [ICCV 2025 Workshop Outstanding Paper Award] VChain: Chain-of-Visual-Thought for Reasoning in Video Generation☆115Oct 7, 2025Updated 5 months ago
- [CVPR 2026] ODTSR: This repo is the official implementation of "One-Step Diffusion Transformer for Controllable Real-World Image Super-Res…☆104Feb 21, 2026Updated 2 weeks ago
- Unlocking Iterative Reasoning for Any Image Editor☆98Jan 18, 2026Updated last month
- Official PyTorch implementation of "A Unified Approach for Text- and Image-guided 4D Scene Generation", [CVPR 2024]☆93Apr 23, 2024Updated last year
- Benchmark dataset and code of MSRVTT-Personalization☆52Nov 10, 2025Updated 3 months ago
- Finetuning and inference tools for the CogView4 and CogVideoX model series.☆118May 14, 2025Updated 9 months ago
- [ISBI 2024] Official PyTorch implementation of Towards Cross-Domain Single Blood Cell Image Classification via Large-Scale LoRA-based Seg…☆11Aug 12, 2024Updated last year
- ☆43Dec 1, 2025Updated 3 months ago
- Project Pages for NJU-3DV's Research Work☆10Jan 26, 2026Updated last month
- [ICLR 2026] SpatialLadder: Progressive Training for Spatial Reasoning in Vision-Language Models☆75Jan 29, 2026Updated last month
- ☆20Sep 5, 2025Updated 6 months ago
- We Need No Pixels: Video Manipulation Detection Using Stream Descriptors☆10Oct 4, 2019Updated 6 years ago
- STDFormer: Spatio Temporal Disentanglement Learning for 3D Human Mesh Recovery from Monocular Videos with Transformer☆45Mar 14, 2024Updated last year
- This is the official repository for the paper "Modeling Human Gaze Behavior with Diffusion Models for Unified Scanpath Prediction". ICCV …☆24Dec 4, 2025Updated 3 months ago
- [CVPR 2026] Video-as-Answer: Predict and Generate Next Video Event with Joint-GRPO☆96Feb 28, 2026Updated last week
- ☆109Nov 27, 2024Updated last year
- Official code for paper: F3D-Gaus: Feed-forward 3D-aware Generation on ImageNet with Cycle-Aggregative Gaussian Splatting☆50Mar 11, 2025Updated 11 months ago
- The official implementation of StereoPilot☆102Dec 19, 2025Updated 2 months ago
- Official implementation of MoA-VR: A Mixture-of-Agents System Towards All-in-One Video Restoration☆22Nov 20, 2025Updated 3 months ago