[ICCV 2025] Enhancing spatial understanding in text-to-image diffusion models
★90 · Sep 11, 2025 · Updated 5 months ago
Alternatives and similar repositories for CoMPaSS
Users that are interested in CoMPaSS are comparing it to the libraries listed below
- Resilient multi-LLM orchestration with built-in failure handling, rate limits, retries, and circuit breaker. ★29 · Updated this week
- [CVPR 2026] Dataset and Benchmark code for EgoEdit ★106 · Feb 21, 2026 · Updated last week
- Pose Extraction & Rendering for SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representat… ★180 · Dec 28, 2025 · Updated 2 months ago
- ★86 · Feb 4, 2026 · Updated 3 weeks ago
- Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Models (ICLR 2026) ★42 · Feb 18, 2026 · Updated last week
- End2End Virtual Try-on with Visual Reference, CVPR2026 ★58 · Nov 19, 2025 · Updated 3 months ago
- Official repository for the paper "MVP4D: Multi-View Portrait Video Diffusion for Animatable 4D Avatars" ★41 · Nov 20, 2025 · Updated 3 months ago
- ★10 · Jan 23, 2025 · Updated last year
- [3DV 2026 Oral] VoxHammer: Training-Free Precise and Coherent 3D Editing in Native 3D Space ★211 · Nov 25, 2025 · Updated 3 months ago
- Pusa: Thousands Timesteps Video Diffusion Model ★671 · Feb 13, 2026 · Updated 2 weeks ago
- [ICCV 2025] Code & Data for: SuperEdit - Rectifying and Facilitating Supervision for Instruction-Based Image Editing ★164 · Jun 26, 2025 · Updated 8 months ago
- [ICCV 2025] Official implementation of the paper "DreamCube: 3D Panorama Generation via Multi-plane Synchronization". ★172 · Feb 4, 2026 · Updated 3 weeks ago
- DanceTogether! Identity-Preserving Multi-Person Interactive Video Generation ★39 · Aug 3, 2025 · Updated 6 months ago
- [CVPR2026] Stand-In is a lightweight, plug-and-play framework for identity-preserving video generation. ★731 · Feb 21, 2026 · Updated last week
- Official PyTorch implementation of the paper "FlowDirector: Training-Free Flow Steering for Precise Text-to-Video Editing" ★78 · Dec 12, 2025 · Updated 2 months ago
- Official inference code and LongText-Bench benchmark for our paper X-Omni (https://arxiv.org/pdf/2507.22058). ★420 · Aug 26, 2025 · Updated 6 months ago
- Feed-forward model for predicting 3D physics with 3DGS + NeRF ★269 · Sep 1, 2025 · Updated 6 months ago
- Official Implementation of DRA-Ctrl (Dimension-Reduction Attack! Video Generative Models are Experts on Controllable Image Synthesis) ★118 · Aug 15, 2025 · Updated 6 months ago
- [CVPR 2026] 🔥🔥 Official Repo of USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning ★1,207 · Sep 12, 2025 · Updated 5 months ago
- Scaling Zero-Shot Reference-to-Video Generation ★62 · Dec 11, 2025 · Updated 2 months ago
- [ICCV 2025] Edicho: Consistent Image Editing in the Wild ★124 · Oct 22, 2025 · Updated 4 months ago
- [SIGGRAPH'25] Official repository of LayerFlow: A Unified Model for Layer-aware Video Generation ★86 · Aug 18, 2025 · Updated 6 months ago
- Following the advance of AIGC ★23 · Oct 28, 2025 · Updated 4 months ago
- [ICLR'25] Official repository of paper: Ranking-aware adapter for text-driven image ordering with CLIP ★16 · Apr 17, 2025 · Updated 10 months ago
- [ICLR 2026] Streamlining Cartoon Production with Generative Post-Keyframing ★546 · Aug 20, 2025 · Updated 6 months ago
- [arXiv] On-device Sora: Enabling Diffusion-Based Text-to-Video Generation for Mobile Devices ★133 · Nov 27, 2025 · Updated 3 months ago
- ★184 · Jul 31, 2025 · Updated 7 months ago
- Official implementation for "Story2Board: A Training-Free Approach for Expressive Storyboard Generation" ★233 · Aug 22, 2025 · Updated 6 months ago
- [CVPR 2026] OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer ★219 · Feb 21, 2026 · Updated last week
- ★23 · Jul 20, 2025 · Updated 7 months ago
- Code for our EMNLP 2025 Main paper: "FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games" ★24 · Dec 14, 2025 · Updated 2 months ago
- [inactive] MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation ★13 · Apr 22, 2024 · Updated last year
- Enhance-A-Video: Better Generated Video for Free ★594 · Mar 17, 2025 · Updated 11 months ago
- [ICCV2025] Official implementation of "IFAdapter: Instance Feature Control for Grounded Text-to-Image Generation". ★61 · Jun 27, 2025 · Updated 8 months ago
- Implementation code of the paper MIGE: A Unified Framework for Multimodal Instruction-Based Image Generation and Editing ★72 · Jul 13, 2025 · Updated 7 months ago
- MegaRAG: Multimodal Graph-based RAG ★36 · Sep 16, 2025 · Updated 5 months ago
- SynCD: Generating Multi-Image Synthetic Data for Text-to-Image Customization (ICCV 2025) ★153 · Oct 16, 2025 · Updated 4 months ago
- ★25 · Jun 18, 2025 · Updated 8 months ago
- lite attention implemented over flash attention 3 ★45 · Updated this week