Official implementation of Add-SD: Rational Generation without Manual Reference.
☆28 · Aug 19, 2024 · Updated last year
Alternatives and similar repositories for Add-SD
Users who are interested in Add-SD are comparing it to the libraries listed below.
- Neural network for creating distortion while keeping embeddings as close as possible ☆20 · Feb 6, 2024 · Updated 2 years ago
- Code release for AccDiffusion (ECCV 2024) ☆93 · Nov 19, 2024 · Updated last year
- Official PyTorch implementation of The Linear Attention Resurrection in Vision Transformer ☆16 · Sep 7, 2024 · Updated last year
- Analogist: Out-of-the-box Visual In-Context Learning with Image Diffusion Model (SIGGRAPH 2024) ☆38 · Sep 10, 2024 · Updated last year
- [NeurIPS 2024] Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis ☆86 · Feb 3, 2025 · Updated last year
- "Visual Prompt Selection for In-Context Learning Segmentation Framework" ☆15 · Dec 13, 2024 · Updated last year
- Video Diffusion State Space Models ☆19 · Mar 27, 2024 · Updated last year
- SAM4SS: Tailoring SAM and SAM2 for Semantic Segmentation ☆11 · Jul 31, 2024 · Updated last year
- [ICCV 2025] Official implementation of "InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models" ☆53 · Feb 10, 2025 · Updated last year
- TPDiff: Temporal Pyramid Video Diffusion Model ☆25 · Mar 13, 2025 · Updated last year
- Code for "Neural Rendering in a Room: Amodal 3D Understanding and Free-Viewpoint Rendering for the Closed Scene Composed of Pre-Captured … ☆17 · Oct 15, 2023 · Updated 2 years ago
- EVA: Zero-shot Accurate Attributes and Multi-Object Video Editing ☆30 · Mar 29, 2024 · Updated last year
- [NeurIPS 2024] EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models. ☆52 · Oct 14, 2024 · Updated last year
- (CVPR 2025) Scaling Down Text Encoders of Text-to-Image Diffusion Models ☆52 · Sep 10, 2025 · Updated 6 months ago
- [NeurIPS 2024] OneRef: Unified One-tower Expression Grounding and Segmentation with Mask Referring Modeling. ☆31 · Nov 13, 2025 · Updated 4 months ago
- My implementation of the model KosmosG from "KOSMOS-G: Generating Images in Context with Multimodal Large Language Models" ☆14 · Nov 11, 2024 · Updated last year
- An innovative method designed to augment the capabilities of existing video diffusion models ☆22 · May 10, 2024 · Updated last year
- [ECCV-24] This is the official implementation of the paper "SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation". ☆27 · Oct 13, 2024 · Updated last year
- ☆15 · Jul 13, 2023 · Updated 2 years ago
- [ICCV 2025] MPG-SAM 2: Adapting SAM 2 with Mask Priors and Global Context for Referring Video Object Segmentation ☆21 · Sep 5, 2025 · Updated 6 months ago
- [ECCV2024] FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance ☆17 · Sep 11, 2024 · Updated last year
- ConceptsDreambooth ☆19 · Nov 30, 2022 · Updated 3 years ago
- PyTorch implementation of InstructAny2Pix: Flexible Visual Editing via Multimodal Instruction Following ☆32 · Jan 24, 2025 · Updated last year
- ☆42 · May 15, 2025 · Updated 10 months ago
- Official implementation of UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models via Training-Free Unified … ☆73 · Nov 29, 2024 · Updated last year
- ☆28 · Jul 22, 2024 · Updated last year
- Video Diffusion Transformers are In-Context Learners ☆35 · Jan 6, 2025 · Updated last year
- Inference code for DWCode ☆35 · Oct 24, 2023 · Updated 2 years ago
- ☆18 · Oct 23, 2024 · Updated last year
- [CVPR'25] MonoSplat: Generalizable 3D Gaussian Splatting from Monocular Depth Foundation Models ☆62 · May 27, 2025 · Updated 9 months ago
- [ICCV 2025 Highlight] Panorama Generation as a Next-Token Prediction Task. ☆48 · Oct 29, 2025 · Updated 4 months ago
- ☆16 · Apr 4, 2025 · Updated 11 months ago
- Code for "The Unreasonable Effectiveness of Linear Prediction as a Perceptual Metric" ☆23 · Jan 26, 2024 · Updated 2 years ago
- CutDiffusion: A Simple, Fast, Cheap, and Strong Diffusion Extrapolation Method ☆27 · Oct 9, 2025 · Updated 5 months ago
- [ICLR 2024] Scaling for Training Time and Post-hoc Out-of-distribution Detection Enhancement. ☆15 · Mar 12, 2024 · Updated 2 years ago
- ☆27 · Mar 3, 2025 · Updated last year
- DALL-E for Detection: Language-driven Compositional Image Synthesis for Object Detection ☆21 · Oct 5, 2023 · Updated 2 years ago
- This repo holds the official code and data for "Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentati… ☆72 · Jun 3, 2024 · Updated last year
- Evaluating language models on word puzzle games ☆10 · Oct 25, 2024 · Updated last year