YasminZhang / EBAMA
[ECCV 2024] Official repository for the paper "Object-Conditioned Energy-Based Attention Map Alignment in Text-to-Image Diffusion Models"
☆9 · Updated last month
Related projects
Alternatives and complementary repositories for EBAMA
- 🔥ImageFolder: Autoregressive Image Generation with Folded Tokens ☆53 · Updated 3 weeks ago
- 🔥stable, simple, state-of-the-art VQVAE toolkit & cookbook ☆40 · Updated 4 months ago
- ☆30 · Updated 2 weeks ago
- [NeurIPS 2024] The official implementation of the research paper "FreeLong : Training-Free Long Video Generation with SpectralBlend Temporal Atten… ☆24 · Updated this week
- This is a repo to track the latest autoregressive visual generation papers. ☆43 · Updated last month
- Codebase for the paper "Elucidating the design space of language models for image generation" ☆29 · Updated this week
- A PyTorch implementation of the paper "MMoT: Mixture-of-Modality-Tokens Transformer for Composed Multimodal Conditional Image Synthesis". ☆12 · Updated last year
- Official repo for "VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation" [EMNLP2024] ☆48 · Updated last week
- (NeurIPS 2024 Spotlight) TOPA: Extend Large Language Models for Video Understanding via Text-Only Pre-Alignment ☆18 · Updated last month
- [ECCV 2024 Oral] ConceptExpress: Harnessing Diffusion Models for Single-image Unsupervised Concept Extraction ☆44 · Updated 3 months ago
- The collection of awesome papers on alignment of diffusion models. ☆45 · Updated last week
- The paper collections for the autoregressive models in vision. ☆101 · Updated this week
- ☆31 · Updated last month
- Implementation of Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding ☆22 · Updated last week
- Benchmarking and Analyzing Generative Data for Visual Recognition ☆26 · Updated last year
- Official code for paper: Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language ☆21 · Updated 4 months ago
- ☆21 · Updated 3 months ago
- Official implementation of the CVPR'24 paper [Adaptive Slot Attention: Object Discovery with Dynamic Slot Number] ☆22 · Updated last week
- The official implementation for "MonoFormer: One Transformer for Both Diffusion and Autoregression" ☆76 · Updated last month
- [ICLR 2024] Official pytorch implementation of "Denoising Task Routing for Diffusion Models" ☆19 · Updated 8 months ago
- [ECCV2024] Learning Video Context as Interleaved Multimodal Sequences ☆29 · Updated last month
- ☆22 · Updated 6 months ago
- ICCV2023-Diffusion-Papers ☆110 · Updated last year
- Official GitHub repository for the Text-Guided Video Editing (TGVE) competition of LOVEU Workshop @ CVPR'23. ☆72 · Updated last year
- [CVPR 2024] The repository contains the official implementation of "Open-Vocabulary Segmentation with Semantic-Assisted Calibration" ☆59 · Updated last month
- T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation ☆46 · Updated 2 months ago
- Replication in Visual Diffusion Models: A Survey and Outlook ☆22 · Updated 3 months ago
- Empowering Unified MLLM with Multi-granular Visual Generation ☆104 · Updated 3 weeks ago
- ☆57 · Updated last year
- Official Implementation of ICLR'24: Kosmos-G: Generating Images in Context with Multimodal Large Language Models ☆50 · Updated 5 months ago