xiangyu-mm / EasyGen
The official code for the paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs"
☆73 · Nov 21, 2024 · Updated last year
Alternatives and similar repositories for EasyGen
Users who are interested in EasyGen are comparing it to the libraries listed below.
- Official repo for StableLLAVA ☆95 · Dec 22, 2023 · Updated 2 years ago
- Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models ☆45 · Jun 14, 2024 · Updated last year
- Unofficial fork of the Code used in the paper "Discriminative Unsupervised Feature Learning with Convolutional Neural Networks", NIPS 201… ☆16 · Mar 28, 2017 · Updated 8 years ago
- [IJCV 2025] Paragraph-to-Image Generation with Information-Enriched Diffusion Model ☆106 · Mar 24, 2025 · Updated 10 months ago
- 🐟 Code and models for the NeurIPS 2023 paper "Generating Images with Multimodal Language Models". ☆471 · Jan 19, 2024 · Updated 2 years ago
- ☆58 · Aug 7, 2023 · Updated 2 years ago
- ☆25 · Jun 22, 2023 · Updated 2 years ago
- Official implementation of SEED-LLaMA (ICLR 2024). ☆639 · Sep 21, 2024 · Updated last year
- Code Release for the paper "Make-A-Story: Visual Memory Conditioned Consistent Story Generation" in CVPR 2023 ☆43 · Jun 27, 2023 · Updated 2 years ago
- Ablating Concepts in Text-to-Image Diffusion Models (ICCV 2023) ☆167 · Dec 21, 2024 · Updated last year
- [WACV 2024] Training-Free Layout Control with Cross-Attention Guidance ☆266 · Mar 18, 2024 · Updated last year
- SVL-Adapter: Self-Supervised Adapter for Vision-Language Pretrained Models ☆21 · Jan 11, 2024 · Updated 2 years ago
- [ICCV 2023] ViLLA: Fine-grained vision-language representation learning from real-world data ☆46 · Oct 15, 2023 · Updated 2 years ago
- ☆24 · Oct 9, 2023 · Updated 2 years ago
- [Findings of ACL-2023] This is the official implementation of On the Difference of BERT-style and CLIP-style Text Encoders. ☆14 · Jun 7, 2023 · Updated 2 years ago
- Code and resources for EMNLP 2022 paper on 'Robustness of Fusion-based Multimodal Classifiers to Cross-Modal Content Dilutions' ☆10 · Mar 11, 2024 · Updated last year
- My implementation of the model KosmosG from "KOSMOS-G: Generating Images in Context with Multimodal Large Language Models" ☆14 · Nov 11, 2024 · Updated last year
- Official PyTorch implementation of the paper "In-Context Learning Unlocked for Diffusion Models" ☆413 · Mar 25, 2024 · Updated last year
- Official Implementation of the paper: A Complete Recipe for Diffusion Generative Models ☆31 · Nov 1, 2024 · Updated last year
- Official repository for CoMM Dataset ☆49 · Dec 31, 2024 · Updated last year
- [ECCV 2024] Official repo for UDiffText: A Unified Framework for High-quality Text Synthesis in Arbitrary Images via Character-aware Diff… ☆234 · Feb 14, 2025 · Updated 11 months ago
- Code for "DreamEdit: Subject-driven Image Editing" (TMLR2023) ☆109 · Jan 23, 2024 · Updated 2 years ago
- Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models, 2023 ☆136 · Oct 22, 2025 · Updated 3 months ago
- PyTorch implementation of InstructDiffusion, a unifying and generic framework for aligning computer vision tasks with human instructions. ☆441 · May 14, 2024 · Updated last year
- PyTorch code for the CVPR'23 paper: "ConStruct-VL: Data-Free Continual Structured VL Concepts Learning" ☆14 · Feb 5, 2024 · Updated 2 years ago
- A large scale dataset for Video Captioning in Italian ☆13 · May 16, 2023 · Updated 2 years ago
- A collection of AI-generated images papers and corresponding source code/demo program, including text-to-image, image translation (e.g., … ☆13 · Nov 21, 2023 · Updated 2 years ago
- ☆13 · Mar 11, 2018 · Updated 7 years ago
- TIER: Text-Image Encoder-based Regression for AIGC Image Quality Assessment ☆10 · Mar 1, 2025 · Updated 11 months ago
- Collections of papers and code for employing MLLM for quality assessment tasks. ☆13 · Apr 18, 2024 · Updated last year
- Code and data for the ACM CIKM 2024 paper "Adversarial Text Rewriting for Text-aware Recommender Systems" ☆12 · Aug 1, 2024 · Updated last year
- ☆11 · Sep 15, 2023 · Updated 2 years ago
- [ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models ☆155 · Apr 30, 2024 · Updated last year
- Training code for CLIP-FlanT5 ☆30 · Jul 29, 2024 · Updated last year
- Code and data setup for the paper "Are Diffusion Models Vision-and-language Reasoners?" ☆33 · Mar 15, 2024 · Updated last year
- ORES: Open-vocabulary Responsible Visual Synthesis ☆14 · Dec 12, 2023 · Updated 2 years ago
- Karras et al. (2022) diffusion models for PyTorch ☆17 · Oct 5, 2023 · Updated 2 years ago
- SIGIR paper Conversational Fashion Image Retrieval via Multiturn Natural Language Feedback ☆14 · Oct 17, 2022 · Updated 3 years ago
- [ECCV2022] A PyTorch implementation of the paper "Spatial and Visual Perspective-Taking via View Rotation and Relation Reasoning for Embo… ☆13 · Mar 20, 2023 · Updated 2 years ago