xiangyu-mm/EasyGen

The official code for paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs"

☆73

Alternatives and similar repositories for EasyGen

Users who are interested in EasyGen are comparing it to the repositories listed below.

Jyonn / RecBench
Benchmarking Recommendation Abilities for Large Language Models
☆36 · Mar 10, 2026 · Updated 4 months ago
icoz69 / StableLLAVA
Official repo for StableLLAVA
☆94 · Dec 22, 2023 · Updated 2 years ago
LUOyk1999 / NodeID
[ICLR 2025] Implementation of "Node Identifiers: Compact, Discrete Representations for Efficient Graph Learning"
☆17 · Jun 6, 2025 · Updated last year
HYPJUDY / Sparkles
Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models
☆46 · Jun 14, 2024 · Updated 2 years ago
sejoonoh / ATR
Code and data for the ACM CIKM 2024 paper "Adversarial Text Rewriting for Text-aware Recommender Systems"
☆12 · Aug 1, 2024 · Updated last year
kohjingyu / gill
🐟 Code and models for the NeurIPS 2023 paper "Generating Images with Multimodal Language Models".
☆470 · Jan 19, 2024 · Updated 2 years ago
mightyzau / RegionBLIP
☆59 · Aug 7, 2023 · Updated 2 years ago
AILab-CVC / SEED
Official implementation of SEED-LLaMA (ICLR 2024).
☆642 · Sep 21, 2024 · Updated last year
liujianzhi / EchoReel
An innovative method designed to augment the capabilities of existing video diffusion models
☆22 · May 10, 2024 · Updated 2 years ago
YujieLu10 / TIP
Multimodal-Procedural-Planning
☆92 · Jun 1, 2023 · Updated 3 years ago
weijiawu / ParaDiffusion
[IJCV 2025] Paragraph-to-Image Generation with Information-Enriched Diffusion Model
☆107 · Mar 24, 2025 · Updated last year
koutilya-pnvr / LD-ZNet
☆33 · Oct 27, 2025 · Updated 9 months ago
JiwanChung / vlis
☆24 · Oct 9, 2023 · Updated 2 years ago
kyegomez / KosmosG
My implementation of the model KosmosG from "KOSMOS-G: Generating Images in Context with Multimodal Large Language Models"
☆13 · Nov 11, 2024 · Updated last year
ubc-vision / Make-A-Story
Code Release for the paper "Make-A-Story: Visual Memory Conditioned Consistent Story Generation" in CVPR 2023
☆43 · Jun 27, 2023 · Updated 3 years ago
nupurkmr9 / concept-ablation
Ablating Concepts in Text-to-Image Diffusion Models (ICCV 2023)
☆171 · May 24, 2026 · Updated 2 months ago
kodenii / ORES
ORES: Open-vocabulary Responsible Visual Synthesis
☆14 · Dec 12, 2023 · Updated 2 years ago
kyegomez / SelfExtend
Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" from Pytorch and Zeta
☆13 · Nov 11, 2024 · Updated last year
thu-ml / unidiffuser
Code and models for the paper "One Transformer Fits All Distributions in Multi-Modal Diffusion"
☆1,485 · May 31, 2023 · Updated 3 years ago
jamessealesmith / ConStruct-VL
PyTorch code for the CVPR'23 paper: "ConStruct-VL: Data-Free Continual Structured VL Concepts Learning"
☆13 · Feb 5, 2024 · Updated 2 years ago
ZYM-PKU / UDiffText
[ECCV 2024] Official repo for UDiffText: A Unified Framework for High-quality Text Synthesis in Arbitrary Images via Character-aware Diff…
☆236 · Feb 14, 2025 · Updated last year
yuezih / SMILE
Learning Descriptive Image Captioning via Semipermeable Maximum Likelihood Estimation (NeurIPS 2023)
☆23 · Oct 1, 2023 · Updated 2 years ago
baaivision / Emu
Emu Series: Generative Multimodal Models from BAAI
☆1,776 · Jan 12, 2026 · Updated 6 months ago
tsb0601 / MultiMon
☆25 · Jun 22, 2023 · Updated 3 years ago
JIA-Lab-research / Prompt-Highlighter
[CVPR 2024] Prompt Highlighter: Interactive Control for Multi-Modal LLMs
☆159 · Jul 23, 2024 · Updated 2 years ago
jy0205 / LaVIT
LaVIT: Empower the Large Language Model to Understand and Generate Visual Content
☆603 · Oct 6, 2024 · Updated last year
yfyuan01 / MultiturnFashionRetrieval
SIGIR paper Conversational Fashion Image Retrieval via Multiturn Natural Language Feedback
☆14 · Oct 17, 2022 · Updated 3 years ago
TencentARC / TaCA
Official code for the paper, "TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter".
☆16 · Jun 20, 2023 · Updated 3 years ago
YiyangZhou / LURE
[ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models
☆158 · Apr 30, 2024 · Updated 2 years ago
LUOyk1999 / tunedGNN
[NeurIPS 2024] Implementation of "Classic GNNs are Strong Baselines: Reassessing GNNs for Node Classification"
☆187 · Jun 5, 2025 · Updated last year
omipan / svl_adapter
SVL-Adapter: Self-Supervised Adapter for Vision-Language Pretrained Models
☆21 · Jan 11, 2024 · Updated 2 years ago
Zhendong-Wang / Prompt-Diffusion
Official PyTorch implementation of the paper "In-Context Learning Unlocked for Diffusion Models"
☆414 · Mar 25, 2024 · Updated 2 years ago
ash-neupane / multi-token-pred
Train toy models using multi-token prediction objective
☆14 · Apr 18, 2026 · Updated 3 months ago
AtsuMiyai / rethinking_rotation
[WACV2023] This is the official PyTorch implementation of our paper "[Rethinking Rotation in Self-Supervised Contrastive Learning: Adapt…
☆12 · Feb 24, 2023 · Updated 3 years ago
xbmxb / CoCo-Agent
☆35 · Jun 20, 2024 · Updated 2 years ago
univ-esuty / ambifusion
Official repository for the paper "ambigram generation by a diffusion model".
☆17 · Aug 9, 2023 · Updated 2 years ago
HKUST-LongGroup / CoMM
[CVPR 2025 Highlight] Official repository for CoMM Dataset
☆56 · Dec 31, 2024 · Updated last year
McGill-NLP / diffusion-itm
Code and data setup for the paper "Are Diffusion Models Vision-and-language Reasoners?"
☆33 · Mar 15, 2024 · Updated 2 years ago
claws-lab / multimodal-robustness
Code and resources for EMNLP 2022 paper on 'Robustness of Fusion-based Multimodal Classifiers to Cross-Modal Content Dilutions'
☆10 · Mar 11, 2024 · Updated 2 years ago