ali-vilab/IDEA-Bench

Official repository of IDEA-Bench

☆41

Alternatives and similar repositories for IDEA-Bench

Users who are interested in IDEA-Bench are comparing it to the repositories listed below.

ali-vilab / ChatDiT
☆53 · Dec 20, 2024 · Updated last year
WuTao-CS / VideoMaker
This is the official implementation of VideoMaker: Zero-shot Customized Video Generation with the Inherent Force of Video Diffusion Mode…
☆17 · Mar 4, 2025 · Updated last year
LesterGong / MMRB
The official repository of the paper "Evaluating MLLMs with Multimodal Multi-image Reasoning Benchmark"
☆19 · Jun 20, 2025 · Updated last year
redhottensors / ComfyUI-Prediction
Fully customizable Classifier Free Guidance for ComfyUI
☆15 · Jul 14, 2024 · Updated 2 years ago
cagliostrolab / cagliostro-webui
☆18 · Jan 10, 2024 · Updated 2 years ago
River-Zhang / Awesome-FLUX-DiT
A collection of diffusion models based on FLUX/DiT for image/video generation, editing, reconstruction, inpainting, etc.
☆86 · Jun 20, 2025 · Updated last year
raven38 / rf_inversion
☆49 · Dec 25, 2024 · Updated last year
chenllliang / DreamEngine
Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think!
☆123 · Mar 4, 2025 · Updated last year
PKU-YuanGroup / UniSandBox
Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward
☆60 · Nov 27, 2025 · Updated 7 months ago
Extraltodeus / Conditioning-token-experiments-for-ComfyUI
A few experimental nodes about the conditioning and the next closest tokens. For ComfyUI.
☆19 · Mar 10, 2024 · Updated 2 years ago
jiahao-shao1 / openclaw-setup
☆16 · Mar 8, 2026 · Updated 4 months ago
AconexOfficial / ComfyUI_GOAT_Nodes
Nodes to level up your workflow performance and streamline specific functions.
☆11 · Aug 19, 2025 · Updated 11 months ago
songyiren98 / CLIPFont
Implementation of paper: CLIPFont: Texture Guided Vector WordArt Generation
☆18 · Oct 8, 2022 · Updated 3 years ago
WhiteDOU / MetaGait
Source code for ECCV 2022 paper: "MetaGait: Learning to Learn an Omni Sample Adaptive Representation for Gait Recognition"
☆13 · Jul 19, 2022 · Updated 4 years ago
taolinzhang / 3DVLP
[AAAI2024] An official PyTorch implementation of the paper: Vision-Language Pre-training with Object Contrastive Learning for 3D Scene Underst…
☆13 · Dec 8, 2024 · Updated last year
EchoPluto / MagicID
☆35 · Mar 18, 2025 · Updated last year
hanbyel0105 / Diff-HMR
Official PyTorch Implementation of "Generative Approach for Probabilistic Human Mesh Recovery using Diffusion Models", ICCV 2023 CV4Metav…
☆27 · Oct 3, 2023 · Updated 2 years ago
antonioo-c / Diptych-Prompting
Unofficial implementation of 'Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator'
☆10 · Dec 10, 2024 · Updated last year
IVRL / signal-leak-bias
Official implementation of "Exploiting the Signal-Leak Bias in Diffusion Models" (WACV 2024)
☆20 · Jul 10, 2026 · Updated last week
xuboshen / EgoNCEpp
[ICLR'25] Do Egocentric Video-Language Models Truly Understand Hand-Object Interactions?
☆13 · Apr 11, 2025 · Updated last year
FireRedTeam / LayerDiffuse-Flux
☆246 · May 9, 2025 · Updated last year
Vchitect / Uni-MMMU
[ACL2026 oral] Uni-MMMU: A Massive Multi-discipline Multimodal Unified Benchmark
☆25 · Apr 13, 2026 · Updated 3 months ago
KimDonghwan06 / PARTE_RELEASE
[ICCV 2025] This repo is an official PyTorch implementation of PARTE: Part-Guided Texturing for 3D Human Reconstruction from a Single Ima…
☆17 · Sep 19, 2025 · Updated 10 months ago
animate-your-word / demo
☆12 · Sep 28, 2025 · Updated 9 months ago
yuvraj108c / Codeformer-Tensorrt
Codeformer Tensorrt Face Restoration
☆13 · Apr 15, 2024 · Updated 2 years ago
ZhenglinZhou / Zero-1-to-A
[CVPR 2025] Zero-1-to-A: Zero-Shot One Image to Animatable Head Avatars Using Video Diffusion
☆43 · Mar 21, 2025 · Updated last year
YingqingHe / Awesome-LLMs-meet-Multimodal-Generation
🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).
☆549 · Apr 4, 2025 · Updated last year
zhaoshitian / LeX-Art
Official Implementation of "LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis"
☆85 · Aug 25, 2025 · Updated 10 months ago
wangqiang9 / Awesome-Controllable-Video-Diffusion
Awesome Controllable Video Generation with Diffusion Models
☆58 · Jul 22, 2025 · Updated last year
ZhenglinZhou / DreamDPO
[ICML 2025] DreamDPO: Aligning Text-to-3D Generation with Human Preferences via Direct Preference Optimization
☆22 · May 24, 2025 · Updated last year
jschoormans / cog-comfyui-interior
ComfyUI workflow for interior remodelling on Replicate
☆12 · Sep 13, 2024 · Updated last year
YaoXingbo / MagicCity
ICCV 2025
☆16 · Mar 26, 2026 · Updated 3 months ago
jingyu198 / Hyper3D
☆17 · Jun 29, 2026 · Updated 3 weeks ago
PKU-YuanGroup / ImgEdit
[NeurIPS 2025 D&B🔥] ImgEdit: A Unified Image Editing Dataset and Benchmark
☆327 · Nov 5, 2025 · Updated 8 months ago
Xiaojiu-z / SSR_Encoder
PyTorch Implementation of "SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation" (CVPR 2024)
☆128 · Jul 22, 2024 · Updated 2 years ago
GeunminHwang / DiffuseSlide
Official implementation of DiffuseSlide
☆16 · Jun 30, 2025 · Updated last year
blepping / comfyui_overly_complicated_sampling
Wildly unsound and experimental sampling for ComfyUI
☆30 · Jul 11, 2026 · Updated last week
longrongyang / STGC
Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model
☆13 · Feb 11, 2025 · Updated last year
linzhiqiu / CLIP-FlanT5
Training code for CLIP-FlanT5
☆31 · Jul 29, 2024 · Updated last year