Vchitect / Evaluation-Agent
Evaluate Image/Video Generation like Humans - Fast, Explainable, Flexible
☆61Updated last week
Alternatives and similar repositories for Evaluation-Agent:
Users who are interested in Evaluation-Agent are comparing it to the repositories listed below
- Code release for our NeurIPS 2024 Spotlight paper "GenArtist: Multimodal LLM as an Agent for Unified Image Generation and Editing"☆119Updated 6 months ago
- Awesome diffusion Video-to-Video (V2V). A collection of papers on diffusion model-based video editing, aka. video-to-video (V2V) translati…☆213Updated 3 months ago
- Official PyTorch implementation of TokenSet.☆114Updated last month
- [IJCV 2025] Paragraph-to-Image Generation with Information-Enriched Diffusion Model☆103Updated last month
- [NeurIPS 2024] VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models☆143Updated 7 months ago
- [NeurIPS 2024 Spotlight] The official implementation of the research paper "MotionBooth: Motion-Aware Customized Text-to-Video Generation"☆130Updated 6 months ago
- Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment☆50Updated 3 months ago
- Official repository of "GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing"☆230Updated last month
- EditWorld: Simulating World Dynamics for Instruction-Following Image Editing☆129Updated 10 months ago
- ☆47Updated 4 months ago
- ☆65Updated last month
- [CVPR2025 Highlight] PAR: Parallelized Autoregressive Visual Generation. https://yuqingwang1029.github.io/PAR-project☆151Updated last month
- INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model☆42Updated 8 months ago
- UniDisc: A discrete diffusion model for joint multimodal generation, enabling controllable and efficient text-image synthesis, editing, a…☆90Updated 3 weeks ago
- This is the official implementation of SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation.☆102Updated 5 months ago
- Empowering Unified MLLM with Multi-granular Visual Generation☆119Updated 3 months ago
- Inference-time scaling of diffusion-based image and video generation models.☆138Updated last month
- [NeurIPS 2024] Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis☆65Updated 2 months ago
- Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think!☆105Updated last month
- PyTorch implementation of the paper "SimpleAR: Pushing the Frontier of Autoregressive Visual Generation"☆240Updated this week
- [ICLR 2025] HQ-Edit: A High-Quality and High-Coverage Dataset for General Image Editing☆97Updated last year
- The official implementation of the paper "DreamMix: Decoupling Object Attributes for Enhanced Editability in Customized Image Inpainting"☆119Updated 3 months ago
- Code for the SIGGRAPH Asia 2024 paper "Anim-Director: A Large Multimodal Model Powered Agent for Controllable Animation Video Generation"☆53Updated 2 months ago
- ☆193Updated 2 months ago
- Official repo for "GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation"☆134Updated this week
- I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models☆146Updated 2 months ago
- [ICLR 2025] VideoGrain: This repo is the official implementation of "VideoGrain: Modulating Space-Time Attention for Multi-Grained Video …☆118Updated last month
- [AAAI 2025] Official PyTorch implementation of "VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion …☆158Updated last year
- Official Implementation of Video-T1: Test-Time Scaling for Video Generation☆252Updated 3 weeks ago
- ☆126Updated 3 months ago