Vchitect / Evaluation-Agent
Evaluate Image/Video Generation like Humans - Fast, Explainable, Flexible
☆57Updated 3 months ago
Alternatives and similar repositories for Evaluation-Agent:
Users who are interested in Evaluation-Agent are comparing it to the repositories listed below.
- Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment☆43Updated 2 months ago
- [CVPR2025] PAR: Parallelized Autoregressive Visual Generation. https://epiphqny.github.io/PAR-project/☆126Updated 2 months ago
- [NeurIPS 2024 Spotlight] The official implementation of the research paper "MotionBooth: Motion-Aware Customized Text-to-Video Generation"☆130Updated 5 months ago
- EditWorld: Simulating World Dynamics for Instruction-Following Image Editing☆127Updated 8 months ago
- [ICLR 2024] LLM-grounded Video Diffusion Models (LVD): official implementation for the LVD paper☆146Updated 10 months ago
- Code release for our NeurIPS 2024 Spotlight paper "GenArtist: Multimodal LLM as an Agent for Unified Image Generation and Editing"☆111Updated 4 months ago
- Code for the SIGGRAPH Asia 2024 paper "Anim-Director: A Large Multimodal Model Powered Agent for Controllable Animation Video Generation"☆49Updated last month
- Official code for 'Paragraph-to-Image Generation with Information-Enriched Diffusion Model'☆102Updated 3 months ago
- Video Generation, Physical Commonsense, Semantic Adherence, VideoCon-Physics☆87Updated last week
- Awesome diffusion Video-to-Video (V2V). A collection of papers on diffusion model-based video editing, a.k.a. video-to-video (V2V) translati…☆206Updated 2 months ago
- [AAAI 2025] Official PyTorch implementation of "VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion …☆158Updated 11 months ago
- [ICLR 2025] HQ-Edit: A High-Quality and High-Coverage Dataset for General Image Editing☆92Updated 11 months ago
- [ICLR2025]☆138Updated last month
- This is the official implementation of SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation.☆99Updated 3 months ago
- OLA-VLM: Elevating Visual Perception in Multimodal LLMs with Auxiliary Embedding Distillation, arXiv 2024☆57Updated 3 weeks ago
- 🏞️ Official implementation of "Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition"☆104Updated 10 months ago
- Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope…☆247Updated last week
- [NeurIPS 2024] The official implementation of the research paper "FreeLong: Training-Free Long Video Generation with SpectralBlend Temporal Atten…☆39Updated last month
- [CVPR 2025🔥] Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model☆130Updated 3 weeks ago
- INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model☆42Updated 7 months ago
- Empowering Unified MLLM with Multi-granular Visual Generation☆118Updated 2 months ago
- [NeurIPS 2024 D&B Track] Official Repo for "LVD-2M: A Long-take Video Dataset with Temporally Dense Captions"☆48Updated 5 months ago
- Interactive Video Generation via Masked-Diffusion☆78Updated 11 months ago
- InteractiveVideo: User-Centric Controllable Video Generation with Synergistic Multimodal Instructions☆129Updated last year
- [ECCV2024] PartCraft: Crafting Creative Objects by Parts☆89Updated 2 months ago