sjz5202 / LLaVA-Reward
Official repository for LLaVA-Reward (ICCV 2025): Multimodal LLMs as Customized Reward Models for Text-to-Image Generation
☆23 · Jul 30, 2025 · Updated 6 months ago
Alternatives and similar repositories for LLaVA-Reward
Users who are interested in LLaVA-Reward are comparing it to the repositories listed below
- ☆18 · Oct 23, 2024 · Updated last year
- [ICLR 2025] Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception ☆14 · Jul 4, 2025 · Updated 7 months ago
- Exposing Text-Image Inconsistency Using Diffusion Models (ICLR 2024) ☆10 · Jun 15, 2024 · Updated last year
- Reward Guided Latent Consistency Distillation ☆26 · Oct 9, 2024 · Updated last year
- ☆11 · Oct 2, 2024 · Updated last year
- ☆13 · Jul 10, 2024 · Updated last year
- Official Implementations "Faster Diffusion: Rethinking the Role of the Encoder for Diffusion Model Inference" for DiT (NeurIPS'24) ☆15 · Aug 3, 2025 · Updated 6 months ago
- Official Implementation for "Transferring Unconditional to Conditional GANs with Hyper-Modulation" CVPRW 22 https://arxiv.org/abs/2112.02… ☆13 · Jun 28, 2022 · Updated 3 years ago
- code for our BMVC 2021 paper "HCV: Hierarchy-Consistency Verification for Incremental Implicitly-Refined Classification" ☆15 · Oct 28, 2022 · Updated 3 years ago
- ☆15 · Mar 30, 2025 · Updated 10 months ago
- PICABench: How Far Are We from Physically Realistic Image Editing? ☆36 · Nov 5, 2025 · Updated 3 months ago
- ☆22 · May 11, 2025 · Updated 9 months ago
- [NeurIPS 2025] VideoRFT: Incentivizing Video Reasoning Capability in MLLMs via Reinforced Fine-Tuning ☆64 · Jan 6, 2026 · Updated last month
- Official Implementations "Get What You Want, Not What You Don't: Image Content Suppression for Text-to-Image Diffusion Models" (ICLR2024) ☆59 · Dec 3, 2024 · Updated last year
- ☆16 · Feb 23, 2025 · Updated 11 months ago
- [CVPR2025] Is Your World Simulator a Good Story Presenter? A Consecutive Events-Based Benchmark for Future Long Video Generation ☆18 · May 2, 2025 · Updated 9 months ago
- ☆15 · Sep 18, 2023 · Updated 2 years ago
- [NeurIPS 2024] Motion Consistency Model: Accelerating Video Diffusion with Disentangled Motion-Appearance Distillation ☆70 · Oct 27, 2024 · Updated last year
- code for our paper "Attention Distillation: self-supervised vision transformer students need more guidance" in BMVC 2022 ☆17 · Oct 4, 2022 · Updated 3 years ago
- ☆47 · Apr 20, 2025 · Updated 9 months ago
- Co-Reinforcement Learning for Unified Multimodal Understanding and Generation ☆39 · Jul 22, 2025 · Updated 6 months ago
- (TPAMI'2024) ZeroNLG: Aligning and Autoencoding Domains for Zero-Shot Multimodal and Multilingual Natural Language Generation ☆22 · Aug 8, 2024 · Updated last year
- Image captioning with weight pruning in PyTorch ☆22 · Jan 14, 2022 · Updated 4 years ago
- An innovative method designed to augment the capabilities of existing video diffusion models ☆22 · May 10, 2024 · Updated last year
- Official implementation of "VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis" ☆20 · Jan 26, 2025 · Updated last year
- [CVPR2025] Official Implementations "One-Way Ticket : Time-Independent Unified Encoder for Distilling Text-to-Image Diffusion Models" ☆28 · Jul 28, 2025 · Updated 6 months ago
- ☆27 · Mar 3, 2025 · Updated 11 months ago
- 【NeurIPS 2024】The official code of paper "Automated Multi-level Preference for MLLMs" ☆21 · Sep 26, 2024 · Updated last year
- Code for Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense? [COLM 2024] ☆24 · Aug 13, 2024 · Updated last year
- [ICIP 2025] Scribble-Guided Diffusion for Training-free Text-to-Image Generation ☆24 · Oct 2, 2024 · Updated last year
- Repository for ECCV 2022 paper "Source-free Video Domain Adaptation by Learning Temporal Consistency for Action Recognition" ☆24 · Mar 9, 2023 · Updated 2 years ago
- [CVPR 2025] GPS as a Control Signal for Image Generation ☆25 · Mar 18, 2025 · Updated 10 months ago
- Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023) ☆57 · Jul 25, 2023 · Updated 2 years ago
- Unsupervised Domain Adaptation without Source Data by Casting a BAIT ☆23 · Sep 18, 2022 · Updated 3 years ago
- ☆31 · Sep 1, 2025 · Updated 5 months ago
- Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening ☆69 · May 18, 2025 · Updated 8 months ago
- Data and sample evaluation codes for Multimodal Rewardbench 2 ☆136 · Dec 20, 2025 · Updated last month
- [NeurIPS 2023 Datasets and Benchmarks] "FETV: A Benchmark for Fine-Grained Evaluation of Open-Domain Text-to-Video Generation", Yuanxin L… ☆57 · Mar 4, 2024 · Updated last year
- [NeurIPS 2023] Free-Bloom: Zero-Shot Text-to-Video Generator with LLM Director and LDM Animator ☆98 · Mar 18, 2024 · Updated last year