wskbest / MFC-Bench
☆12Updated last year
Alternatives and similar repositories for MFC-Bench
Users that are interested in MFC-Bench are comparing it to the libraries listed below
- VPEval Codebase from Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023)☆45Updated 2 years ago
- [NeurIPS 2023 Datasets and Benchmarks] "FETV: A Benchmark for Fine-Grained Evaluation of Open-Domain Text-to-Video Generation", Yuanxin L…☆57Updated last year
- ☆40Updated last month
- ☆37Updated 2 years ago
- Official implementation for the paper "Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation", publish…☆20Updated last year
- ORES: Open-vocabulary Responsible Visual Synthesis☆14Updated 2 years ago
- [2024-ACL]: TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild☆47Updated 2 years ago
- The official code for paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs"☆73Updated last year
- ☆27Updated 2 years ago
- An automatic MLLM hallucination detection framework☆19Updated 2 years ago
- (ICLR 2025 Spotlight) DEEM: Official implementation of Diffusion models serve as the eyes of large language models for image perception.☆48Updated 7 months ago
- Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023)☆57Updated 2 years ago
- ☆22Updated 2 years ago
- UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model☆22Updated last year
- Official code repo for "Editing Implicit Assumptions in Text-to-Image Diffusion Models"☆87Updated 2 years ago
- A framework that allows you to apply Sparse AutoEncoder on any models☆51Updated 7 months ago
- [ACL2025 Findings] Benchmarking Multihop Multimodal Internet Agents☆48Updated 11 months ago
- Implementation and dataset for paper "Can MLLMs Perform Text-to-Image In-Context Learning?"☆42Updated 8 months ago
- Code for our Paper "All in an Aggregated Image for In-Image Learning"☆29Updated last year
- LMM solved catastrophic forgetting, AAAI2025☆45Updated 9 months ago
- Official Repository of Personalized Visual Instruct Tuning☆34Updated 11 months ago
- [ACL 2024] Multi-modal preference alignment remedies regression of visual instruction tuning on language model☆47Updated last year
- Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models☆45Updated last year
- ☆55Updated last year
- A unified framework for controllable caption generation across images, videos, and audio. Supports multi-modal inputs and customizable ca…☆52Updated 6 months ago
- This is the implementation of CounterCurate, the data curation pipeline of both physical and semantic counterfactual image-caption pairs.☆19Updated last year
- Syphus: Automatic Instruction-Response Generation Pipeline☆14Updated 2 years ago
- code for "Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization"☆60Updated last year
- DDS: Delta Denoising Score PyTorch implementation☆19Updated 2 years ago
- Code and data for EMNLP 2023 paper "Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans?"☆15Updated 2 years ago