wskbest / MFC-Bench
☆12Updated last year
Alternatives and similar repositories for MFC-Bench
Users who are interested in MFC-Bench are comparing it to the libraries listed below.
- LMM solved catastrophic forgetting, AAAI2025☆44Updated 8 months ago
- Official implementation for the paper "Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation", publish…☆20Updated last year
- VPEval Codebase from Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023)☆45Updated 2 years ago
- ☆35Updated 3 weeks ago
- Official code repo for "Editing Implicit Assumptions in Text-to-Image Diffusion Models"☆87Updated 2 years ago
- [NeurIPS 2023 Datasets and Benchmarks] "FETV: A Benchmark for Fine-Grained Evaluation of Open-Domain Text-to-Video Generation", Yuanxin L…☆57Updated last year
- [2024-ACL]: TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild☆47Updated 2 years ago
- (ICLR 2025 Spotlight) DEEM: Official implementation of Diffusion models serve as the eyes of large language models for image perception.☆44Updated 6 months ago
- The official code for paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs"☆74Updated last year
- code for "Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization"☆59Updated last year
- Visual Instruction-guided Explainable Metric. Code for "Towards Explainable Metrics for Conditional Image Synthesis Evaluation" (ACL 2024…☆60Updated last year
- Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023)☆57Updated 2 years ago
- [NAACL 2024] Vision language model that reduces hallucinations through self-feedback guided revision. Visualizes attentions on image feat…☆47Updated last year
- [IJCV 2025] Code for DeepFake-Adapter: Dual-Level Adapter for DeepFake Detection☆58Updated last year
- Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models☆44Updated last year
- ☆37Updated 2 years ago
- A framework that allows you to apply Sparse AutoEncoder on any models☆49Updated 5 months ago
- Video dataset dedicated to portrait-mode video recognition.☆55Updated 2 months ago
- [ACL 2024] Multi-modal preference alignment remedies regression of visual instruction tuning on language model☆48Updated last year
- A Framework for Decoupling and Assessing the Capabilities of VLMs☆43Updated last year
- ☆41Updated last year
- [ICCV 2025] The official code of the paper "Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration R…☆109Updated 5 months ago
- ☆27Updated last year
- LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation☆134Updated 2 years ago
- [ACL2025 Findings] Benchmarking Multihop Multimodal Internet Agents☆47Updated 10 months ago
- Code for Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense? [COLM 2024]☆24Updated last year
- ☆22Updated 2 years ago
- ☆55Updated last year
- [ACL 2023] VSTAR is a multimodal dialogue dataset with scene and topic transition information☆15Updated last year
- An automatic MLLM hallucination detection framework☆19Updated 2 years ago