wskbest / MFC-Bench
☆12 · Updated last year
Alternatives and similar repositories for MFC-Bench
Users who are interested in MFC-Bench are comparing it to the repositories listed below.
- (ICLR 2025 Spotlight) DEEM: Official implementation of Diffusion models serve as the eyes of large language models for image perception. ☆40 · Updated 3 months ago
- Official implementation for the paper "Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation", publish… ☆19 · Updated last year
- [NeurIPS 2023 Datasets and Benchmarks] "FETV: A Benchmark for Fine-Grained Evaluation of Open-Domain Text-to-Video Generation", Yuanxin L… ☆56 · Updated last year
- The official code for paper "EasyGen: Easing Multimodal Generation with a Bidirectional Conditional Diffusion Model and LLMs" ☆73 · Updated 10 months ago
- VPEval Codebase from Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023) ☆44 · Updated last year
- [ACL 2024] Multi-modal preference alignment remedies regression of visual instruction tuning on language model ☆47 · Updated 11 months ago
- [ACL 2024] TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild ☆46 · Updated 2 years ago
- Code for "Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization" ☆59 · Updated last year
- [ACL 2025 Findings] Benchmarking Multihop Multimodal Internet Agents ☆46 · Updated 7 months ago
- Official code repo for "Editing Implicit Assumptions in Text-to-Image Diffusion Models" ☆87 · Updated 2 years ago
- Mixture of Attention Heads ☆49 · Updated 3 years ago
- LMM solved catastrophic forgetting, AAAI 2025 ☆44 · Updated 6 months ago
- ☆27 · Updated last year
- A Framework for Decoupling and Assessing the Capabilities of VLMs ☆43 · Updated last year
- [ACM MM 2022 Oral] This is the official implementation of "SER30K: A Large-Scale Dataset for Sticker Emotion Recognition" ☆26 · Updated 3 years ago
- [EMNLP'23 Oral] ReSee: Responding through Seeing Fine-grained Visual Knowledge in Open-domain Dialogue PyTorch Implementation ☆12 · Updated last year
- Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models ☆44 · Updated last year
- ☆21 · Updated 2 years ago
- [NAACL 2024] Vision language model that reduces hallucinations through self-feedback guided revision. Visualizes attentions on image feat… ☆46 · Updated last year
- [NLPCC'23] ZeroGen: Zero-shot Multimodal Controllable Text Generation with Multiple Oracles PyTorch Implementation ☆13 · Updated 2 years ago
- [ICCV 2025] The official code of the paper "Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration R… ☆106 · Updated 3 months ago
- Official Repository of Personalized Visual Instruct Tuning ☆32 · Updated 7 months ago
- ☆55 · Updated last year
- A unified framework for controllable caption generation across images, videos, and audio. Supports multi-modal inputs and customizable ca… ☆51 · Updated 2 months ago
- Visual Instruction-guided Explainable Metric. Code for "Towards Explainable Metrics for Conditional Image Synthesis Evaluation" (ACL 2024… ☆56 · Updated 11 months ago
- Visual Programming for Text-to-Image Generation and Evaluation (NeurIPS 2023) ☆56 · Updated 2 years ago
- [ICLR 2024] This is the repository for the paper titled "DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning" ☆96 · Updated last year
- Code for our Paper "All in an Aggregated Image for In-Image Learning" ☆29 · Updated last year
- Implementation and dataset for paper "Can MLLMs Perform Text-to-Image In-Context Learning?" ☆41 · Updated 4 months ago
- Official repo for StableLLAVA ☆94 · Updated last year